- Store HDFS data and run tasks
- Manage the cluster and track task status
- Provide a gateway for accessing the cluster
- Run tasks but do not host HDFS data