Hi all,

As we know, MRv2 (the MapReduce implementation that runs on YARN) has changed significantly from classic MapReduce. We have a cost model built for classic MapReduce (MRv1) in Hadoop and are going to migrate it to MRv2. Can anyone point us to the fundamental differences between the two? Below is my current understanding; please feel free to correct me.

1. The JobTracker (JT) has been replaced by a central ResourceManager (RM) and a per-application ApplicationMaster (AM).
2. The TaskTracker (TT) has been replaced by the NodeManager (NM), and the fixed task slots have been replaced by containers. Containers are allocated dynamically, so both the number of containers and their memory size can vary on demand.
3. The shuffle service has become independent of the map tasks.

Thanks,
Jie
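
P.S. To make point 2 concrete for a cost model, here is a rough Python sketch of how per-node map parallelism might be estimated in each version. It is only an illustration under stated assumptions, not how YARN itself is implemented: the function names are hypothetical, it looks at memory only (vcores and the AM's own container are ignored), and it assumes the scheduler rounds each request up to a multiple of `yarn.scheduler.minimum-allocation-mb`, which can differ by scheduler and version.

```python
import math


def normalized_container_mb(requested_mb, min_alloc_mb=1024):
    """Round a container request up to a multiple of the scheduler minimum.

    Assumption: requests are normalized to multiples of
    yarn.scheduler.minimum-allocation-mb (behaviour may vary by scheduler).
    """
    multiples = math.ceil(requested_mb / min_alloc_mb)
    return max(1, multiples) * min_alloc_mb


def mrv1_map_parallelism(map_slots_per_node):
    """MRv1: parallelism is the fixed slot count configured per TaskTracker
    (mapred.tasktracker.map.tasks.maximum)."""
    return map_slots_per_node


def mrv2_map_parallelism(node_memory_mb, map_memory_mb, min_alloc_mb=1024):
    """MRv2: parallelism follows from NodeManager memory
    (yarn.nodemanager.resource.memory-mb) divided by the normalized
    size of a map container (mapreduce.map.memory.mb)."""
    container_mb = normalized_container_mb(map_memory_mb, min_alloc_mb)
    return node_memory_mb // container_mb


if __name__ == "__main__":
    # Hypothetical node: 8 GB for containers, map tasks asking for 1536 MB.
    print("MRv1 map slots per node:", mrv1_map_parallelism(4))
    print("MRv2 map containers per node:", mrv2_map_parallelism(8192, 1536))
```

In a sketch like this, the slot count that was a constant in the MRv1 model becomes a function of the memory settings, so changing the requested map memory changes the estimated parallelism per node.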
