[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129523#comment-13129523 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Mapreduce-0.23-Commit #17 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/17/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185489
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt

Map-Reduce 2.0
--------------

Key: MAPREDUCE-279
URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv2
Reporter: Arun C Murthy
Fix For: 0.23.0
Attachments: MR-279-script-20110817.sh, MR-279-script-final.sh, MR-279-script.sh, MR-279-script.sh, MR-279.patch, MR-279.patch, MR-279.sh, MR-279_MR_files_to_move-20110817.txt, MR-279_MR_files_to_move.txt, MR-279_MR_files_to_move.txt, MapReduce_NextGen_Architecture.pdf, NodeManager.gv, NodeManager.png, ResourceManager.gv, ResourceManager.png, capacity-scheduler-dark-theme.png, hadoop_contributors_meet_07_01_2011.pdf, multi-column-stable-sort-default-theme.png, post-move-patch-20110817.2.txt, post-move-patch-final.txt, post-move.patch, post-move.patch, post-move.patch, yarn-state-machine.job.dot, yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, yarn-state-machine.task.png

Re-factor MapReduce into a generic resource scheduler and a per-job, user-defined component that manages the application execution.
Check it out by following [the instructions|http://goo.gl/rSJJC].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129693#comment-13129693 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Hdfs-0.23-Build #43 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/43/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185489
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129710#comment-13129710 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Mapreduce-trunk #864 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/864/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129728#comment-13129728 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Hdfs-trunk #834 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/834/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129509#comment-13129509 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #1179 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1179/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129510#comment-13129510 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Common-trunk-Commit #1100 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1100/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129512#comment-13129512 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Common-0.23-Commit #15 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/15/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185489
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129513#comment-13129513 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Hdfs-0.23-Commit #16 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/16/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185489
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129517#comment-13129517 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #1119 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1119/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101159#comment-13101159 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Hdfs-trunk #788 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/788/])
Adding back hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources which was missed during the merge of MAPREDUCE-279.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166972
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101246#comment-13101246 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Mapreduce-trunk #812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/812/])
Adding back hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources which was missed during the merge of MAPREDUCE-279.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166972
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100902#comment-13100902 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Common-trunk-Commit #857 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/857/])
Adding back hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources which was missed during the merge of MAPREDUCE-279.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166972
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100906#comment-13100906 ]

Hudson commented on MAPREDUCE-279:
----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #868 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/868/])
Adding back hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources which was missed during the merge of MAPREDUCE-279.

acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166972
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096835#comment-13096835 ]

Sharad Agarwal commented on MAPREDUCE-279:
------------------------------------------

Thanks Binglin, it is incredibly useful. I have filed MAPREDUCE-2930 where you may want to contribute the patch. It will help to keep the graphs up to date.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096591#comment-13096591 ]

Binglin Chang commented on MAPREDUCE-279:
-----------------------------------------

bq. Ultimately a version of these should be produced natively in some StateMachine method (toDot()?), and I think Chris Douglas may take that up eventually. However, some of the desirable info (e.g., which states send events to or receive them from other state machines) can't really be discovered automatically, so there will continue to be a place for hand-rolled graphs.

What's the current progress of this work? I find that visualizing the state machines really helps when reading and learning the MRv2 code, both YARN and MRv2.
Yesterday, while trying to learn the YARN code, I added some code to yarn-common that generates Graphviz dot files automatically. It works fine for me; maybe it is useful for others too.
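The idea in the comment above, emitting a Graphviz dot file from a state machine's transition table, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the yarn-common code the comment refers to: the `to_dot` function, the `TRANSITIONS` list, and the state and event names in it are all hypothetical, and the state machine is assumed to be available as a flat list of (from_state, event, to_state) tuples.

```python
from typing import Iterable, Tuple

# (from_state, event, to_state); an assumed, simplified representation.
Transition = Tuple[str, str, str]

# Hypothetical transitions for illustration only; not taken from the YARN code.
TRANSITIONS = [
    ("NEW", "JOB_INIT", "INITED"),
    ("INITED", "JOB_START", "RUNNING"),
    ("RUNNING", "JOB_COMPLETED", "SUCCEEDED"),
    ("RUNNING", "JOB_KILL", "KILLED"),
]


def to_dot(name: str, transitions: Iterable[Transition]) -> str:
    """Render the transitions as a Graphviz digraph, one labelled edge each."""
    lines = [f'digraph "{name}" {{']
    for src, event, dst in transitions:
        # Quote state names so identifiers with dashes or spaces stay valid dot.
        lines.append(f'  "{src}" -> "{dst}" [label="{event}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot("job", TRANSITIONS))
```

If the output is saved as job.dot, Graphviz can render it with `dot -Tpng job.dot -o job.png`. As the quoted remark notes, events exchanged between separate state machines would not appear in a graph generated this way from a single transition table.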
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086354#comment-13086354 ]

Thomas Graves commented on MAPREDUCE-279:
-----------------------------------------

I think the move of mapreduce to hadoop-mapreduce got lost in the latest MR-279-script-20110817.sh.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086039#comment-13086039 ] Alejandro Abdelnur commented on MAPREDUCE-279: I've just applied the patch following the instructions and it compiles fine. Yesterday I opened a JIRA with things to improve, MAPREDUCE-2842. IMO, most of those can be done incrementally after this patch goes in. What I think should be done as part of this patch (MAPREDUCE-279) is the artifact/maven-module-dir naming. All artifact names should be prefixed with 'hadoop-' (the JARs get the artifact names, which makes them easier to identify when troubleshooting). In addition, each maven-module-dir should have the same name as its artifact, to make it easier for developers to find their way around. The reason for proposing this as part of this patch is to avoid doing 2 huge moves of files in SVN. (HDFS-2096 is already aligned with this naming.) Thanks
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086048#comment-13086048 ] Alejandro Abdelnur commented on MAPREDUCE-279: I've just updated MAPREDUCE-2842 with a proposed naming for artifacts/module-dirs.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086053#comment-13086053 ] Arun C Murthy commented on MAPREDUCE-279: - Thanks Alejandro, I do agree that we should avoid 2 huge svn moves if we can avoid it - let me try to fix up scripts to be in line with your proposals.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13085217#comment-13085217 ] Hudson commented on MAPREDUCE-279: -- Integrated in Hadoop-Common-trunk-Commit #742 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/742/]) MAPREDUCE-2837. Ported bug fixes from y-merge to prepare for MAPREDUCE-279 merge. acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1157249 Files : * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestTokenCache.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ACLsManager.java * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestBinaryTokenFile.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapTask.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Job.java * /hadoop/common/trunk/mapreduce/src/test/mapred-site.xml * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/security/TokenCache.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapOutputFile.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ReduceTask.java * /hadoop/common/trunk/mapreduce/src/webapps/job/jobdetailshistory.jsp * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/security/TestMapredGroupMappingServiceRefresh.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java * /hadoop/common/trunk/mapreduce/CHANGES.txt * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobACLsManager.java * 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskMemoryManagerThread.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MROutputFiles.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRConfig.java * /hadoop/common/trunk/mapreduce/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/LocalJobRunner.java * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMapRed.java
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084826#comment-13084826 ] Hudson commented on MAPREDUCE-279: -- Integrated in Hadoop-Mapreduce-trunk #754 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/754/]) MAPREDUCE-2837. Ported bug fixes from y-merge to prepare for MAPREDUCE-279 merge. acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1157249 Files : * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestTokenCache.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ACLsManager.java * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestBinaryTokenFile.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapTask.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Job.java * /hadoop/common/trunk/mapreduce/src/test/mapred-site.xml * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/security/TokenCache.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapOutputFile.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ReduceTask.java * /hadoop/common/trunk/mapreduce/src/webapps/job/jobdetailshistory.jsp * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/security/TestMapredGroupMappingServiceRefresh.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java * /hadoop/common/trunk/mapreduce/CHANGES.txt * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobACLsManager.java * 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskMemoryManagerThread.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MROutputFiles.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRConfig.java * /hadoop/common/trunk/mapreduce/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/LocalJobRunner.java * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMapRed.java
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084375#comment-13084375 ] Hudson commented on MAPREDUCE-279: -- Integrated in Hadoop-Mapreduce-trunk-Commit #763 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/763/]) MAPREDUCE-2837. Ported bug fixes from y-merge to prepare for MAPREDUCE-279 merge. acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1157249 Files : * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestTokenCache.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ACLsManager.java * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestBinaryTokenFile.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapTask.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Job.java * /hadoop/common/trunk/mapreduce/src/test/mapred-site.xml * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/security/TokenCache.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapOutputFile.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ReduceTask.java * /hadoop/common/trunk/mapreduce/src/webapps/job/jobdetailshistory.jsp * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/security/TestMapredGroupMappingServiceRefresh.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java * /hadoop/common/trunk/mapreduce/CHANGES.txt * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobACLsManager.java * 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskMemoryManagerThread.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MROutputFiles.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRConfig.java * /hadoop/common/trunk/mapreduce/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java * /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/LocalJobRunner.java * /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMapRed.java
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061781#comment-13061781 ] Giridharan Kesavan commented on MAPREDUCE-279: -- Build setup on MR-279 branch https://builds.apache.org/job/Hadoop-MR-279-Build/
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13061249#comment-13061249 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-279: bq. What kind of results or terminal output should I expect? Bill, the terminal output should be almost the same as what you see with Hadoop 0.20. Please create separate tickets or use the mapreduce-...@hadoop.apache.org mailing list. This is an umbrella ticket that many people are watching. Thanks.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059688#comment-13059688 ] Bill Lee commented on MAPREDUCE-279: In this page: http://svn.apache.org/repos/asf/hadoop/common/branches/MR-279/mapreduce/INSTALL After running the last command: $HADOOP_COMMON_HOME/bin/hadoop jar $HADOOP_MAPRED_HOME/build/hadoop-mapred-examples-0.22.0-SNAPSHOT.jar randomwriter -Dmapreduce.job.user.name=$USER -Dmapreduce.randomwriter.bytespermap=1 -Ddfs.blocksize=536870912 -Ddfs.block.size=536870912 -libjars $HADOOP_YARN_INSTALL/hadoop-mapreduce-1.0-SNAPSHOT/modules/hadoop-mapreduce-client-jobclient-1.0-SNAPSHOT.jar output What kind of results or terminal output should I expect? Thank you.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13058060#comment-13058060 ] Giridharan Kesavan commented on MAPREDUCE-279: Nigel/Arun, I can help set up a build on MR-279.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057015#comment-13057015 ] Nigel Daley commented on MAPREDUCE-279: --- Arun, are you planning to get a Jenkins build running on this branch before merge?
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049918#comment-13049918 ] Haoyuan Li commented on MAPREDUCE-279: -- This page doesn't work anymore: http://svn.apache.org/repos/asf/hadoop/mapreduce/branches/MR-279/INSTALL Is there any new page to replace this? Thank you.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049927#comment-13049927 ] Mahadev konar commented on MAPREDUCE-279: Haoyuan, because of the svn unsplit, things have moved. The new link is: http://svn.apache.org/repos/asf/hadoop/common/branches/MR-279/mapreduce/INSTALL
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050094#comment-13050094 ] Haoyuan Li commented on MAPREDUCE-279: -- Thank you Mahadev.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049257#comment-13049257 ] Eli Collins commented on MAPREDUCE-279: Is there an MR2 design doc? A couple of people have asked me about this; it would be very useful to share.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049360#comment-13049360 ] Arun C Murthy commented on MAPREDUCE-279:
Sigh, I keep missing this. I have a slightly old version I'll spruce up and post. Thanks for the reminder.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048423#comment-13048423 ] Nigel Daley commented on MAPREDUCE-279:
Given these build issues (and just good engineering practice), I'd like to see a Jenkins CI build on this branch so we know that, when it is merged to trunk, the builds won't be (more) broken.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047867#comment-13047867 ] Arun C Murthy commented on MAPREDUCE-279:
Yep, it would be nice to completely mavenize and I strongly believe it should be our goal. Maybe we can do it in stages, have a hybrid one on day one when we merge MR-279 into trunk and then do the whole nine yards? That way each can proceed independently. Currently it's becoming painful to manage a large branch and hence my suggestion to get it into trunk and do mavenization independently. Thoughts?
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039993#comment-13039993 ] Tom White commented on MAPREDUCE-279:
I'm wondering what the maven modules might look like for this when integrated into trunk. Something like:
* api - containing the user-facing public API of MapReduce (from org.apache.hadoop.mapred(uce)). When MAPREDUCE-1638 is done it will be possible to split the API into a self-contained tree (no dependencies on other parts of MapReduce).
* lib - containing the user-facing public MapReduce libraries (from org.apache.hadoop.mapred and org.apache.hadoop.mapred(uce).lib). There's a patch in MAPREDUCE-1478 to perform this separation.
* classic-impl - containing the implementation classes for MapReduce. This is what's left over after doing MAPREDUCE-1638 and MAPREDUCE-1478.
* nextgen-impl - this is mr-client in the MR-279 branch, which I think should be renamed, since it's not immediately clear what it's a client of in the context of the whole MapReduce project. It has submodules app, common, hs, jobclient, shuffle.
* yarn - the yarn framework from the MR-279 branch. Yarn is broken into submodules too.
Given the progress on mavenizing common (HADOOP-6671), is it worth integrating MAPREDUCE-279 at the same time as doing the full Mavenization of MapReduce? That would seem ideal, but perhaps there's an alternative I haven't considered.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009098#comment-13009098 ] Michael Lee commented on MAPREDUCE-279:
Cannot build: the build of hadoop-mapred-279 fails when following the instructions in http://svn.apache.org/repos/asf/hadoop/mapreduce/branches/MR-279/INSTALL:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:testCompile (default-testCompile) on project yarn-common: Compilation failure
[ERROR] /home/michael/work/hadoop-mapred-279/yarn/yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPC.java:[80,37] incompatible types
[ERROR] found   : java.util.ArrayList<java.lang.CharSequence>
[ERROR] required: org.apache.avro.generic.GenericArray<java.lang.CharSequence>
[ERROR] -> [Help 1]
[ERROR]
My ENV:
Maven: Apache Maven 3.0.3 (r1075438; 2011-03-01 01:31:09+0800)
Maven home: /home/michael/local/apache-maven-3.0.3
Java version: 1.6.0_07, vendor: Sun Microsystems Inc.
Java home: /home/michael/local/java6/jre
Default locale: en_US, platform encoding: ANSI_X3.4-1968
OS name: linux, version: 2.6.9_5-4-0-3, arch: amd64, family: unix
JDK: java version 1.6.0_07
  Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
  Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)
Ant: Apache Ant version 1.7.0 compiled on December 13 2006
avro-maven-plugin: using snapshot from: https://github.com/phunt/avro-maven-plugin/, 1.4.0 branch
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009183#comment-13009183 ] Arun C Murthy commented on MAPREDUCE-279:
bq. Looking through the code a bit more I came across Hamlet.
Luke can provide more details, but I believe he took this route due to the lack of a better 'embeddable' alternative. Having said that, echo'ing eric14, please feel free to open a jira with an alternate proposal and we can consider moving over to something more standard that satisfies our constraints. Alternately, in the long run, we could move Hamlet out to a separate (incubator?) project to attempt to build a community around it.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009185#comment-13009185 ] Arun C Murthy commented on MAPREDUCE-279:
I've opened MAPREDUCE-2399 to discuss Hamlet. Please use that jira so that we can keep MAPREDUCE-279 focussed on the next-gen MR framework. Thanks.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009200#comment-13009200 ] Doug Cutting commented on MAPREDUCE-279:
Sharad> Had to live with .genavro as the maven plugin (https://github.com/phunt/avro-maven-plugin) has not been updated yet to work with the new extension.
FYI, a Maven plugin is included in Avro 1.5.0 that uses the .avdl file suffix.
Todd> Does AvroIDL convert javadoc-style comments on records/protocols into JavaDoc on generated code?
Not yet (AVRO-296).
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009381#comment-13009381 ] Konstantin Boudnik commented on MAPREDUCE-279:
Not to start a religious war or anything, but I am kinda wondering why not use a standard Java webapp framework such as Grails? There's a huge community working on it and there are a lot of people with expertise, which will help ease the development of user applications on top of MR2.0.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009385#comment-13009385 ] Arun C Murthy commented on MAPREDUCE-279:
Cos, again, can you please use MAPREDUCE-2399 to discuss the specifics of the UI? Thanks.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009547#comment-13009547 ] Tom White commented on MAPREDUCE-279:
> Also, as you pointed out, the changes to classes in src/java/org/apache/hadoop/mapred(uce) are very minor
Yes, but we still need to be sure that they don't break compatibility, which is hard to see in the current patch. However, I agree that collaborating on this part by way of working on MAPREDUCE-1638 and changes in trunk will make the separation cleaner and clarify the changes required for MR2.
[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009012#comment-13009012 ] eric baldeschwieler commented on MAPREDUCE-279:
I'll let luke comment on the details. I'd support patches to convert the UI to something more standard, if we can agree on the right thing. Having a good UI is a plus.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008757#comment-13008757 ] Sharad Agarwal commented on MAPREDUCE-279:
bq. Is the correct suffix still .genavro?
Had to live with .genavro as the maven plugin (https://github.com/phunt/avro-maven-plugin) has not been updated yet to work with the new extension.
bq. Does AvroIDL convert javadoc-style comments on records/protocols into JavaDoc on generated code?
No. I don't see the comments in the generated code.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008881#comment-13008881 ] Todd Lipcon commented on MAPREDUCE-279:
Looking through the code a bit more I came across Hamlet. It seems you've written your own MVC framework and Java implementation of Haml as part of Yarn. Can you shed some light on why existing web frameworks were found to be insufficient? Do we really want a custom HTML generation framework as part of a resource scheduler?
I don't have much experience with web programming in Java, but I can't imagine we have any use cases that are _that_ unique that they couldn't be satisfied using a popular framework like Spring MVC. I also have strong doubts that a bunch of systems hackers like we have in our community can do a better job at designing and implementing a web framework compared to people who do web programming all day long (witness the completely incorrect job we do of input parameter escaping in the current webapps).
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008452#comment-13008452 ] Arun C Murthy commented on MAPREDUCE-279:
Thanks for your f/b Tom.
bq. I wonder if it would be easier not to move the src/java/org/apache/hadoop/mapred(uce) trees at this stage.
The main issue is the dependency chain - currently the mr-client depends purely on apis in yarn package. In the alternate proposal (which we considered) mr-client would need to depend on yarn and src/java for the runtime. The current scheme is both more modular and enforces discipline by ensuring that the MapReduce runtime (map, sort, shuffle, merge, reduce) cannot, even accidentally, start relying on classes in the server package i.e. JT/TT etc. This also seems like the right end-state for the project.
Also, as you pointed out, the changes to classes in src/java/org/apache/hadoop/mapred(uce) are very minor and the 'svn mv' is both well documented (MR-279_MR_files_to_move.txt, MR-279.sh) and straight-forward.
bq. MAPREDUCE-1638 is highly relevant for this work
Thanks! MAPREDUCE-1638 is very relevant. MAPREDUCE-279 already has some of the changes you proposed there i.e. keeping server classes in a separate source structure from the implementation classes - we should collaborate both on trunk and on the MR-279 branch to ensure consistency. I'm happy to merge if necessary.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008517#comment-13008517 ] Todd Lipcon commented on MAPREDUCE-279:
Hi Arun. I spent the train ride this morning looking over yarn/src/main/avro in the branch. Here are a few comments, sorry for the somewhat stream-of-consciousness format.
- Is the correct suffix still .genavro? Thought we'd changed the name to .avroidl or something?
- Apache licenses needed on these files
- Does AvroIDL convert javadoc-style comments on records/protocols into JavaDoc on generated code? If so we should do more of that.
- AMRMProtocol:
-- the release parameter to allocate is strange: (a) it seems the function is misnamed if you can also release things as you call it, and (b) why isn't it an array<ContainerId>?
-- if you want to cancel previous resource requests, do you submit a new one with a negative numContainers?
- ApplicationSubmissionContext:
-- would be good to have some kind of scheduler-specific parameters here? eg maybe a scheduler has something beyond just priority (eg. perhaps a deadline)
-- using just URL type directly for resources - seems not quite flexible enough? eg one useful construct would be a URL + checksum
-- what's resources_todo going to be?
-- passing user - agreed, this should be more flexible than simple string.
-- Why not contain a ContainerLaunchContext to specify the container in which to run the AM? Seems like lots of duplicated fields.
- ContainerManager:
-- not following YarnContainerTags - these are opaque enums, how do they get interpolated in a string?
-- how does one access stderr/stdout contents? both while they're being written and after a container has terminated? (maybe I just haven't gotten to that bit yet somewhere else)
- yarn-types.avro:
-- For the typesafe ID classes, do we need to specify explicit comparison orderings? I don't know Avro behavior here.
-- Did you consider making the ids all strings instead of ints? The pro would be that there could be canonical formats, like AM-<hex id> for app masters vs C-<hex id> for containers. AWS does a good job of this.
-- Resource: field names should include units, like int memoryMB
-- what are ContainerTokens? could use some extra doc at the protocol layer here. (I assume this is for security?)
-- The Container type doesn't appear
-- the URL record is missing user/password used for http basic auth or s3n auth
-- there are some hard tabs in this file
-- ApplicationMaster:
--- httpPort seems like it would be better described as something like httpStatusURL?
-- LocalResourceVisibility:
--- just to clarify, APPLICATION visibility means only to this application submitted by this user. ie if joe and bob both submit MapReduce 2.x.y jobs with identical jars, it still won't share, even if sha1s match?
--- if bob submits the same application (ie MR 2.x.y) twice, do APPLICATION visibility files get shared?
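[Editorial sketch, not part of the thread.] The two ID questions in the review above (explicit comparison orderings, and canonical string formats such as AM-<hex id> and C-<hex id>) can be illustrated with a small Java example. Everything here is an assumption made for illustration: the class names TypedIds, AppId and ContainerId, the int-based fields, and the exact rendering "C-<app hex>-<container hex>" do not come from the MR-279 branch. The container keeps a typed reference to its application, so the relationship never has to be recovered by parsing strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical typed ids with explicit ordering and a canonical string form. */
public class TypedIds {

    /** Hypothetical application id, rendered as "AM-<hex id>". */
    public static final class AppId implements Comparable<AppId> {
        public final int id;

        public AppId(int id) { this.id = id; }

        // Explicit ordering: numeric order of the underlying int.
        @Override public int compareTo(AppId other) { return Integer.compare(id, other.id); }
        @Override public boolean equals(Object o) { return o instanceof AppId && ((AppId) o).id == id; }
        @Override public int hashCode() { return id; }
        @Override public String toString() { return "AM-" + Integer.toHexString(id); }
    }

    /** Hypothetical container id; holds its owning AppId as a typed field. */
    public static final class ContainerId implements Comparable<ContainerId> {
        public final AppId app;
        public final int id;

        public ContainerId(AppId app, int id) { this.app = app; this.id = id; }

        // Explicit ordering: by application first, then by container number.
        @Override public int compareTo(ContainerId other) {
            int byApp = app.compareTo(other.app);
            return byApp != 0 ? byApp : Integer.compare(id, other.id);
        }
        @Override public boolean equals(Object o) {
            return o instanceof ContainerId
                && ((ContainerId) o).app.equals(app)
                && ((ContainerId) o).id == id;
        }
        @Override public int hashCode() { return 31 * app.hashCode() + id; }
        // Assumed rendering: "C-<app hex>-<container hex>".
        @Override public String toString() {
            return "C-" + Integer.toHexString(app.id) + "-" + Integer.toHexString(id);
        }
    }

    public static void main(String[] args) {
        AppId app = new AppId(255);
        List<ContainerId> containers = new ArrayList<>();
        containers.add(new ContainerId(app, 16));
        containers.add(new ContainerId(new AppId(1), 2));
        containers.add(new ContainerId(app, 1));
        Collections.sort(containers); // uses the explicit compareTo above
        for (ContainerId c : containers) {
            System.out.println(c + " belongs to " + c.app);
        }
    }
}
```

The string form is only a display format in this sketch; ordering and equality are defined on the typed fields, which is one way to get canonical names without depending on string parsing.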
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008648#comment-13008648 ] Chris Douglas commented on MAPREDUCE-279:
bq. Why not contain a ContainerLaunchContext to specify the container in which to run the AM? Seems like lots of duplicated fields.
Agreed. Fixing this also addresses the URL as insufficient for resources. The _todo form was introduced to effect this, and remains in-progress.
bq. how does one access stderr/stdout contents? both while they're being written and after a container has terminated? (maybe I just haven't gotten to that bit yet somewhere else)
This is still a TODO (working on it now). In the short term, something similar to what the TT does is probably sufficient, I hope.
bq. Did you consider making the ids all strings instead of ints? The pro would be that there could be canonical formats, like AM-<hex id> for app masters vs C-<hex id> for containers.
Some of the implementation ended up relying on a consistent mapping of int ids to strings, so going all the way could make sense. On the other hand, parsing strings to determine relationships between containers and applications is regrettable.
bq. the URL record is missing user/password used for http basic auth or s3n auth
Agreed, full URIs should be supported, though pushing that all the way through FileContext and FileSystem could be painful.
bq. just to clarify, APPLICATION visibility means only to this application submitted by this user. ie if joe and bob both submit MapReduce 2.x.y jobs with identical jars, it still won't share, even if sha1s match?
Right. The target layout for the NodeManager looks roughly like this:
{noformat}
for x in localdir:
  $x/filecache                                     # public cache
  $x/usercache
  $x/usercache/$user
  $x/usercache/filecache                           # private cache
  $x/usercache/$user/appcache
  $x/usercache/$user/appcache/$appid
  $x/usercache/$user/appcache/$appid/filecache     # application cache
  $x/usercache/$user/appcache/$appid/$containerid
  $x/usercache/$user/appcache/$appid/output        # output retained after container exits, i.e. intermediate data
{noformat}
So the end of the container and application can just delete those subdirs. Matching a job jar between invocations would require one to register that resource as PUBLIC/PRIVATE. The APPLICATION scope is more for job.xml and the like.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008655#comment-13008655 ] Chris Douglas commented on MAPREDUCE-279:
Sorry, the location of the private cache is {{$x/usercache/$user/filecache}}, not {{$x/usercache/filecache}}.
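[Editorial sketch, not part of the thread.] The NodeManager directory layout described by Chris Douglas, including his correction of the private cache location, can be written down as a small path-mapping helper. This is a minimal sketch under stated assumptions: the class LocalDirLayout, its method names and the Visibility enum are hypothetical and only mirror the PUBLIC/PRIVATE/APPLICATION scopes named in the discussion; it is not the code in the MR-279 branch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that maps resource visibility to a directory under one local dir. */
public class LocalDirLayout {

    /** Scopes named in the thread; the enum itself is an assumption of this sketch. */
    public enum Visibility { PUBLIC, PRIVATE, APPLICATION }

    /** $x/usercache/$user/appcache/$appid */
    public static Path appDir(Path localDir, String user, String appId) {
        return localDir.resolve("usercache").resolve(user).resolve("appcache").resolve(appId);
    }

    /** File cache directory for a given visibility. */
    public static Path fileCache(Path localDir, Visibility v, String user, String appId) {
        switch (v) {
            case PUBLIC:
                // $x/filecache, shared by all users
                return localDir.resolve("filecache");
            case PRIVATE:
                // $x/usercache/$user/filecache (the corrected location), shared across one user's apps
                return localDir.resolve("usercache").resolve(user).resolve("filecache");
            case APPLICATION:
                // $x/usercache/$user/appcache/$appid/filecache, removed with the application
                return appDir(localDir, user, appId).resolve("filecache");
            default:
                throw new IllegalArgumentException("unknown visibility: " + v);
        }
    }

    /** $x/usercache/$user/appcache/$appid/$containerid, removed when the container ends. */
    public static Path containerDir(Path localDir, String user, String appId, String containerId) {
        return appDir(localDir, user, appId).resolve(containerId);
    }

    /** $x/usercache/$user/appcache/$appid/output, retained after the container exits. */
    public static Path outputDir(Path localDir, String user, String appId) {
        return appDir(localDir, user, appId).resolve("output");
    }

    public static void main(String[] args) {
        Path x = Paths.get("/data/1"); // one entry of the configured local dirs (made-up path)
        for (Visibility v : Visibility.values()) {
            System.out.println(v + " -> " + fileCache(x, v, "bob", "app_1"));
        }
    }
}
```

Because container and application state live in their own subtrees, cleanup at the end of a container or application is a recursive delete of containerDir(...) or appDir(...), which matches the remark that those subdirs can simply be deleted.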
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008034#comment-13008034 ] Arun C Murthy commented on MAPREDUCE-279:
I'm going to commit this to a dev branch (MR-279?) if no one objects. Thanks.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13008036#comment-13008036 ] Todd Lipcon commented on MAPREDUCE-279:

sure, +1 for putting this on a dev (non-release) branch
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13008110#comment-13008110 ] Arun C Murthy commented on MAPREDUCE-279:

Thanks Todd. I've committed to a dev-branch: MR-279.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13008327#comment-13008327 ] Tom White commented on MAPREDUCE-279:

There's a lot to digest here, but here are a couple of quick initial high-level comments from a packaging and staging perspective.

I wonder if it would be easier not to move the src/java/org/apache/hadoop/mapred(uce) trees at this stage. MR 2 could just depend on the MapReduce JAR produced by the ant file, just like it does for Common. This would make the introduction of the codebase easier. There are some changes required in the existing classes, but by the look of things they are fairly minor and by introducing them in situ (in separate JIRAs) we can be sure they won't break existing users, and the changes would be easier to track. Alternatively this work could depend on full mavenization (at least of MapReduce), but that's probably some way off.

MAPREDUCE-1638 is highly relevant for this work, since it aims to split out the MR API from the implementation. I've got an in-progress patch for this, which I'll post soon for discussion.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13007772#comment-13007772 ] eric baldeschwieler commented on MAPREDUCE-279:

Hi Folks, I'm back part-time, but I'm mainly focused on catching up, annual focal reviews and adjusting to life with a newborn at home. Todd Papaioannou (p9u) remains acting head of Hadoop this week. Most line issues can continue to go to Amol, Kazi, Satish, Avik or Senthil as appropriate. I am about, drop me a line on my personal email or call my cell if you need rapid response, but I am reading mail now. CUSoon, E14
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003086#comment-13003086 ] Scott Carey commented on MAPREDUCE-279:

Re: Shuffle. See https://issues.apache.org/jira/browse/MAPREDUCE-318 Those changes are in 0.21+ (and perhaps Y!'s distro but not Cloudera's), I believe. This doesn't do everything mentioned but is a significant improvement.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13002543#comment-13002543 ] MengWang commented on MAPREDUCE-279:

@All How shuffle works in MapReduce 2.0 ? Our study shows that shuffle is a performance bottleneck of MapReduce computing. There are some problems with shuffle:

(1) Shuffle and reduce are tightly coupled. The shuffle phase usually doesn't consume much memory or CPU, so in theory the reduce task's slot could be used for other computing tasks while data is being copied from the maps. This would improve cluster utilization. Furthermore, should shuffle be separated from reduce? Then shuffle would not use the reduce slot, and we wouldn't need to distinguish between map slots and reduce slots at all.
(2) For large jobs, shuffle uses too many network connections, and each connection transmits very little data, which is inefficient. From 0.21.0 one connection can transfer several map outputs, but I think this is not enough. Maybe we can use a per-node shuffle client process (like the tasktracker) to shuffle data for all reduce tasks on that node; then we can shuffle more data through one connection.
(3) Too many concurrent connections cause the shuffle server to do massive random IO, which is inefficient. Maybe we can aggregate HTTP requests (like the delay scheduler), so that random IO becomes sequential.
(4) How to manage the memory used by shuffle efficiently. We use buddy memory allocation, which wastes a considerable amount of memory.
(5) If shuffle is separated from reduce, then we must figure out how to achieve reduce locality.
(6) Can we store map outputs in a storage system (like HDFS)?
(7) Can shuffle be a general data transfer service, not only for the map/reduce paradigm?
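Point (2) above, moving more map outputs over fewer connections, can be illustrated with a small Python sketch that groups pending map outputs by the host serving them and splits each group into bounded batches, one batch per request. This is only an illustration of the idea under stated assumptions: the function name `plan_fetches`, the `(host, map_id)` input shape, and the `max_per_request` limit are hypothetical and do not describe the actual shuffle code in any Hadoop release.

```python
from collections import defaultdict


def plan_fetches(map_outputs, max_per_request=20):
    """Group (host, map_id) pairs into per-host batches.

    Each returned (host, [map_ids]) tuple stands for one connection/request
    that would fetch several map outputs at once. Hypothetical helper.
    """
    by_host = defaultdict(list)
    for host, map_id in map_outputs:
        by_host[host].append(map_id)

    batches = []
    for host in sorted(by_host):              # deterministic order
        ids = by_host[host]
        for i in range(0, len(ids), max_per_request):
            batches.append((host, ids[i:i + max_per_request]))
    return batches


pending = [("host1", "m_0001"), ("host2", "m_0002"), ("host1", "m_0003")]
for host, ids in plan_fetches(pending, max_per_request=2):
    print(host, ids)
```

A per-node shuffle client, as suggested in the comment, would feed the pending outputs of all local reduce tasks into one such plan, so the number of connections scales with the number of hosts rather than with hosts times reducers.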
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13002677#comment-13002677 ] Arun C Murthy commented on MAPREDUCE-279:

bq. I'm very interested in this project. How can I join?

Thanks Ozawa! We are very glad. An update is that we have pretty much closed all the loops internally to get the code out into a branch, we'd love to start having everyone involved...
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13002679#comment-13002679 ] Arun C Murthy commented on MAPREDUCE-279:

bq. How shuffle works in MapReduce 2.0 ?

Meng - pretty much the same as currently, the map-outputs are served over http. We have discussed improvements to shuffle along the lines you have suggested for a long while now (I just don't have the jiras handy) and I agree, they are excellent ideas. Our hope is that with MRv2 we open up Map-Reduce to significant innovation so that folks can try various ideas like the ones you suggested... make sense?
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13002947#comment-13002947 ] Lianhui Wang commented on MAPREDUCE-279:

I think MR 2.0 may resolve this problem: the JT should not monitor the status of every job and task. Today many TTs must RPC to the single JT every few seconds, and many clients also fetch job status over RPC from the same JT every few seconds. So the majority of nodes in the cluster connect to the JT again and again, which degrades the performance of the JT, especially as the cluster size increases (for example, to 10K nodes). Like HDFS's Federation branch, the MR project must create a new branch for 2.0.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001797#comment-13001797 ] OZAWA Tsuyoshi commented on MAPREDUCE-279:

Hadoop needs to become more modular internally

+1 There are a lot of domain-specific programming models that extend MapReduce (e.g haloop, twiter, and so on), so this evolution is a good way to accommodate that trend.

@Arun I'm very interested in this project. How can I join? Or, is there some repository to access your prototype code?
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001855#comment-13001855 ] OZAWA Tsuyoshi commented on MAPREDUCE-279:

(e.g haloop, twiter, and so on)

s/twiter/Twister/g
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13000464#comment-13000464 ] Arun C Murthy commented on MAPREDUCE-279:

@Min

bq. How does ApplicationMaster know its resource requirements before it launches tasks?

The assumption is that the AM has a basic idea about resource requirements for its application, which is feasible for our primary use case: Map-Reduce. OTOH, an AM for other applications has the ability to launch a few tasks, watch their resource consumption/utilization and update future resource requests.

bq. Even common users can deploy their ApplicationMaster over the cluster they have no any permissions on that?

From the framework (i.e. RM/NN) perspective, everything in the cluster including AMs is 'user-land'. Thus as long as a user implements the protocols for AMs they can deploy any applications... they do not need any permission to deploy.

I'm working very hard to get the codebase committed to a branch, once there we would love your f/b on the protocols etc. Hopefully that should help you understand how to implement a custom AM if you so choose... appreciate your patience while I work the system! *smile*
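The "launch a few tasks, watch their consumption, update future requests" approach mentioned above could look roughly like the following Python sketch. It is purely illustrative: the function name `next_request_mb`, the use of peak memory in MB as the observed metric, and the fixed headroom percentage are all assumptions, and nothing here reflects an actual ApplicationMaster protocol.

```python
def next_request_mb(initial_mb, observed_peaks_mb, headroom_pct=20):
    """Choose the memory size for future container requests.

    initial_mb        -- the AM's starting guess, used before any task has run
    observed_peaks_mb -- peak memory (MB) seen for the probe tasks so far
    headroom_pct      -- safety margin added on top of the largest peak
    All names and the policy itself are hypothetical.
    """
    if not observed_peaks_mb:
        return initial_mb
    peak = max(observed_peaks_mb)
    # Integer arithmetic, rounded up, to avoid float surprises.
    return (peak * (100 + headroom_pct) + 99) // 100


print(next_request_mb(2048, []))           # no observations yet: use the guess
print(next_request_mb(2048, [800, 1000]))  # adapt to what the probes used
```

An AM following this pattern would start with a conservative guess, run a handful of probe tasks, and then size the remaining requests from what it measured.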
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13000576#comment-13000576 ] Mahadev konar commented on MAPREDUCE-279:

@Scott, With respect to your comments on ResourceManager/ZooKeeper/RPC: We intend to take it slow with ZooKeeper. Initially the intention is to put just the allocations (what each job/application is allocated) in ZooKeeper; this is mainly for ResourceManager and ApplicationMaster restart. I am not really in favor of using ZK notifications to get rid of RPCs. For the scale we are talking about, the "first to get the work will take it" approach will cause a herd effect and will definitely be a cause for concern. I think ZK can be used much more than what we have proposed, but it'll be a gradual process to see what all we can offload to ZK. I am pretty hesitant to put RPC load onto ZK and use it as a workload queue for something like this.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13000752#comment-13000752 ] Leitao Guo commented on MAPREDUCE-279:

@Arun I also doubt the assumption that the AM has a basic idea about resource requirements. For example, in the Hive scenario, how does the AM know the resource requirements when facing all kinds of query requests? At the same time, if the AM finds that the resources it requested are not enough for the application, will it request more resources or just fail?
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12999745#comment-12999745 ] Min Zhou commented on MAPREDUCE-279:

@Arun How does ApplicationMaster know its resource requirements before it launches tasks? IMHO, the biggest problem of resource allocation is that we can't determine the CPU/memory/disk/network requirements until the task is running. User-defined requirements in configuration files are always inaccurate.

From your words, the architecture allows end-users to implement any application-specific framework by implementing a custom ApplicationMaster. Even common users can deploy their ApplicationMaster over the cluster they have no any permissions on that? Can you illustrate how to achieve it?
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12998977#comment-12998977 ] Arun C Murthy commented on MAPREDUCE-279:

bq. +1 An easy way to achieve this here would be to put the resource manager code and new MapReduce ApplicationMaster code into separate source trees under mapreduce.

Agreed! We have done exactly that in the prototype and plan to continue improving modularity.

bq. Going further, the work to separate out the API and libraries from the implementation should help this effort too, since it will involve removing hard dependencies on the jobtracker from the API classes

+1
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12998654#comment-12998654 ] Tom White commented on MAPREDUCE-279:

bq. Hadoop needs to become more modular internally

+1 An easy way to achieve this here would be to put the resource manager code and new MapReduce ApplicationMaster code into separate source trees under mapreduce. This will help enforce dependencies from the beginning: MR2 depends on MR1 and RM, but RM doesn't depend on anything else (except common for RPC?).

Going further, the work to separate out the API and libraries from the implementation should help this effort too, since it will involve removing hard dependencies on the jobtracker from the API classes (see MAPREDUCE-1478, MAPREDUCE-1638).
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997589#comment-12997589 ] Scott Carey commented on MAPREDUCE-279:

Good stuff! Does the NodeManager communicate to the ResourceManager similar to now (ping - response RPC)? I ask because some of the bottlenecks and complexities now are due to this style of RPC. I've changed a couple of systems in the past from ping - response to register - callback, and these became more efficient and the code became simpler.

With ZooKeeper in there, I wonder how much of the communication now uses ZooKeeper watches for efficiency and low latency. When a Job starts up in the ApplicationMaster, does it have to wait for pings to get resources from the scheduler? Or is the data all there in ZK, so that ramp-up times for jobs are much faster and resource reassignment for jobs with short-lived tasks isn't completely throttled by the rate of pings?

In any case, the new architecture is decoupled and it should be much easier to make enhancements with this separation.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997600#comment-12997600 ] Arun C Murthy commented on MAPREDUCE-279:

bq. With ZooKeeper in there, I wonder how much of the communication now uses ZooKeeper watches for efficiency and low latency.

Scott - We seriously considered this, but had to continue to use Hadoop RPC for a couple of reasons:
a) Mahadev, our resident ZK (and the new ResourceManager) expert, was very wary of using ZK watches for scalability reasons. Consider a 10k node cluster with 25-30 containers per node and 10k running jobs - we'd need at least 10k * 10k watches which is a *lot* for ZK
b) Security on ZK is still largely unknown, eventually ZK will get there but we'd need to do a lot of work for delegation tokens etc. since we can't do kerberos everywhere.

Having said that...

bq. In any case, the new architecture is decoupled and it should be much easier to make enhancements with this separation.

Exactly. This is something we should definitely re-visit in a subsequent release. Hopefully that makes sense, thanks!
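For a sense of scale, the watch estimate in point (a) can be worked through in a few lines of Python. The figures are the ones quoted in the comment (10k nodes, 25-30 containers per node, 10k running jobs); the function name `watch_estimate` and the assumption that every node watches every running job are only an illustration of how the "10k * 10k" lower bound arises.

```python
def watch_estimate(nodes, running_jobs, containers_per_node):
    """Return (minimum watches, total containers) for the scenario described.

    Assumes one watch per (node, job) pair, which is the reading behind
    the "10k * 10k" figure; a hypothetical helper for illustration only.
    """
    min_watches = nodes * running_jobs
    total_containers = nodes * containers_per_node
    return min_watches, total_containers


watches, low = watch_estimate(10_000, 10_000, 25)
_, high = watch_estimate(10_000, 10_000, 30)
print(f"watches >= {watches:,}")           # watches >= 100,000,000
print(f"containers: {low:,} to {high:,}")  # containers: 250,000 to 300,000
```

So the lower bound is on the order of a hundred million watches, alongside a quarter of a million or more live containers.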
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997599#comment-12997599 ] Jeff Hammerbacher commented on MAPREDUCE-279: - Hey Arun, As long as that evolution is happening in a branch, that seems totally reasonable to me. When it comes time to migrate the code into trunk, I hope for the same end state as Matei: I think the resource management system should be a separate project from MapReduce so that each system can evolve and release separately. When we have more clients than just MapReduce for the resource manager, we'll want those new clients to evolve as separate projects rather than all living under the Apache Hadoop umbrella. Now seems like an excellent time to facilitate that end state. More specifically, in an ideal world, we'd have four separate projects here: Common (probably folded into Guava or Apache Commons), HDFS, Yahoo! Cluster Manager (Resource Manager + Node Manager), and MapReduce (the ApplicationMaster for MapReduce, I guess). Then, if someone wanted to write Pregel to run against the Cluster Manager, they could implement their own ApplicationMaster in a separate project. Similarly, if someone wanted to run MapReduce against a different cluster manager, that would be simple. More practically, we have the opportunity to get the Cluster Manager project started up as a separate ASF project once it has gestated in a branch here for a bit. Are there any technical barriers to making that happen? I'm a huge fan of this work, and having watched a number of ASF projects evolve over the past several years, I suspect that a small, focused project dedicated to cluster resource management will have the best chance of moving quickly. 
Thanks, Jeff
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997614#comment-12997614 ] eric baldeschwieler commented on MAPREDUCE-279:

Hi Jeff, A couple of thoughts:

1) Discussions on how to reorganize the hadoop universe probably should be moved from this bug to their own thread. Can we restrict this thread to discussions about the design and implementation of this work? Feel free to start this discussion on general or in JIRA.
2) I agree with you that it is important that we structure hadoop so that it is easy to plug in and use other technologies, and I would welcome your contribution of code to help make that a reality in this case.
3) My experience with the project split has been very negative. It is becoming much harder, not easier to evolve the hadoop code base. Hence Nigel's suggestions (which I support) to actually move the projects closer together. Since map-reduce is the core of Hadoop, I think it is important that Hadoop remain able to deliver the world's best MR solution within the project.
4) We consider this work a natural evolution of the MR project. Please don't refer to it as Yahoo! cluster manager. That will just confuse the discussion. The intent is to complete this work in apache and others are more than welcome to help us with it.

Thanks, E14
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997648#comment-12997648 ] Scott Carey commented on MAPREDUCE-279:

{quote}
Consider a 10k node cluster with 25-30 containers per node and 10k running jobs - we'd need at least 10k * 10k watches which is a lot for ZK
{quote}

Thanks for the info Arun. There would be a lot to work out to mix in ZK and not run into a scalability wall. If you assume that each node has to watch every job, it's not going to scale. If each node is only watching one thing when in need of work ("Is there work for me?") you can get rid of a large chunk of the RPC that causes delayed task starts. I'm mainly thinking of the "is there work for me now? what about now? And now?" RPC that goes on in hadoop today. That could be inverted into "flag three nodes with local data simultaneously that there is work for them, the first to grab the item wins".

How valuable is replacing just part of the RPC? I'm not sure. It would help my clusters, but they aren't that big. The other part of the scheduling problem you allude to, which requires scanning all available jobs and assigning resources, would need some clever work to do in ZK without scalability problems.

On a related item, I am glad that job submission includes a DAG of tasks. There is a lot of opportunity to reduce latency in job flows there and consolidate work from a half-dozen projects duplicating effort.

{quote}
It is becoming much harder, not easier to evolve the hadoop code base.
{quote}

The choice to have all three projects be in their own trunk/tags/branches was a mistake IMO. I've done the same elsewhere and learned the hard way: don't put projects under different version trees unless you intend to actually completely decouple them *and* release them separately.

Hadoop needs more modularity and pluggability, but making Cluster Management and Application Management pluggable does not depend on separate projects; it's the other way around. Hadoop needs to become more modular internally, its build more sophisticated, and the build outputs more flexible. After a user can swap out foo-resource-manager.jar with hadoop-resource-manager.jar behind resource-manager-api.jar and expect it to work, a separate project for the hadoop-resource-manager could make sense.

That said, I agree with Eric's #1 -- future modularity and this work are separate discussions / items. IMO any greater project restructuring related to cluster management depends on this, and not the other way around. A project split should not be the enforcer of modularity; actual proven modularity should be the justification for a split. If one is afraid that without a project split, things are bound to be intertwined, other solutions should be found. Releasing separate jars for the components is one way to move forward that does not need a project split -- though it might require Maven to make it easy to manage and make a split much easier.
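The "flag three nodes with local data, the first to grab the item wins" idea from the comment above can be sketched in Python as an atomic claim on a work item. This is an in-process stand-in only: a `threading.Lock` plays the role that an atomic create in a coordination service would play, and the class name `WorkItem`, its fields, and `try_claim` are hypothetical. It says nothing about how ZooKeeper would behave at the scale discussed in this thread.

```python
import threading


class WorkItem:
    """A task offered to a few candidate nodes; exactly one may claim it."""

    def __init__(self, task_id, candidate_nodes):
        self.task_id = task_id
        self.candidates = frozenset(candidate_nodes)  # nodes that were flagged
        self._owner = None
        self._lock = threading.Lock()

    def try_claim(self, node):
        """Return True only for the first flagged node that asks."""
        if node not in self.candidates:
            return False
        with self._lock:                 # makes check-and-set atomic
            if self._owner is None:
                self._owner = node
                return True
            return False

    @property
    def owner(self):
        return self._owner


item = WorkItem("task_0001", ["node-a", "node-b", "node-c"])
print(item.try_claim("node-b"))  # first claimant wins
print(item.try_claim("node-a"))  # later claimants lose
print(item.owner)
```

Offering each item to only a small set of candidates is what keeps the number of simultaneous claimants bounded, as opposed to notifying every node about every item.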
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12997654#comment-12997654 ] Joydeep Sen Sarma commented on MAPREDUCE-279: -
I have been working on maintaining/enhancing MR for FB's use case for the last 6 months or so. Here are a few priority areas for us that are relevant to this discussion:
# Latency. This is, imho, the #1 priority wrt scheduling. As Scott has already remarked, the ping-response model is broken. So is preemption as an afterthought. We need to get small/medium jobs scheduled instantly. Period.
# Scalability. We have made a number of vital fixes to keep the JT working at our scale - but we have merely bought some time.
# (wrt. ResourceManager) Open API. By which I mean something like Thrift/PB/Avro. We are, of course, most comfortable with Thrift and it would be nice (but not critical) if it were possible to build a Thrift wrapper (even if one was not baked in from scratch). One thing I have found is that writing Thrift services is a breeze because of the inbuilt service framework. Everything else on the serialization side being equal - this has been a big win for me personally as a developer (and something to be considered as other distributed execution frameworks try to use the RM).
# Ability to back-plug into older Hadoop versions. This is related to #3. Unlike many other organizations - we cannot make big jumps in Hadoop revisions anymore. We have too many custom changes and we don't have a QA department. Unlike in the past, where we could have depended on Yahoo's QA'ed releases - we don't have that luxury anymore (because we are now both running software at similar versions - and we can't wait until Yahoo has deployed/QA'ed new versions before deploying newer upgrades). If the RM API is open (and satisfactory from a design perspective) - we can do the work in-house on our older version of Hadoop to use it. This is critical for us (although I am not sure it applies to other users).
I cannot emphasize enough the urgency around #1. Whether we continue to use Hadoop or not is predicated on big improvements in latency and efficiency (the latter is a different topic).
I hope #3 and #4 contribute to the discussion around component architecture. At our scale - I don't think we can build services using large software that is tightly integrated. We need too much customization and we can't afford the long upgrade cycles of such tightly integrated software. Of course, this is specific to our deployment and the requirements of most other deployments are likely to be quite different.
As a developer - I have found the current JobTracker code totally unmaintainable - I hope the new version (broken across RM/App-Master) is better. There are several design points that have struck me as particularly evil:
# synchronous RPC based architecture: limits concurrency and forces bad implementation choices (see #2)
# crazy locking: this is just bad implementation for the most part - but I hope the new design/implementation clearly articulates some principles around the fundamental data structures and how transactional changes to these data structures are meant to be accomplished.
# poor data structure maintenance: 99% of the data structures in the JT follow a pattern of (a) a primary collection (e.g. the list of all jobs in the system) and (b) several secondary indices/views (the list of all runnable jobs from above, the list of all completed jobs, etc.). Instead of modeling updates to such collections and related views through a common entry point - updates to primary and secondary data structures are at disjoint places throughout the code and make maintenance of the code a nightmare.
I can only hope that a big rewrite like this will try to address some of these issues (others - like hard-wiring to specific (M/R) task types - are already addressed, I presume, in the new RM). my 2c.
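The third design point above (a primary collection plus secondary views, all updated through one entry point) can be sketched as follows. This is an illustrative toy, not JobTracker code: `JobRegistry`, `set_state`, `jobs_in` and the state names are all hypothetical.

```python
from collections import defaultdict


class JobRegistry:
    """Primary job collection plus per-state views, updated in one place."""

    STATES = ("PENDING", "RUNNABLE", "COMPLETED")

    def __init__(self):
        self._jobs = {}                    # primary collection: job_id -> state
        self._by_state = defaultdict(set)  # secondary views: state -> job ids

    def set_state(self, job_id: str, state: str) -> None:
        """The single entry point: primary and secondary structures change together."""
        if state not in self.STATES:
            raise ValueError(f"unknown state: {state}")
        old_state = self._jobs.get(job_id)
        if old_state is not None:
            self._by_state[old_state].discard(job_id)
        self._jobs[job_id] = state
        self._by_state[state].add(job_id)

    def jobs_in(self, state: str) -> list:
        """Read-only view, e.g. the list of all runnable jobs."""
        return sorted(self._by_state[state])

    def all_jobs(self) -> list:
        return sorted(self._jobs)


registry = JobRegistry()
registry.set_state("job_1", "PENDING")
registry.set_state("job_2", "RUNNABLE")
registry.set_state("job_1", "RUNNABLE")
registry.set_state("job_2", "COMPLETED")
```

Because no other code path touches `_jobs` or `_by_state`, the views cannot drift out of sync with the primary collection, and a single lock around `set_state` would presumably be enough to make each transition atomic.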
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12996851#comment-12996851 ] Arun C Murthy commented on MAPREDUCE-279: - Jeff, the prototype uses a significant amount of Hadoop MapReduce code, especially in the MapReduce ApplicationMaster for running MR jobs. There is a new ResourceManager/NodeManager, but we still need to co-evolve and stabilize the entire codebase to serve our primary aim: running Hadoop MapReduce applications. After all, this is a re-factor of Hadoop MapReduce. Moving to different projects is premature... it would be very reminiscent of the issues we have with Common and HDFS, i.e. every change might be spread over multiple projects, which is a logistical nightmare for developers. Eventually, once we have a few releases under our belt, successful deployments etc., we might be in a better place to revisit this proposal. Make sense?
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995768#comment-12995768 ] Jeff Hammerbacher commented on MAPREDUCE-279: - Hey Arun, Wow, thanks for reviving one of my favorite old issues! One question: how much code does the prototype share with Hadoop MapReduce? My understanding is that it's mostly new code. If that's the case, have you considered creating a separate Apache Incubator project for the new two-level scheduler? Mesos, for example, has similar aims and is going the Apache Incubator route. I am aware of at least one other effort on this front, and having the projects gestate in the Incubator rather than as a branch of Hadoop would allow them to release more regularly while young and would be more in line with the dreams of the project split (having HDFS and MapReduce developed as separate projects). This seems like a great opportunity to continue the trend of keeping individual ASF projects small and focused so that releases require less work and can happen more regularly. What do you think? Later, Jeff
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995771#comment-12995771 ] eric baldeschwieler commented on MAPREDUCE-279: --- I'm way out of the office, I'm helping with the newest addition to our family, Jack baldeschwieler Yoshikawa. Todd Papaioannou (p9u) is action head of Hadoop. Most line issues can continue to go to Amol, Kazi, Satish, Avik or Senthil as appropriate. I'll be back on roughly march 9th. CUSoon, E14
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12996189#comment-12996189 ] Matei Zaharia commented on MAPREDUCE-279: - +1 on decoupling Hadoop MapReduce from the resource management system in a way that allows Hadoop to run on top of other cluster scheduling systems as well. Apart from simplifying experimentation with these types of two-level schedulers, I think this would be a good thing for the MapReduce project in general as a way to make the project runnable in the widest variety of environments. For example, there have already been efforts to get Hadoop running on HPC schedulers (e.g. Grid Engine) or Condor, and that would be quite a bit easier with the refactoring that Arun is doing. I imagine that there will be a lot of other work in cluster scheduling in future years, especially as people start running more non-MapReduce applications, so it would be nice to be able to run the Hadoop software stack in these environments.
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994693#comment-12994693 ] eric baldeschwieler commented on MAPREDUCE-279: --- We're having a baby! Todd Papaioannou (p9u) is action head of Hadoop. Most line issues can continue to go to Amol, Kazi, Satish, Avik or Senthil as appropriate. I'll be back on roughly march 9th. CUSoon, E14
[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0
[ https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994928#comment-12994928 ] Arun C Murthy commented on MAPREDUCE-279: -
h5. Proposal
The fundamental idea of the re-factor is to divide the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate components: a generic resource scheduler and a per-job, user-defined component that manages the application execution.
The new ResourceManager manages the global assignment of compute resources to applications and the per-application ApplicationMaster manages the application's scheduling and coordination. An application is either a single job in the classic MapReduce sense or a DAG of such jobs. The ResourceManager and the per-machine NodeManager server, which manages the user processes on that machine, form the computation fabric. The per-application ApplicationMaster is, in effect, a framework-specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.
The ResourceManager is a pure scheduler in the sense that it performs no monitoring or tracking of status for the application. Also, it offers no guarantees on restarting tasks that fail due to either application failure or hardware failure.
The ResourceManager performs its scheduling function based on the resource requirements of the applications; each application has multiple resource request types that represent the resources required for containers. The resource requests include memory, CPU, disk, network etc. Note that this is a significant change from the current model of fixed-type slots in Hadoop MapReduce, which has a significant negative impact on cluster utilization. The ResourceManager has a scheduler policy plug-in, which is responsible for partitioning the cluster resources among various queues, applications etc. Scheduler plug-ins can be based, e.g., on the current CapacityScheduler and FairScheduler.
The NodeManager is the per-machine framework agent that is responsible for launching the applications' containers, monitoring their resource usage (CPU, memory, disk, network) and reporting the same to the Scheduler.
The per-application ApplicationMaster has the responsibility of negotiating appropriate resource containers from the Scheduler, launching tasks, tracking their status, monitoring for progress, handling task failures and recovering from saved state on a ResourceManager fail-over.
Since downtime is more expensive at scale, high availability is built in from the beginning via Apache ZooKeeper for the ResourceManager and HDFS checkpoints for the MapReduce ApplicationMaster. Security and multi-tenancy support is critical to support many users on the larger clusters. The new architecture will also increase innovation and agility by allowing for user-defined versions of the MapReduce runtime. Support for generic resource requests will increase cluster utilization by removing artificial bottlenecks such as hard-partitioning of resources into map and reduce slots.
We have a *prototype* we'd like to commit to a branch soon, where we look forward to feedback. From there on, we would love to collaborate to get it committed to trunk.
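The utilization argument in the proposal (generic resource requests versus fixed map/reduce slots) can be illustrated with a toy allocator. This is only a sketch under assumed numbers: the 16 GB / 8-core node, the 2 GB / 1-core container size, the 4 + 4 slot split, and the names `Resource`, `allocate_by_resource` and `allocate_by_slots` are all hypothetical and are not taken from the prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    cpu_cores: int

    def fits_in(self, free: "Resource") -> bool:
        return self.memory_mb <= free.memory_mb and self.cpu_cores <= free.cpu_cores

    def minus(self, used: "Resource") -> "Resource":
        return Resource(self.memory_mb - used.memory_mb,
                        self.cpu_cores - used.cpu_cores)


def allocate_by_resource(capacity: Resource, requests: list) -> list:
    """Grant each (name, Resource) request while it still fits on the node."""
    free, granted = capacity, []
    for name, wanted in requests:
        if wanted.fits_in(free):
            free = free.minus(wanted)
            granted.append(name)
    return granted


def allocate_by_slots(slots: dict, tasks: list) -> list:
    """Grant each (name, kind) task only if a slot of that fixed kind is free."""
    free, granted = dict(slots), []
    for name, kind in tasks:
        if free.get(kind, 0) > 0:
            free[kind] -= 1
            granted.append(name)
    return granted


# Assumed workload: eight pending map tasks and no reduce tasks.
node = Resource(memory_mb=16_384, cpu_cores=8)
container = Resource(memory_mb=2_048, cpu_cores=1)

resource_grants = allocate_by_resource(
    node, [(f"map_{i}", container) for i in range(8)])
slot_grants = allocate_by_slots(
    {"map": 4, "reduce": 4}, [(f"map_{i}", "map") for i in range(8)])
```

Under these assumptions the resource-based allocator places all eight map tasks, while the slot-based one places only four and leaves the reduce half of the node idle, which is the kind of artificial bottleneck the proposal describes.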