[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129523#comment-13129523
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Mapreduce-0.23-Commit #17 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/17/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185489
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt


 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279-script-20110817.sh, MR-279-script-final.sh, 
 MR-279-script.sh, MR-279-script.sh, MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move-20110817.txt, MR-279_MR_files_to_move.txt, 
 MR-279_MR_files_to_move.txt, MapReduce_NextGen_Architecture.pdf, 
 NodeManager.gv, NodeManager.png, ResourceManager.gv, ResourceManager.png, 
 capacity-scheduler-dark-theme.png, hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, post-move-patch-20110817.2.txt, 
 post-move-patch-final.txt, post-move.patch, post-move.patch, post-move.patch, 
 yarn-state-machine.job.dot, yarn-state-machine.job.png, 
 yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, 
 yarn-state-machine.task.dot, yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129693#comment-13129693
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Hdfs-0.23-Build #43 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/43/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185489
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129710#comment-13129710
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Mapreduce-trunk #864 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/864/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129728#comment-13129728
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Hdfs-trunk #834 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/834/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129509#comment-13129509
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1179 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1179/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129510#comment-13129510
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Common-trunk-Commit #1100 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1100/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129512#comment-13129512
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Common-0.23-Commit #15 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/15/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185489
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129513#comment-13129513
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Hdfs-0.23-Commit #16 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/16/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185489
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-10-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129517#comment-13129517
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1119 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1119/])
MAPREDUCE-279. Adding a changelog to branch-0.23.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1185488
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-09-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101159#comment-13101159
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Hdfs-trunk #788 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/788/])
Adding back 
hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources which 
was missed during the merge of MAPREDUCE-279.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166972
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-09-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101246#comment-13101246
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Mapreduce-trunk #812 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/812/])
Adding back 
hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources which 
was missed during the merge of MAPREDUCE-279.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166972
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-09-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100902#comment-13100902
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Common-trunk-Commit #857 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/857/])
Adding back 
hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources which 
was missed during the merge of MAPREDUCE-279.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166972
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-09-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100906#comment-13100906
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #868 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/868/])
Adding back 
hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources which 
was missed during the merge of MAPREDUCE-279.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1166972
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider






[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-09-04 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096835#comment-13096835
 ] 

Sharad Agarwal commented on MAPREDUCE-279:
--

Thanks Binglin, it is incredibly useful. I have filed MAPREDUCE-2930 where you 
may want to contribute the patch. It will help to keep the graphs up to date.





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-09-03 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096591#comment-13096591
 ] 

Binglin Chang commented on MAPREDUCE-279:
-

bq. Ultimately a version of these should be produced natively in some 
StateMachine method (toDot()?), and I think Chris Douglas may take that up 
eventually. However, some of the desirable info (e.g., which states send events 
to or receive them from other state machines) can't really be discovered 
automatically, so there will continue to be a place for hand-rolled graphs.

What's the current progress of this work? I find that visualizing the state 
machines really helps when reading & learning the MRv2 code, both YARN & MRv2. 
I added some code to yarn-common to generate Graphviz dot files automatically 
while learning the YARN code yesterday. It works fine for me, and it may be 
useful for others too.
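
As a rough illustration of the idea discussed above (this is not the code from 
the comment, and nothing here is taken from yarn-common), a state machine whose 
transitions are available as a list can be written out as a Graphviz digraph 
with one labelled edge per transition. The class `DotSketch`, the `Transition` 
type, the `toDot` helper, and the state and event names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from YARN or MRv2.
final class DotSketch {

    /** One edge of a state machine: 'from' moves to 'to' when 'event' arrives. */
    static final class Transition {
        final String from;
        final String to;
        final String event;

        Transition(String from, String to, String event) {
            this.from = from;
            this.to = to;
            this.event = event;
        }
    }

    /**
     * Renders the transitions as Graphviz dot text, one labelled edge each.
     * Assumes state names are plain identifiers (letters, digits, underscores)
     * and event names contain no double quotes, so no escaping is attempted.
     */
    static String toDot(String graphName, List<Transition> transitions) {
        StringBuilder sb = new StringBuilder();
        sb.append("digraph ").append(graphName).append(" {\n");
        for (Transition t : transitions) {
            sb.append("  ").append(t.from).append(" -> ").append(t.to)
              .append(" [label=\"").append(t.event).append("\"];\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up states and events, for illustration only.
        List<Transition> transitions = new ArrayList<>();
        transitions.add(new Transition("NEW", "RUNNING", "START"));
        transitions.add(new Transition("RUNNING", "DONE", "FINISH"));
        System.out.print(toDot("job", transitions));
    }
}
```

The output can be saved to a file and rendered with the Graphviz `dot` tool, 
for example `dot -Tpng job.dot -o job.png`. As the quoted remark notes, edges 
that cross between state machines would still have to be added by hand.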

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279-script-20110817.sh, MR-279-script-final.sh, 
 MR-279-script.sh, MR-279-script.sh, MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move-20110817.txt, MR-279_MR_files_to_move.txt, 
 MR-279_MR_files_to_move.txt, MapReduce_NextGen_Architecture.pdf, 
 capacity-scheduler-dark-theme.png, hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, post-move-patch-20110817.2.txt, 
 post-move-patch-final.txt, post-move.patch, post-move.patch, post-move.patch, 
 yarn-state-machine.job.dot, yarn-state-machine.job.png, 
 yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, 
 yarn-state-machine.task.dot, yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-17 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086354#comment-13086354
 ] 

Thomas Graves commented on MAPREDUCE-279:
-

I think the move of mapreduce to hadoop-mapreduce got lost in the latest 
MR-279-script-20110817.sh.

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279-script-20110817.sh, MR-279-script.sh, 
 MR-279-script.sh, MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move-20110817.txt, MR-279_MR_files_to_move.txt, 
 MR-279_MR_files_to_move.txt, MapReduce_NextGen_Architecture.pdf, 
 capacity-scheduler-dark-theme.png, hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, post-move-patch-20110817.2.txt, 
 post-move.patch, post-move.patch, post-move.patch, 
 yarn-state-machine.job.dot, yarn-state-machine.job.png, 
 yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, 
 yarn-state-machine.task.dot, yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-16 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085979#comment-13085979
 ] 

Philip Zeyliger commented on MAPREDUCE-279:
---

I will return on the 24th.  For urgent matters, please contact my
teammates or Amr.

Thanks,

-- Philip


 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279-script.sh, MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, MR-279_MR_files_to_move.txt, 
 MapReduce_NextGen_Architecture.pdf, capacity-scheduler-dark-theme.png, 
 hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, post-move.patch, 
 yarn-state-machine.job.dot, yarn-state-machine.job.png, 
 yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, 
 yarn-state-machine.task.dot, yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-16 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086039#comment-13086039
 ] 

Alejandro Abdelnur commented on MAPREDUCE-279:
--

I've just applied the patch following the instructions and it compiles fine.

Yesterday I opened a JIRA with things to improve, MAPREDUCE-2842.

IMO, most of those can be done incrementally after this patch goes in.

What I think should be done as part of this patch (MAPREDUCE-279) is the 
artifact/maven-module-dir names.

All artifact names should be prefixed with 'hadoop-' (the JARs get the artifact 
names, so it will be easier to troubleshoot and identify the JARs).

In addition, the maven-module-dir should be the same, to make it easier for 
developers to find their way around.

The reason for proposing to do this as part of this patch is to avoid doing 2 
huge moves of files in SVN.

(HDFS-2096 is aligned to this naming already)

Thanks



 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279-script.sh, MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, MR-279_MR_files_to_move.txt, 
 MapReduce_NextGen_Architecture.pdf, capacity-scheduler-dark-theme.png, 
 hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, post-move.patch, 
 yarn-state-machine.job.dot, yarn-state-machine.job.png, 
 yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, 
 yarn-state-machine.task.dot, yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-16 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086048#comment-13086048
 ] 

Alejandro Abdelnur commented on MAPREDUCE-279:
--

I've just updated MAPREDUCE-2842 with a proposed naming for 
artifacts/module-dirs.

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279-script.sh, MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, MR-279_MR_files_to_move.txt, 
 MapReduce_NextGen_Architecture.pdf, capacity-scheduler-dark-theme.png, 
 hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, post-move.patch, 
 yarn-state-machine.job.dot, yarn-state-machine.job.png, 
 yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, 
 yarn-state-machine.task.dot, yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-16 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086053#comment-13086053
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

Thanks Alejandro, I do agree that we should avoid 2 huge svn moves if we 
can - let me try to fix up the scripts to be in line with your proposals.

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279-script.sh, MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, MR-279_MR_files_to_move.txt, 
 MapReduce_NextGen_Architecture.pdf, capacity-scheduler-dark-theme.png, 
 hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, post-move.patch, 
 yarn-state-machine.job.dot, yarn-state-machine.job.png, 
 yarn-state-machine.task-attempt.dot, yarn-state-machine.task-attempt.png, 
 yarn-state-machine.task.dot, yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085217#comment-13085217
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Common-trunk-Commit #742 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/742/])
MAPREDUCE-2837. Ported bug fixes from y-merge to prepare for MAPREDUCE-279 
merge.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157249
Files : 
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestTokenCache.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ACLsManager.java
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestBinaryTokenFile.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapTask.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Job.java
* /hadoop/common/trunk/mapreduce/src/test/mapred-site.xml
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/security/TokenCache.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapOutputFile.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ReduceTask.java
* /hadoop/common/trunk/mapreduce/src/webapps/job/jobdetailshistory.jsp
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/security/TestMapredGroupMappingServiceRefresh.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobACLsManager.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskMemoryManagerThread.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MROutputFiles.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRConfig.java
* 
/hadoop/common/trunk/mapreduce/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/LocalJobRunner.java
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMapRed.java


 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, MapReduce_NextGen_Architecture.pdf, 
 capacity-scheduler-dark-theme.png, hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084826#comment-13084826
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Mapreduce-trunk #754 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/754/])
MAPREDUCE-2837. Ported bug fixes from y-merge to prepare for MAPREDUCE-279 
merge.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157249
Files : 
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestTokenCache.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ACLsManager.java
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestBinaryTokenFile.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapTask.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Job.java
* /hadoop/common/trunk/mapreduce/src/test/mapred-site.xml
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/security/TokenCache.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapOutputFile.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ReduceTask.java
* /hadoop/common/trunk/mapreduce/src/webapps/job/jobdetailshistory.jsp
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/security/TestMapredGroupMappingServiceRefresh.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobACLsManager.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskMemoryManagerThread.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MROutputFiles.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRConfig.java
* 
/hadoop/common/trunk/mapreduce/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/LocalJobRunner.java
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMapRed.java


 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, MapReduce_NextGen_Architecture.pdf, 
 capacity-scheduler-dark-theme.png, hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084375#comment-13084375
 ] 

Hudson commented on MAPREDUCE-279:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #763 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/763/])
MAPREDUCE-2837. Ported bug fixes from y-merge to prepare for MAPREDUCE-279 
merge.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157249
Files : 
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestTokenCache.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ACLsManager.java
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/security/TestBinaryTokenFile.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapTask.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/Job.java
* /hadoop/common/trunk/mapreduce/src/test/mapred-site.xml
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/security/TokenCache.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MapOutputFile.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ReduceTask.java
* /hadoop/common/trunk/mapreduce/src/webapps/job/jobdetailshistory.jsp
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/security/TestMapredGroupMappingServiceRefresh.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskTracker.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskFinishedEvent.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobACLsManager.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskMemoryManagerThread.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/MROutputFiles.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryParser.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Task.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/MRConfig.java
* 
/hadoop/common/trunk/mapreduce/src/examples/org/apache/hadoop/examples/terasort/TeraInputFormat.java
* 
/hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/LocalJobRunner.java
* 
/hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestMapRed.java


 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, MapReduce_NextGen_Architecture.pdf, 
 capacity-scheduler-dark-theme.png, hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-07-08 Thread Giridharan Kesavan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061781#comment-13061781
 ] 

Giridharan Kesavan commented on MAPREDUCE-279:
--

Build setup on MR-279 branch
https://builds.apache.org/job/Hadoop-MR-279-Build/

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, capacity-scheduler-dark-theme.png, 
 hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-07-07 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061249#comment-13061249
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-279:
---

bq. What kind of results or terminal output should I expect?
Bill, the terminal output should be 'almost' the same as what you see with 
Hadoop 0.20.

Please create separate tickets or use the mapreduce-...@hadoop.apache.org 
mailing list. This one is an umbrella ticket that many people are watching.

Thanks.

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, capacity-scheduler-dark-theme.png, 
 hadoop_contributors_meet_07_01_2011.pdf, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-07-05 Thread Bill Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059688#comment-13059688
 ] 

Bill Lee commented on MAPREDUCE-279:


On this page: 
http://svn.apache.org/repos/asf/hadoop/common/branches/MR-279/mapreduce/INSTALL

After running the last command:

  $HADOOP_COMMON_HOME/bin/hadoop jar \
    $HADOOP_MAPRED_HOME/build/hadoop-mapred-examples-0.22.0-SNAPSHOT.jar \
    randomwriter -Dmapreduce.job.user.name=$USER \
    -Dmapreduce.randomwriter.bytespermap=1 -Ddfs.blocksize=536870912 \
    -Ddfs.block.size=536870912 -libjars \
    $HADOOP_YARN_INSTALL/hadoop-mapreduce-1.0-SNAPSHOT/modules/hadoop-mapreduce-client-jobclient-1.0-SNAPSHOT.jar \
    output

What kind of results or terminal output should I expect?

Thank you.

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, capacity-scheduler-dark-theme.png, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-07-05 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059690#comment-13059690
 ] 

eric baldeschwieler commented on MAPREDUCE-279:
---

I have joined Hortonworks and am no longer at Yahoo!.  Please re-send your 
message to my non-Yahoo! email address.


 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, capacity-scheduler-dark-theme.png, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-30 Thread Giridharan Kesavan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058060#comment-13058060
 ] 

Giridharan Kesavan commented on MAPREDUCE-279:
--

Nigel/Arun, I can help set up a build on MR-279.

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, capacity-scheduler-dark-theme.png, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-29 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057015#comment-13057015
 ] 

Nigel Daley commented on MAPREDUCE-279:
---

Arun, are you planning to get a Jenkins build running on this branch before 
merge?

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, capacity-scheduler-dark-theme.png, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution.
 Check it out by following [the instructions|http://goo.gl/rSJJC].





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-15 Thread Haoyuan Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049918#comment-13049918
 ] 

Haoyuan Li commented on MAPREDUCE-279:
--

This page doesn't work anymore: 
http://svn.apache.org/repos/asf/hadoop/mapreduce/branches/MR-279/INSTALL

Is there any new page to replace this?

Thank you.

 Map-Reduce 2.0
 --

 Key: MAPREDUCE-279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker, tasktracker
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 0.23.0

 Attachments: MR-279.patch, MR-279.patch, MR-279.sh, 
 MR-279_MR_files_to_move.txt, capacity-scheduler-dark-theme.png, 
 multi-column-stable-sort-default-theme.png, yarn-state-machine.job.dot, 
 yarn-state-machine.job.png, yarn-state-machine.task-attempt.dot, 
 yarn-state-machine.task-attempt.png, yarn-state-machine.task.dot, 
 yarn-state-machine.task.png


 Re-factor MapReduce into a generic resource scheduler and a per-job, 
 user-defined component that manages the application execution. 





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-15 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049927#comment-13049927
 ] 

Mahadev konar commented on MAPREDUCE-279:
-

Haoyuan, 
 Because of the svn unsplit, things have moved. The new link is:

http://svn.apache.org/repos/asf/hadoop/common/branches/MR-279/mapreduce/INSTALL





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-15 Thread Haoyuan Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13050094#comment-13050094
 ] 

Haoyuan Li commented on MAPREDUCE-279:
--

Thank you Mahadev.





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-14 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049257#comment-13049257
 ] 

Eli Collins commented on MAPREDUCE-279:
---

Is there an MR2 design doc? A couple of people have asked me about this; it 
would be very useful to share.





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-14 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13049360#comment-13049360
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

Sigh, I keep missing this.

I have a slightly old version I'll spruce up and post. Thanks for the reminder.





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-12 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13048423#comment-13048423
 ] 

Nigel Daley commented on MAPREDUCE-279:
---

Given these build issues (and just good engineering practice), I'd like to see 
a Jenkins CI build on this branch so we know that, when it is merged to trunk, 
the builds won't be (more) broken. 





[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-06-11 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047867#comment-13047867
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

Yep, it would be nice to completely mavenize and I strongly believe it should 
be our goal.

Maybe we can do it in stages, have a hybrid one on day one when we merge MR-279 
into trunk and then do the whole nine yards? That way each can proceed 
independently. Currently it's becoming painful to manage a large branch and 
hence my suggestion to get it into trunk and do mavenization independently. 
Thoughts?



[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-05-26 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13039993#comment-13039993
 ] 

Tom White commented on MAPREDUCE-279:
-

I'm wondering what the maven modules might look like for this when integrated 
into trunk. Something like:

* api - containing the user-facing public API of MapReduce (from 
org.apache.hadoop.mapred(uce)). When MAPREDUCE-1638 is done it will be possible 
to split the API into a self-contained tree (no dependencies on other parts of 
MapReduce). 
* lib - containing the user-facing public MapReduce libraries (from 
org.apache.hadoop.mapred and org.apache.hadoop.mapred(uce).lib). There's a 
patch in MAPREDUCE-1478 to perform this separation.
* classic-impl - containing the implementation classes for MapReduce. This is 
what's left over after doing MAPREDUCE-1638 and MAPREDUCE-1478.
* nextgen-impl - this is mr-client in the MR-279 branch, which I think should 
be renamed, since it's not immediately clear what it's a client of in the 
context of the whole MapReduce project. It has submodules app, common, hs, 
jobclient, shuffle.
* yarn - the yarn framework from the MR-279 branch. Yarn is broken into 
submodules too.
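
As a rough illustration of the layout proposed above, the modules can be written down as a small dependency graph. This is only a sketch: the module names come from the list, but every dependency edge below is an assumption made for the example and is not taken from any actual pom.xml, and `transitive_deps` is a hypothetical helper.

```python
# Hypothetical module graph based on the list above; the dependency edges
# are assumptions for illustration, not taken from any real pom.xml.
MODULE_DEPS = {
    "api": [],                        # meant to be self-contained (MAPREDUCE-1638)
    "lib": ["api"],
    "classic-impl": ["api", "lib"],
    "yarn": [],
    "nextgen-impl": ["api", "lib", "yarn"],
}


def transitive_deps(module, graph=MODULE_DEPS):
    """Return every module reachable from `module` by following dependency edges."""
    seen = set()
    stack = list(graph[module])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph[dep])
    return seen
```

With edges like these, `transitive_deps("api")` is empty, which is the "self-contained tree" property described for the api module, and nextgen-impl never reaches classic-impl.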

Given the progress on mavenizing common (HADOOP-6671), is it worth integrating 
MAPREDUCE-279 at the same time as doing the full Mavenization of MapReduce? 
That would seem ideal, but perhaps there's an alternative I haven't considered. 




[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-03-21 Thread Michael Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009098#comment-13009098
 ] 

Michael Lee commented on MAPREDUCE-279:
---

Cannot build: the build of hadoop-mapred-279 fails when following the 
instructions in 
http://svn.apache.org/repos/asf/hadoop/mapreduce/branches/MR-279/INSTALL

Output when building hadoop-mapred-279:
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.3.2:testCompile 
(default-testCompile) on project yarn-common: Compilation failure
[ERROR] 
/home/michael/work/hadoop-mapred-279/yarn/yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPC.java:[80,37]
 incompatible types
[ERROR] found   : java.util.ArrayList<java.lang.CharSequence>
[ERROR] required: org.apache.avro.generic.GenericArray<java.lang.CharSequence>
[ERROR] -> [Help 1]
[ERROR] 

My ENV:
Maven:
Apache Maven 3.0.3 (r1075438; 2011-03-01 01:31:09+0800)
Maven home: /home/michael/local/apache-maven-3.0.3
Java version: 1.6.0_07, vendor: Sun Microsystems Inc.
Java home: /home/michael/local/java6/jre
Default locale: en_US, platform encoding: ANSI_X3.4-1968
OS name: linux, version: 2.6.9_5-4-0-3, arch: amd64, family: unix

JDK:
java version 1.6.0_07
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)

Ant:
Apache Ant version 1.7.0 compiled on December 13 2006

avro-maven-plugin:
using snapshot from: https://github.com/phunt/avro-maven-plugin/, 1.4.0 branch



[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-03-21 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009183#comment-13009183
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

bq. Looking through the code a bit more I came across Hamlet.

Luke can provide more details, but I believe he took this route due to the lack 
of a better 'embeddable' alternative.

Having said that, echoing eric14, please feel free to open a jira with an 
alternate proposal and we can consider moving over to something more standard 
that satisfies our constraints. Alternately, in the long run, we could move 
Hamlet out to a separate (incubator?) project to attempt to build a community 
around it.



[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-03-21 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009185#comment-13009185
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

I've opened MAPREDUCE-2399 to discuss Hamlet. Please use that jira so that we 
can keep MAPREDUCE-279 focussed on the next-gen MR framework. Thanks.



[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-03-21 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009200#comment-13009200
 ] 

Doug Cutting commented on MAPREDUCE-279:


Sharad: "Had to live with .genavro as the maven plugin 
(https://github.com/phunt/avro-maven-plugin) has not been updated yet to work 
with the new extension."

FYI, a Maven plugin is included in Avro 1.5.0 that uses the .avdl file suffix.

Todd: "Does AvroIDL convert javadoc-style comments on records/protocols into 
JavaDoc on generated code?"

Not yet (AVRO-296).



[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-03-21 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009381#comment-13009381
 ] 

Konstantin Boudnik commented on MAPREDUCE-279:
--

Not to start a religious war or anything, but I am kinda wondering why not 
use a standard Java webapp framework such as Grails? There's a huge community 
working on it and there are a lot of people with expertise, which would help 
ease the development of user applications on top of MR2.0.



[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-03-21 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009385#comment-13009385
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

Cos, again, can you please use MAPREDUCE-2399 to discuss the specifics of the 
UI? Thanks.



[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-03-21 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009547#comment-13009547
 ] 

Tom White commented on MAPREDUCE-279:
-

bq. Also, as you pointed out, the changes to classes in 
src/java/org/apache/hadoop/mapred(uce) are very minor

Yes, but we still need to be sure that they don't break compatibility, which is 
hard to see in the current patch. However, I agree that collaborating on this 
part by way of working on MAPREDUCE-1638 and changes in trunk will make the 
separation cleaner and clarify the changes required for MR2.



[jira] [Commented] (MAPREDUCE-279) Map-Reduce 2.0

2011-03-20 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009012#comment-13009012
 ] 

eric baldeschwieler commented on MAPREDUCE-279:
---

I'll let Luke comment on the details. I'd support patches to convert the UI to 
something more standard, if we can agree on the right thing. Having a good UI 
is a plus.



[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-19 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13008757#comment-13008757
 ] 

Sharad Agarwal commented on MAPREDUCE-279:
--

bq. Is the correct suffix still .genavro?
Had to live with .genavro as the maven plugin 
(https://github.com/phunt/avro-maven-plugin) has not been updated yet to work 
with the new extension.
bq. Does AvroIDL convert javadoc-style comments on records/protocols into 
JavaDoc on generated code?
No. I don't see the comments in the generated code. 



[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-19 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13008881#comment-13008881
 ] 

Todd Lipcon commented on MAPREDUCE-279:
---

Looking through the code a bit more I came across Hamlet. It seems you've 
written your own MVC framework and Java implementation of Haml as part of Yarn.

Can you shed some light on why existing web frameworks were found to be 
insufficient? Do we really want a custom HTML generation framework as part of a 
resource scheduler?

I don't have much experience with web programming in Java, but I can't imagine 
we have any use cases that are _that_ unique that they couldn't be satisfied 
using a popular framework like Spring MVC. I also have strong doubts that a 
bunch of systems hackers like we have in our community can do a better job of 
designing and implementing a web framework than people who do web 
programming all day long (witness the completely incorrect job we do of input 
parameter escaping in the current webapps).
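
To make the escaping concern concrete, here is a minimal sketch of what correct handling of a request parameter looks like when it is echoed into a page. It is written in Python as a stand-in and is not code from the Hadoop webapps or from Hamlet; `render_param` is a hypothetical helper, and only the standard library's `html.escape` is assumed.

```python
import html


def render_param(name, value):
    """Render one request parameter as an HTML table row, escaping both fields."""
    # html.escape replaces &, < and >, and with quote=True also " and ',
    # so user-supplied text cannot open a tag or break out of an attribute.
    return "<tr><td>{}</td><td>{}</td></tr>".format(
        html.escape(name, quote=True), html.escape(value, quote=True)
    )
```

A mature web framework typically applies this kind of escaping by default in its templates, which is part of the argument for not hand-rolling one.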



[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-18 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13008452#comment-13008452
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

Thanks for your feedback, Tom.

bq. I wonder if it would be easier not to move the 
src/java/org/apache/hadoop/mapred(uce) trees at this stage.

The main issue is the dependency chain: currently the mr-client depends purely 
on APIs in the yarn package. In the alternate proposal (which we considered), 
mr-client would need to depend on both yarn and src/java for the runtime. 

The current scheme is both more modular and enforces discipline by ensuring 
that the MapReduce runtime (map, sort, shuffle, merge, reduce) cannot, even 
accidentally, start relying on classes in the server package i.e. JT/TT etc. 
This also seems like the right end-state for the project.
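
In the branch this discipline comes from the module dependency chain itself. Purely as an illustration of the same rule, the sketch below uses a different, lint-style technique: it scans Java source text for imports from a forbidden package. The package prefix is hypothetical (a stand-in for wherever the JT/TT server classes live), and `forbidden_imports` is not part of any Hadoop tooling.

```python
import re

# Hypothetical package prefix standing in for the JT/TT server classes.
FORBIDDEN_PREFIXES = ("org.apache.hadoop.mapred.server.",)

# Matches "import a.b.C;", "import static a.b.C.d;" and "import a.b.*;".
IMPORT_RE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.*]+)\s*;", re.MULTILINE)


def forbidden_imports(java_source, prefixes=FORBIDDEN_PREFIXES):
    """Return the imported names in `java_source` that start with a forbidden prefix."""
    return [
        name for name in IMPORT_RE.findall(java_source)
        if name.startswith(prefixes)
    ]
```

A build-level separation is stronger than a scan like this, because a missing classpath entry cannot be bypassed by accident.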

Also, as you pointed out, the changes to classes in 
src/java/org/apache/hadoop/mapred(uce) are very minor and the 'svn mv' is both 
well documented (MR-279_MR_files_to_move.txt, MR-279.sh) and straight-forward.



bq. MAPREDUCE-1638 is highly relevant for this work

Thanks! MAPREDUCE-1638 is very relevant. MAPREDUCE-279 already has some of the 
changes you proposed there i.e. keeping server classes in a separate source 
structure from the implementation classes - we should collaborate both on trunk 
and on the MR-279 branch to ensure consistency. I'm happy to merge if 
necessary. 




[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-18 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13008517#comment-13008517
 ] 

Todd Lipcon commented on MAPREDUCE-279:
---

Hi Arun. I spent the train ride this morning looking over yarn/src/main/avro in 
the branch. Here are a few comments, sorry for the somewhat 
stream-of-consciousness format.


- Is the correct suffix still .genavro? Thought we'd changed the name to 
.avroidl or something?
- Apache licenses needed on these files
- Does AvroIDL convert javadoc-style comments on records/protocols into JavaDoc 
on generated code? If so we should do more of that.


- AMRMProtocol:
-- the release parameter to allocate is strange: (a) it seems the function is 
misnamed if you can also release things as you call it, and (b) why isn't it an 
array<ContainerId>?
-- if you want to cancel previous resource requests, do you submit a new one 
with a negative numContainers?
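
To picture point (b), here is a minimal sketch of a request in which releases are a typed list of container ids rather than some other encoding. All names below (`ContainerId`, `AllocateRequest`, `apply_request`) are hypothetical and chosen for the example; they are not the Avro records from the branch, and the sketch takes no position on how cancelling an earlier request should work.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ContainerId:
    app_id: int
    id: int


@dataclass
class AllocateRequest:
    # Hypothetical shape: new asks and releases travel in the same call.
    num_containers: int = 0
    release: List[ContainerId] = field(default_factory=list)


def apply_request(held, request):
    """Return the set of containers still held after the releases are applied."""
    return set(held) - set(request.release)
```

Because the ids are frozen dataclasses they are hashable, so the released containers can be removed with a plain set difference.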


- ApplicationSubmissionContext:
-- would be good to have some kind of scheduler-specific parameters here? eg 
maybe a scheduler has something beyond just priority (eg. perhaps a deadline)
-- using just URL type directly for resources - seems not quite flexible 
enough? eg one useful construct would be a URL + checksum
-- what's resources_todo going to be?
-- passing user - agreed, this should be more flexible than simple string.
-- Why not contain a ContainerLaunchContext to specify the container in which 
to run the AM? Seems like lots of duplicated fields.
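
The "URL + checksum" construct mentioned above could look roughly like the following. This is a sketch under stated assumptions: `ResourceRef` and `verify` are hypothetical names, and SHA-1 is used only because sha1s come up elsewhere in these comments, not because the branch specifies it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRef:
    url: str        # where to fetch the resource from
    sha1_hex: str   # expected SHA-1 of the content, as lowercase hex


def verify(ref, content):
    """Return True if `content` (bytes) matches the checksum recorded in `ref`."""
    return hashlib.sha1(content).hexdigest() == ref.sha1_hex
```

Carrying the checksum next to the URL lets the side that downloads the file detect a changed or corrupted resource before using it.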

- ContainerManager:
-- not following YarnContainerTags - these are opaque enums, how do they get 
interpolated in a string?
-- how does one access stderr/stdout contents? both while they're being written 
and after a container has terminated? (maybe I just haven't gotten to that bit 
yet somewhere else)

- yarn-types.avro:
-- For the typesafe ID classes, do we need to specify explicit comparison 
orderings? I don't know Avro behavior here.
-- Did you consider making the ids all strings instead of ints? The pro would 
be that there could be canonical formats, like AM-<hex id> for app masters vs 
C-<hex id> for containers. AWS does a good job of this.
-- Resource: field names should include units, like int memoryMB
-- what are ContainerTokens? could use some extra doc at the protocol layer 
here. (I assume this is for security?)
-- The Container type doesn't appear 
-- the URL record is missing user/password used for http basic auth or s3n auth
-- there are some hard tabs in this file
-- ApplicationMaster:
--- httpPort seems like it would be better described as something like 
httpStatusURL?
-- LocalResourceVisibility:
--- just to clarify, APPLICATION visibility means only to this application 
submitted by this user. ie if joe and bob both submit MapReduce 2.x.y jobs 
with identical jars, it still won't share, even if sha1s match?
--- if bob submits the same application (ie MR 2.x.y) twice, do APPLICATION 
visibility files get shared?
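
One way to reason about the visibility questions above is to ask which key a localized file would be cached under. The sketch below is an assumption-laden illustration, not the NodeManager's behaviour: the PUBLIC and PRIVATE constant names and the `cache_key` helper are hypothetical, and `app` is an opaque value, so whether two submissions of the same application map to the same `app` is exactly the open question and is left to the caller.

```python
from enum import Enum


class Visibility(Enum):
    PUBLIC = 1
    PRIVATE = 2
    APPLICATION = 3


def cache_key(visibility, url, user, app):
    """Return the tuple under which a localized resource would be shared."""
    if visibility is Visibility.PUBLIC:
        return (url,)             # shared by everyone on the node
    if visibility is Visibility.PRIVATE:
        return (user, url)        # shared across one user's applications
    return (user, app, url)       # APPLICATION: one user, one application
```

Under this keying, joe and bob never share an APPLICATION-visibility file, because the user is part of the key, which matches the "Right" given in reply to that question further down the thread.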




[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008648#comment-13008648
 ] 

Chris Douglas commented on MAPREDUCE-279:
-

bq. Why not contain a ContainerLaunchContext to specify the container in which 
to run the AM? Seems like lots of duplicated fields.
Agreed. Fixing this also addresses the URL as insufficient for resources. The 
\_todo form was introduced to effect this, and remains in-progress.

bq. how does one access stderr/stdout contents? both while they're being 
written and after a container has terminated? (maybe I just haven't gotten to 
that bit yet somewhere else)
This is still a TODO (working on it now). In the short term, something similar 
to what the TT does is probably sufficient, I hope.

bq. Did you consider making the ids all strings instead of ints? The pro would 
be that there could be canonical formats, like AM-hex id for app masters vs 
C-hex id for containers.
Some of the implementation ended up relying on a consistent mapping of int ids 
to strings, so going all the way could make sense. On the other hand, parsing 
strings to determine relationships between containers and applications is 
regrettable.
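
A minimal sketch of one way to get both properties discussed here: ids stay typed and carry their relationships as fields, and the canonical string is only a rendering. The class names and the "AM-"/"C-" prefixes are illustrative assumptions taken from the review comment, not the types in yarn-types.avro.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ApplicationId:
    """Hypothetical typed application id; ordering compares the int field."""
    value: int

    def __str__(self) -> str:
        # Canonical "AM-<hex>" form, as suggested in the review comment.
        return f"AM-{self.value:x}"


@dataclass(frozen=True, order=True)
class ContainerId:
    """Hypothetical container id that carries its application explicitly."""
    app: ApplicationId
    value: int

    def __str__(self) -> str:
        # Canonical "C-<hex>" form; the owning application is a field,
        # so nobody has to parse the string to find it.
        return f"C-{self.value:x}"


app = ApplicationId(0x2A)
container = ContainerId(app, 0x1F)
print(app, container, container.app == app)
```

The string form is used for display only; the container-to-application relationship is read from the `app` field.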

bq. the URL record is missing user/password used for http basic auth or s3n auth
Agreed, full URIs should be supported, though pushing that all the way through 
FileContext and FileSystem could be painful.

bq. just to clarify, APPLICATION visibility means only to this application 
submitted by this user. ie if joe and bob both submit MapReduce 2.x.y jobs 
with identical jars, it still won't share, even if sha1s match?
Right. The target layout for the NodeManager looks roughly like this:
{noformat}
for x in localdir:
$x/filecache # public cache
$x/usercache
$x/usercache/$user
$x/usercache/filecache # private cache
$x/usercache/$user/appcache
$x/usercache/$user/appcache/$appid
$x/usercache/$user/appcache/$appid/filecache # application cache
$x/usercache/$user/appcache/$appid/$containerid
$x/usercache/$user/appcache/$appid/output # output retained after container exits, i.e. intermediate data
{noformat}
So the end of the container and application can just delete those subdirs. 
Matching a job jar between invocations would require one to register that 
resource as PUBLIC/PRIVATE. The APPLICATION scope is more for job.xml and the 
like.
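
The layout above can be sketched as a small path builder. This is an illustration only: the function names and the example values ("/grid/0", "joe", "app_1") are hypothetical, and the private cache is placed at $x/usercache/$user/filecache, following the correction posted in the follow-up comment.

```python
from pathlib import PurePosixPath


def cache_dirs(local_dir: str, user: str, app_id: str) -> dict:
    """Return the cache directories under one local dir.

    Hypothetical helper: the keys mirror the PUBLIC / PRIVATE / APPLICATION
    visibilities discussed above; the paths follow the listed layout.
    """
    root = PurePosixPath(local_dir)
    user_dir = root / "usercache" / user
    app_dir = user_dir / "appcache" / app_id
    return {
        "PUBLIC": root / "filecache",          # public cache
        "PRIVATE": user_dir / "filecache",     # private cache, per user
        "APPLICATION": app_dir / "filecache",  # application cache
        "OUTPUT": app_dir / "output",          # retained after container exits
    }


def container_dir(local_dir: str, user: str, app_id: str, container_id: str) -> PurePosixPath:
    """Per-container directory, a subdir of the application's directory."""
    return PurePosixPath(local_dir) / "usercache" / user / "appcache" / app_id / container_id


dirs = cache_dirs("/grid/0", "joe", "app_1")
for name, path in dirs.items():
    print(f"{name:12} {path}")
print(container_dir("/grid/0", "joe", "app_1", "container_7"))
```

Because everything application-scoped sits under one $appid directory, cleanup at application end is a single recursive delete of that subdir.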



[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008655#comment-13008655
 ] 

Chris Douglas commented on MAPREDUCE-279:
-

Sorry, the location of the private cache is {{$x/usercache/$user/filecache}}, 
not {{$x/usercache/filecache}}.



[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-17 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008034#comment-13008034
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

I'm going to commit this to a dev branch (MR-279?) if no one objects. Thanks.



[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008036#comment-13008036
 ] 

Todd Lipcon commented on MAPREDUCE-279:
---

sure, +1 for putting this on a dev (non-release) branch



[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-17 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008110#comment-13008110
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

Thanks Todd. I've committed to a dev-branch: MR-279.



[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-17 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008327#comment-13008327
 ] 

Tom White commented on MAPREDUCE-279:
-

There's a lot to digest here, but here are a couple of quick initial high-level 
comments from a packaging and staging perspective.

I wonder if it would be easier not to move the 
src/java/org/apache/hadoop/mapred(uce) trees at this stage. MR 2 could just 
depend on the MapReduce JAR produced by the ant file, just like it does for 
Common. This would make the introduction of the codebase easier. There are some 
changes required in the existing classes, but by the look of things they are 
fairly minor and by introducing them in situ (in separate JIRAs) we can be sure 
they won't break existing users, and the changes would be easier to track.

Alternatively this work could depend on full mavenization (at least of 
MapReduce), but that's probably some way off.

MAPREDUCE-1638 is highly relevant for this work, since it aims to split out the 
MR API from the implementation. I've got an in-progress patch for this, which 
I'll post soon for discussion.





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-16 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007772#comment-13007772
 ] 

eric baldeschwieler commented on MAPREDUCE-279:
---

Hi Folks,

I'm back part-time, but I'm mainly focused on catching up, annual focal reviews 
and adjusting to life with a newborn at home.

Todd Papaioannou (p9u) remains acting head of Hadoop this week.

Most line issues can continue to go to Amol, Kazi, Satish, Avik or Senthil as 
appropriate.

I am about, drop me a line on my personal email or call my cell if you need 
rapid response, but I am reading mail now.

CUSoon,
E14




[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-05 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003086#comment-13003086
 ] 

Scott Carey commented on MAPREDUCE-279:
---

Re: Shuffle.

See https://issues.apache.org/jira/browse/MAPREDUCE-318

Those changes are in 0.21+ (and perhaps Y!'s distro but not Cloudera's), I 
believe.  This doesn't do everything mentioned but is a significant improvement.





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-04 Thread MengWang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002543#comment-13002543
 ] 

MengWang commented on MAPREDUCE-279:


@All

How does shuffle work in MapReduce 2.0?

Our study shows that shuffle is a performance bottleneck in MapReduce 
computation. Shuffle has several problems:
(1) Shuffle and reduce are tightly coupled. The shuffle phase usually doesn't 
consume much memory or CPU, so in theory a reduce task's slot could be used 
for other computing tasks while data is being copied from the maps. This would 
improve cluster utilization. Furthermore, should shuffle be separated from 
reduce? Then shuffle would not use a reduce slot, and we wouldn't need to 
distinguish between map slots and reduce slots at all.
(2) For large jobs, shuffle uses too many network connections, and each 
connection transmits very little data, which is inefficient. Since 0.21.0 one 
connection can transfer several map outputs, but I think this is not enough. 
Maybe we could use a per-node shuffle client process (like the tasktracker) to 
shuffle data for all reduce tasks on the node, so that more data is shuffled 
through one connection.
(3) Too many concurrent connections cause the shuffle server to do massive 
random IO, which is inefficient. Maybe we could aggregate HTTP requests (like 
the delay scheduler), so that the random IO becomes sequential.
(4) How can the memory used by shuffle be managed efficiently? We use buddy 
memory allocation, which wastes a considerable amount of memory.
(5) If shuffle is separated from reduce, how do we achieve reduce locality?
(6) Can we store map outputs in a storage system (like HDFS)?
(7) Can shuffle be a general data transfer service, not only for the 
map/reduce paradigm?
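
Point (2) above, moving more data over fewer connections, comes down to grouping fetches by the host that serves them. A rough sketch of that grouping follows; the function name and the (map_id, host) input shape are assumptions for illustration and do not describe the actual shuffle code.

```python
from collections import defaultdict


def plan_fetches(map_outputs):
    """Group map output ids by the host that serves them.

    map_outputs: iterable of (map_id, host) pairs (hypothetical shape).
    Returns {host: [map_id, ...]} so that each host needs one connection.
    """
    by_host = defaultdict(list)
    for map_id, host in map_outputs:
        by_host[host].append(map_id)
    return dict(by_host)


outputs = [("m0", "node1"), ("m1", "node2"), ("m2", "node1"), ("m3", "node1")]
plan = plan_fetches(outputs)
print(len(outputs), "outputs fetched over", len(plan), "connections")
```

A per-node shuffle client would apply the same grouping across all reduce tasks on the node rather than within a single reduce task.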





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-04 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002677#comment-13002677
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

bq. I'm very interested in this project. How can I join?

Thanks Ozawa! We are very glad.

An update is that we have pretty much closed all the loops internally to get 
the code out into a branch, we'd love to start having everyone involved...





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-04 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002679#comment-13002679
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

bq. How does shuffle work in MapReduce 2.0?

Meng - pretty much the same as currently, the map-outputs are served over http.

We have discussed improvements to shuffle along the lines you have suggested 
for a long while now (I just don't have the jiras handy) and I agree, they are 
excellent ideas.

Our hope is that with MRv2 we open up Map-Reduce to significant innovation so 
that folks can try various ideas like the ones you suggested... make sense?





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-04 Thread Lianhui Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002947#comment-13002947
 ] 

Lianhui Wang commented on MAPREDUCE-279:


I think MR 2.0 may resolve this problem:
the JT would no longer monitor the status of every job and task. Today many TTs 
must RPC to the single JT every few seconds, and many clients also poll that 
single JT over RPC every few seconds to get job status.
So the majority of the nodes in the cluster connect to the JT repeatedly, which 
degrades the JT's performance, especially as the cluster grows, for example to 
10K nodes.
Like HDFS's Federation branch, the MR project should create a new branch for 
2.0.







[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-02 Thread OZAWA Tsuyoshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001797#comment-13001797
 ] 

OZAWA Tsuyoshi commented on MAPREDUCE-279:
--

 Hadoop needs to become more modular internally
+1

There are a lot of domain-specific programming models that extend MapReduce 
(e.g haloop, twiter, and so on), 
so this evolution is a good way to keep up with that trend.

@Arun
I'm very interested in this project. How can I join?
Or, is there some repository to access your prototype code?





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-03-02 Thread OZAWA Tsuyoshi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001855#comment-13001855
 ] 

OZAWA Tsuyoshi commented on MAPREDUCE-279:
--

 (e.g haloop, twiter, and so on)
s/twiter/Twister/g





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-28 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000464#comment-13000464
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

@Min 

bq. How does ApplicationMaster know its resource requirements before it 
launches tasks?
The assumption is that the AM has a basic idea about resource requirements for 
its application, which is feasible for our primary use case: Map-Reduce. OTOH, 
an AM for other applications has the ability to launch a few tasks, watch their 
resource consumption/utilization and update future resource requests.
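
The "launch a few tasks, watch, then update" idea can be sketched as a simple estimator. The function name, the 20% headroom, and the megabyte figures are all made-up assumptions for illustration; the thread does not specify any particular policy.

```python
def next_request_mb(initial_mb, observed_peaks_mb, headroom_pct=20):
    """Pick the memory to request for future tasks.

    Hypothetical policy: with no observations yet, fall back to the AM's
    initial guess; otherwise request the largest observed peak plus some
    headroom. Integer math keeps the result exact.
    """
    if not observed_peaks_mb:
        return initial_mb
    return max(observed_peaks_mb) * (100 + headroom_pct) // 100


print(next_request_mb(1024, []))               # before any task has run
print(next_request_mb(1024, [600, 750, 700]))  # after three probe tasks
```

An AM that already knows its requirements, as assumed for Map-Reduce, would simply never leave the first branch.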

bq. Can even ordinary users deploy their ApplicationMaster on a cluster they 
have no permissions on?
From the framework (i.e. RM/NN) perspective, everything in the cluster 
including AMs is 'user-land'. Thus as long as a user implements the protocols 
for AMs they can deploy any applications... they do not need any permission to 
deploy.

I'm working very hard to get the codebase committed to a branch; once it is 
there we would love your feedback on the protocols etc. Hopefully that should 
help you understand how to implement a custom AM if you so choose... appreciate 
your patience while I work the system! *smile*






[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-28 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000576#comment-13000576
 ] 

Mahadev konar commented on MAPREDUCE-279:
-

@Scott,
 With respect to your comments on ResourceManager/ZooKeeper/RPC:
We intend to take it slow with ZooKeeper. Initially the intention is to put 
just the allocations (what each job/application is allocated) in ZooKeeper; 
this is mainly for ResourceManager and Application Master restart. I am not 
really in favor of using ZK notifications to get rid of RPCs. For the scale we 
are talking about, a "first to get the work takes it" approach will cause a 
herd effect and will definitely be a cause for concern. I think ZK can be used 
for much more than what we have proposed, but it'll be a gradual process to see 
what all we can offload to ZK.

I am pretty hesitant to put RPC load onto ZK and use it as a workload queue for 
something like this. 
 





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-28 Thread Leitao Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000752#comment-13000752
 ] 

Leitao Guo commented on MAPREDUCE-279:
--

@Arun
I also doubt the assumption that the AM has a basic idea about resource 
requirements. For example, in a Hive scenario, how does the AM know the 
resource requirements when facing all kinds of query requests?

At the same time, if the AM finds that the resources it requested are not 
enough for the application, will it request more resources or just fail?





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-26 Thread Min Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12999745#comment-12999745
 ] 

Min Zhou commented on MAPREDUCE-279:


@Arun

How does ApplicationMaster know its resource requirements before it launches 
tasks? IMHO, the biggest problem of resource allocation is that we can't 
determine the CPU/memory/disk/network requirements until the task is 
running. User-defined requirements in configuration files are always 
inaccurate. 
From your words, the architecture allows end-users to implement any 
application-specific framework by implementing a custom ApplicationMaster. 
Can even ordinary users deploy their ApplicationMaster on a cluster they 
have no permissions on? Can you illustrate how to achieve it?





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-24 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998977#comment-12998977
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

bq. +1 An easy way to achieve this here would be to put the resource manager 
code and new MapReduce ApplicationMaster code into separate source trees under 
mapreduce.

Agreed! We have done exactly that in the prototype and plan to continue 
improving modularity.

bq. Going further, the work to separate out the API and libraries from the 
implementation should help this effort too, since it will involve removing hard 
dependencies on the jobtracker from the API classes

+1





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-23 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998654#comment-12998654
 ] 

Tom White commented on MAPREDUCE-279:
-

bq. Hadoop needs to become more modular internally

+1 An easy way to achieve this here would be to put the resource manager code 
and new MapReduce ApplicationMaster code into separate source trees under 
mapreduce. This will help enforce dependencies from the beginning: MR2 depends 
on MR1 and RM, but RM doesn't depend on anything else (except common for RPC?). 
Going further, the work to separate out the API and libraries from the 
implementation should help this effort too, since it will involve removing hard 
dependencies on the jobtracker from the API classes (see MAPREDUCE-1478, 
MAPREDUCE-1638).








[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-21 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997589#comment-12997589
 ] 

Scott Carey commented on MAPREDUCE-279:
---

Good stuff!

Does the NodeManager communicate to the ResourceManager similar to now (ping - 
response RPC)?   I ask because some of the bottlenecks and complexities now are 
due to this style of RPC.  I've changed a couple of systems in the past from 
ping - response to register - callback, and these became more efficient 
and the code became simpler.  With ZooKeeper in there, I wonder how much of the 
communication now uses ZooKeeper watches for efficiency and low latency.

When a Job starts up in the ApplicationMaster, does it have to wait for pings 
to get resources from the scheduler?  Or is the data all there in ZK, so that 
ramp-up times for jobs are much faster and resource reassignment for jobs with 
short-lived tasks isn't completely throttled by the rate of pings?

In any case, the new architecture is decoupled and it should be much easier to 
make enhancements with this separation.
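
For readers unfamiliar with the register - callback style mentioned above, here is a toy sketch of it. It only illustrates the alternative being asked about; the class and method names are hypothetical, and nothing here describes how the ResourceManager actually communicates.

```python
class Scheduler:
    """Toy scheduler that pushes grants to registered listeners."""

    def __init__(self):
        self._listeners = {}

    def register(self, app_id, callback):
        # Register-callback style: the application hands over a function once.
        self._listeners[app_id] = callback

    def grant(self, app_id, container):
        # The grant is pushed as soon as it exists; the application does not
        # wait for its next ping to learn about it.
        self._listeners[app_id](container)


received = []
scheduler = Scheduler()
scheduler.register("app_1", received.append)
scheduler.grant("app_1", "container_7")
print(received)
```

In the ping - response style the application would instead ask the scheduler at a fixed interval, so the delay before it learns of a grant is bounded by that interval.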






[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-21 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997600#comment-12997600
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

bq. With ZooKeeper in there, I wonder how much of the communication now uses 
ZooKeeper watches for efficiency and low latency.

Scott - We seriously considered this, but had to continue to use Hadoop RPC for 
a couple of reasons:
a) Mahadev, our resident ZK (and the new ResourceManager) expert, was very wary 
of using ZK watches for scalability reasons. Consider a 10k node cluster with 
25-30 containers per node and 10k running jobs - we'd need at least 10k * 10k 
watches which is a *lot* for ZK
b) Security on ZK is still largely unknown; eventually ZK will get there but 
we'd need to do a lot of work for delegation tokens etc. since we can't do 
Kerberos everywhere.
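
To put the figures from point (a) in perspective, the arithmetic works out as below. Reading "10k * 10k" as nodes times running jobs is an interpretation of the comment, and the container count uses the upper end of the quoted 25-30 range.

```python
nodes = 10_000             # cluster size from the comment
containers_per_node = 30   # upper end of the 25-30 range
running_jobs = 10_000

# Lower bound quoted in the comment, read here as nodes * running jobs.
min_watches = nodes * running_jobs
# Containers that could exist at once, for scale.
max_containers = nodes * containers_per_node

print(f"{min_watches:,} watches")
print(f"{max_containers:,} containers")
```

That is on the order of a hundred million watches against a few hundred thousand containers.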

Having said that...

bq. In any case, the new architecture is decoupled and it should be much easier 
to make enhancements with this separation.

Exactly. This is something we should definitely re-visit in a subsequent 
release. Hopefully that makes sense, thanks!





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-21 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997599#comment-12997599
 ] 

Jeff Hammerbacher commented on MAPREDUCE-279:
-

Hey Arun,

As long as that evolution is happening in a branch, that seems totally 
reasonable to me. When it comes time to migrate the code into trunk, I hope for 
the same end state as Matei: I think the resource management system should be a 
separate project from MapReduce so that each system can evolve and release 
separately. When we have more clients than just MapReduce for the resource 
manager, we'll want those new clients to evolve as separate projects rather 
than all living under the Apache Hadoop umbrella. Now seems like an excellent 
time to facilitate that end state.

More specifically, in an ideal world, we'd have four separate projects here: 
Common (probably folded into Guava or Apache Commons), HDFS, Yahoo! Cluster 
Manager (Resource Manager + Node Manager), and MapReduce (the ApplicationMaster 
for MapReduce, I guess). Then, if someone wanted to write Pregel to run against 
the Cluster Manager, they could implement their own ApplicationMaster in a 
separate project. Similarly, if someone wanted to run MapReduce against a 
different cluster manager, that would be simple. More practically, we have the 
opportunity to get the Cluster Manager project started up as a separate ASF 
project once it has gestated in a branch here for a bit. Are there any 
technical barriers to making that happen?

I'm a huge fan of this work, and having watched a number of ASF projects evolve 
over the past several years, I suspect that a small, focused project dedicated 
to cluster resource management will have the best chance of moving quickly.

Thanks,
Jeff





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-21 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997614#comment-12997614
 ] 

eric baldeschwieler commented on MAPREDUCE-279:
---

Hi Jeff,

A couple of thoughts:

1) Discussions on how to reorganize the hadoop universe probably should be 
moved from this bug to their own thread.  Can we restrict this thread to 
discussions about the design and implementation of this work?  Feel free to 
start this discussion on general or in JIRA.

2) I agree with you that it is important that we structure hadoop so that it is 
easy to plug in and use other technologies, and I would welcome your contribution 
of code to help make that a reality in this case.

3) My experience with the project split has been very negative.  It is becoming 
much harder, not easier, to evolve the hadoop code base.  Hence Nigel's 
suggestions (which I support) to actually move the projects closer together.  
Since map-reduce is the core of Hadoop, I think it is important that Hadoop remain 
able to deliver the world's best MR solution within the project.

4) We consider this work a natural evolution of the MR project.  Please don't 
refer to it as Yahoo! cluster manager.  That will just confuse the discussion.  
The intent is to complete this work in Apache and others are more than welcome 
to help us with it.

Thanks,

E14






[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-21 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997648#comment-12997648
 ] 

Scott Carey commented on MAPREDUCE-279:
---

{quote}
Consider a 10k node cluster with 25-30 containers per node and 10k running jobs 
- we'd need at least 10k * 10k watches which is a lot for ZK
{quote}

Thanks for the info Arun.  There would be a lot to work out to mix in ZK and 
not run into a scalability wall.  
If you assume that each node has to watch every job, it's not going to scale.   
If each node is only watching one thing when in need of work ("Is there work 
for me?"), you can eliminate a large chunk of the RPC that causes delayed task 
starts.  I'm mainly thinking of the "is there work for me now?  What about now?  
And now?" RPC that goes on in hadoop today.  That could be inverted: flag 
three nodes with local data simultaneously that there is work for them, and the 
first to grab the item wins.  How valuable is replacing just part of the RPC?  
I'm not sure.  It would help my clusters, but they aren't that big.
The other part of the scheduling problem you allude to, which requires scanning 
all available jobs and assigning resources, would need some clever work to do in 
ZK without scalability problems.
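
The inverted notification described above (offer one work item to several nodes at once, first claim wins) can be sketched in-process as follows. This is only an illustration: the `WorkItem` and `offer` names are hypothetical, threads stand in for cluster nodes, and a lock stands in for whatever atomic primitive a coordination service would provide.

```python
import threading


class WorkItem:
    """A unit of work offered to several nodes; only one may claim it."""

    def __init__(self, task_id):
        self.task_id = task_id
        self._owner = None
        self._lock = threading.Lock()

    def try_claim(self, node):
        # Atomic check-and-set: succeeds only for the first caller.
        with self._lock:
            if self._owner is None:
                self._owner = node
                return True
            return False

    @property
    def owner(self):
        return self._owner


def offer(item, candidate_nodes):
    """Notify all candidates at once and let them race to claim the item."""
    results = {}

    def race(node):
        results[node] = item.try_claim(node)

    threads = [threading.Thread(target=race, args=(n,)) for n in candidate_nodes]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


item = WorkItem("task-0001")
results = offer(item, ["node-a", "node-b", "node-c"])
winners = [node for node, won in results.items() if won]
print(len(winners), "node claimed", item.task_id)
```

Which node wins is not deterministic; the point is only that exactly one claim succeeds and the losers learn immediately that they should look for other work.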

On a related item, I am glad that job submission includes a DAG of tasks.  
There is a lot of opportunity to reduce latency in job flows there and 
consolidate work from a half-dozen projects duplicating effort.

{quote}
It is becoming much harder, not easier to evolve the hadoop code base.
{quote}
The choice to have all three projects be in their own trunk/tags/branches was a 
mistake IMO.  I've done the same elsewhere and learned the hard way:  don't put 
projects under different version trees unless you intend to actually completely 
decouple them *and* release them separately.

Hadoop needs more modularity and pluggability, but making Cluster Management and 
Application Management pluggable does not depend on separate projects; it's the 
other way around.  Hadoop needs to become more modular internally, its build 
more sophisticated, and the build outputs more flexible.  After a user can swap 
out foo-resource-manager.jar with hadoop-resource-manager.jar behind 
resource-manager-api.jar and expect it to work, a separate project for the 
hadoop-resource-manager could make sense.
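
As a rough illustration of the jar-swapping test described above, the sketch below puts two interchangeable implementations behind one interface. All class and method names here are hypothetical stand-ins (the thread does not define such an API); the point is only that the calling code depends on the API and nothing else.

```python
from abc import ABC, abstractmethod


class ResourceManagerApi(ABC):
    """Hypothetical stand-in for what resource-manager-api.jar would define."""

    @abstractmethod
    def allocate(self, app_id: str, memory_mb: int) -> str:
        """Return an identifier for the granted container."""


class HadoopResourceManager(ResourceManagerApi):
    """Stand-in for hadoop-resource-manager.jar."""

    def __init__(self):
        self._next = 0

    def allocate(self, app_id, memory_mb):
        self._next += 1
        return f"hadoop-container-{self._next}"


class FooResourceManager(ResourceManagerApi):
    """Stand-in for a third-party foo-resource-manager.jar."""

    def allocate(self, app_id, memory_mb):
        return f"foo-{app_id}-{memory_mb}mb"


def run_application(rm: ResourceManagerApi, app_id: str) -> str:
    # The application only knows the API, so either implementation works.
    return rm.allocate(app_id, memory_mb=1024)


print(run_application(HadoopResourceManager(), "app-1"))
print(run_application(FooResourceManager(), "app-1"))
```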

That said, I agree with Eric's #1 -- future modularity and this work are 
separate discussions / items.  IMO any greater project restructuring related to 
cluster management depends on this, and not the other way around.  A project 
split should not be the enforcer of modularity; actual proven modularity 
should be the justification for a split.  If one is afraid that without a project 
split, things are bound to be intertwined, other solutions should be found.  
Releasing separate jars for the components is one way to move forward that does 
not need a project split -- though it might require Maven to make it easy to 
manage and make a split much easier.






[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-21 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997654#comment-12997654
 ] 

Joydeep Sen Sarma commented on MAPREDUCE-279:
-

I have been working on maintaining/enhancing MR for FB's use case for the last 6 
months or so. Here are a few priority areas for us that are relevant to this 
discussion:

# Latency.

  This is, imho, #1 priority wrt scheduling. As Scott has already remarked, the 
ping-response model is broken. So is preemption as an after-thought. We need to 
get small/medium jobs scheduled instantly. Period.

# Scalability.

  We have made a number of vital fixes to keep the JT working at our scale - 
but we have merely bought some time.

# (wrt. ResourceManager) Open API

  By which i mean something like Thrift/PB/Avro. We are, of course, most 
comfortable with Thrift and it would be nice (but not critical) if it were 
possible to build a Thrift wrapper (even if one was not baked in from scratch). 

  One thing i have found is that writing Thrift services is a breeze because of 
the inbuilt service framework. Everything else on the serialization side being 
equal - this has been a big win for me personally as a developer (and something 
to be considered as other distributed execution frameworks try to use the RM).

# Ability to back-plug into older Hadoop versions

  This is related to #3. Unlike many other organizations - we cannot make big 
jumps in hadoop revisions anymore. We have too many custom changes and we don't 
have a QA department. Unlike in the past, where we could have depended on 
Yahoo's QA'ed releases - we don't have that luxury anymore (because we are now 
both running software at similar versions - and we can't wait until Yahoo has 
deployed/QA'ed new versions before deploying newer upgrades).

  If the RM api is open (and satisfactory from design perspective) - we can do 
the work in-house to our older version of Hadoop to use it. This is critical 
for us (although i am not sure it applies to other users).

I cannot emphasize enough the urgency around #1. Whether we continue to use 
Hadoop or not is predicated on big improvements in latency and efficiency (the 
latter is a different topic).

I hope #3 and #4 contribute to the discussion around component architecture. At 
our scale - i don't think we can build services using large software that is 
tightly integrated. We need too much customization and we can't afford the long 
upgrade cycles of such tightly integrated software. Of course, this is specific 
to our deployment and the requirements of most other deployments are likely to 
be quite different.



As a developer - i have found the current JobTracker code totally 
unmaintainable - I hope the new version (broken across RM/App-Master) is 
better. There are several design points that have struck me as particularly 
evil:

# synchronous RPC based architecture: limits concurrency and forces bad 
implementation choices (see #2)
# crazy locking: this is just bad implementation for the most part - but i hope 
the new design/implementation clearly articulates some principles around the 
fundamental data structures and how transactional changes to these data 
structures are meant to be accomplished.
# poor data structure maintenance: 99% of the data structures in the JT follow 
this pattern:
  a. a primary collection (eg: list of all jobs in the system)
  b. several secondary indices/views (list of all runnable jobs from above, 
list of all completed jobs etc)
  
  Instead of modeling updates to such collections and related views through a 
common entry point - updates to primary and secondary data structures are at 
disjoint places throughout the code and make maintenance of code a nightmare.
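
A minimal sketch of the "common entry point" idea from the last item: one primary collection, secondary views derived from it, and a single method allowed to mutate either. The `JobRegistry` class and its method names are hypothetical and are not taken from the JobTracker code.

```python
from enum import Enum


class JobState(Enum):
    RUNNABLE = "runnable"
    COMPLETED = "completed"


class JobRegistry:
    def __init__(self):
        # Primary collection: every job in the system and its current state.
        self._jobs = {}
        # Secondary views: one set of job ids per state.
        self._by_state = {state: set() for state in JobState}

    def set_state(self, job_id, state):
        """The only method that mutates the primary collection or the views."""
        previous = self._jobs.get(job_id)
        if previous is not None:
            self._by_state[previous].discard(job_id)
        self._jobs[job_id] = state
        self._by_state[state].add(job_id)

    def jobs_in(self, state):
        # Read-only snapshot, so callers cannot update a view directly.
        return frozenset(self._by_state[state])

    def all_jobs(self):
        return frozenset(self._jobs)


registry = JobRegistry()
registry.set_state("job_1", JobState.RUNNABLE)
registry.set_state("job_2", JobState.RUNNABLE)
registry.set_state("job_1", JobState.COMPLETED)
print(sorted(registry.jobs_in(JobState.RUNNABLE)))
print(sorted(registry.jobs_in(JobState.COMPLETED)))
```

Because the views are never exposed for writing, they cannot drift out of step with the primary collection, which is the maintenance problem the comment describes.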

i can only hope that a big rewrite like this will try to address some of these 
issues (others - like hard-wiring to specific (M/R) task types - are already 
addressed i presume in the new RM). 

my 2c.





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-19 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996851#comment-12996851
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

Jeff, the prototype uses a significant amount of Hadoop MapReduce, especially the 
MapReduce ApplicationMaster for running MR jobs. There is a new 
ResourceManager/NodeManager, but we still need to co-evolve and stabilize the 
entire codebase for serving our primary aim: running Hadoop MapReduce 
applications. After all, this is a re-factor of Hadoop MapReduce.

Moving to different projects is premature... it will be very reminiscent of the 
issues we have with Common and HDFS i.e. every change might be spread over 
multiple projects, which is a logistical nightmare for developers. Eventually, 
once we have a few releases under our belt and successful deployments etc., we 
might be in a better place to revisit this proposal. Make sense?





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-17 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995768#comment-12995768
 ] 

Jeff Hammerbacher commented on MAPREDUCE-279:
-

Hey Arun,

Wow, thanks for reviving one of my favorite old issues! One question: how much 
code does the prototype share with Hadoop MapReduce? My understanding is that 
it's mostly new code. If that's the case, have you considered creating a 
separate Apache Incubator project for the new two-level scheduler? Mesos, for 
example, has similar aims and is going the Apache Incubator route. I am aware 
of at least one other effort on this front, and having the projects gestate in 
the Incubator rather than as a branch of Hadoop would allow them to release 
more regularly while young and would be more in line with the dreams of the 
project split (having HDFS and MapReduce developed as separate projects). This 
seems like a great opportunity to continue the trend of keeping individual ASF 
projects small and focused so that releases require less work and can happen 
more regularly. What do you think?

Later,
Jeff





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-17 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995771#comment-12995771
 ] 

eric baldeschwieler commented on MAPREDUCE-279:
---

I'm way out of the office, I'm helping with the newest addition to our family, 
Jack baldeschwieler Yoshikawa.

Todd Papaioannou (p9u) is acting head of Hadoop.
Most line issues can continue to go to Amol, Kazi, Satish, Avik or Senthil as 
appropriate.

I'll be back on roughly March 9th.

CUSoon,
E14






[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-17 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996189#comment-12996189
 ] 

Matei Zaharia commented on MAPREDUCE-279:
-

+1 on decoupling Hadoop MapReduce from the resource management system in a way 
that allows Hadoop to run on top of other cluster scheduling systems as well. 
Apart from simplifying experimentation with these types of two-level 
schedulers, I think this would be a good thing for the MapReduce project in 
general as a way to make the project runnable in the maximum variety of 
environments. For example, there have already been efforts to get Hadoop 
running on HPC schedulers (e.g. Grid Engine) or Condor, and that would be quite 
a bit easier with the refactoring that Arun is doing. I imagine that there will 
be a lot of other work in cluster scheduling in future years, especially as 
people start running more non-MapReduce applications, so it would be nice to be 
able to run the Hadoop software stack in these environments.





[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-15 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994693#comment-12994693
 ] 

eric baldeschwieler commented on MAPREDUCE-279:
---

We're having a baby!

Todd Papaioannou (p9u) is acting head of Hadoop.
Most line issues can continue to go to Amol, Kazi, Satish, Avik or Senthil as 
appropriate.

I'll be back on roughly March 9th.

CUSoon,
E14






[jira] Commented: (MAPREDUCE-279) Map-Reduce 2.0

2011-02-15 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994928#comment-12994928
 ] 

Arun C Murthy commented on MAPREDUCE-279:
-

h5. Proposal

The fundamental idea of the re-factor is to divide the two major functions of 
the JobTracker, resource management and job scheduling/monitoring, into 
separate components: a generic resource scheduler and a per-job, user-defined 
component that manages the application execution.

The new ResourceManager manages the global assignment of compute resources to 
applications and the per-application ApplicationMaster manages the 
application's scheduling and coordination. An application is either a single 
job in the classic MapReduce sense or a DAG of such jobs. The ResourceManager 
and per-machine NodeManager server, which manages the user processes on that 
machine, form the computation fabric. The per-application ApplicationMaster is, 
in effect, a framework-specific library and is tasked with negotiating 
resources from the ResourceManager and working with the NodeManager(s) to 
execute and monitor the tasks.
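
The division of labour described above can be sketched as a toy model. Everything below is an illustrative assumption rather than the prototype's API: the class names mirror the roles in the proposal, memory is the only resource, each task asks for a fixed 1024 MB, and the ApplicationMaster reaches NodeManagers through the ResourceManager's node table purely to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: int
    node: str
    memory_mb: int


@dataclass
class NodeManager:
    name: str
    free_memory_mb: int
    running: dict = field(default_factory=dict)  # container_id -> task name

    def launch(self, container, task):
        self.running[container.container_id] = task


class ResourceManager:
    """Pure scheduler: grants containers, keeps no per-task status."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self._next_id = 0

    def request(self, memory_mb):
        # First-fit over nodes; returns None when nothing fits.
        for node in self.nodes.values():
            if node.free_memory_mb >= memory_mb:
                node.free_memory_mb -= memory_mb
                self._next_id += 1
                return Container(self._next_id, node.name, memory_mb)
        return None


class ApplicationMaster:
    """Per-application: negotiates containers and tracks its own tasks."""

    def __init__(self, rm, tasks):
        self.rm = rm
        self.pending = list(tasks)
        self.launched = {}  # task name -> Container

    def run_once(self):
        still_pending = []
        for task in self.pending:
            container = self.rm.request(memory_mb=1024)
            if container is None:
                still_pending.append(task)
                continue
            self.rm.nodes[container.node].launch(container, task)
            self.launched[task] = container
        self.pending = still_pending


nodes = [NodeManager("nm-1", 2048), NodeManager("nm-2", 1024)]
rm = ResourceManager(nodes)
am = ApplicationMaster(rm, ["map-0", "map-1", "map-2", "reduce-0"])
am.run_once()
print(sorted(am.launched), am.pending)
```

With 3 GB of memory in total, three tasks are launched and the fourth stays pending inside the ApplicationMaster; the ResourceManager itself holds no record of which task is which.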

The ResourceManager is a pure scheduler in the sense that it performs no 
monitoring or tracking of status for the application. Also, it offers no 
guarantees on restarting tasks that fail due to either application failure or 
hardware failure.

The ResourceManager performs its scheduling function based on the resource 
requirements of the applications; each application has multiple resource 
request types that represent the resources required for containers. The 
resource requests include memory, CPU, disk, network etc. Note that this is a 
significant change from the current model of fixed-type slots in Hadoop 
MapReduce, a model that has a significant negative impact on cluster utilization. 
The ResourceManager has a scheduler policy plug-in, which is responsible for 
partitioning the cluster resources among various queues, applications etc. 
Scheduler plug-ins can be based, e.g., on the current CapacityScheduler and 
FairScheduler.
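
A small worked example of why fixed-type slots can hurt utilization compared with generic resource requests. The node size, slot counts and task sizes are made-up numbers chosen for illustration, and memory is the only resource modelled.

```python
def tasks_with_fixed_slots(map_slots, reduce_slots, map_tasks, reduce_tasks):
    # A map task can only use a map slot, a reduce task only a reduce slot.
    return min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)


def tasks_with_resource_requests(node_memory_mb, requests_mb):
    # Any request that still fits in the remaining memory is granted.
    running, free = 0, node_memory_mb
    for request in requests_mb:
        if request <= free:
            free -= request
            running += 1
    return running


# Hypothetical 8 GB node, statically split into 4 map and 4 reduce slots
# of 1 GB each. The workload currently consists of 8 map tasks only.
slots = tasks_with_fixed_slots(map_slots=4, reduce_slots=4,
                               map_tasks=8, reduce_tasks=0)
generic = tasks_with_resource_requests(8192, [1024] * 8)
print(slots, generic)
```

In this map-heavy phase the slot model runs four tasks and leaves half the node idle, while the request model runs all eight on the same hardware.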

The NodeManager is the per-machine framework agent that is responsible for 
launching the applications' containers, monitoring their resource usage (cpu, 
memory, disk, network) and reporting the same to the Scheduler.

The per-application ApplicationMaster has the responsibility of negotiating 
appropriate resource containers from the Scheduler, launching tasks, tracking 
their status, monitoring for progress, handling task failures and recovering 
from saved state on a ResourceManager fail-over.
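
Recovering from saved state could look roughly like the sketch below. It is an assumption-laden illustration: a local JSON file stands in for a checkpoint store, and the function names and checkpoint format are hypothetical rather than taken from the prototype.

```python
import json
import os
import tempfile


def save_checkpoint(path, completed_tasks):
    # Write to a temporary file first, then rename, so a crash in the
    # middle of a write never leaves a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sorted(completed_tasks), f)
    os.replace(tmp, path)


def load_checkpoint(path):
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))


def remaining_tasks(all_tasks, path):
    # After a restart, only tasks not recorded as complete are rerun.
    done = load_checkpoint(path)
    return [t for t in all_tasks if t not in done]


workdir = tempfile.mkdtemp()
checkpoint = os.path.join(workdir, "am-state.json")
all_tasks = ["map-0", "map-1", "reduce-0"]

save_checkpoint(checkpoint, {"map-0"})
todo = remaining_tasks(all_tasks, checkpoint)
print(todo)
```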

Since downtime is more expensive at scale, high availability is built in from 
the beginning via Apache ZooKeeper for the ResourceManager and HDFS checkpoint 
for the MapReduce ApplicationMaster. Security and multi-tenancy support are 
critical to support many users on the larger clusters. The new architecture 
will also increase innovation and agility by allowing for user-defined versions 
of MapReduce runtime. Support for generic resource requests will increase 
cluster utilization by removing artificial bottlenecks such as 
hard-partitioning of resources into map and reduce slots.



We have a *prototype* we'd like to commit to a branch soon, where we look 
forward to feedback. From there on, we would love to collaborate to get it 
committed to trunk.


