[jira] Commented: (MAPREDUCE-1807) TestQueueManager can take long enough to time out
[ https://issues.apache.org/jira/browse/MAPREDUCE-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12869864#action_12869864 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1807: --- Currently, in trunk and in 0.21 of Hadoop, all the {{MiniMRCluster}}-based tests have been moved to a new test case, and {{TestQueueManager}} is a plain unit test. All the integration tests have been moved to {{TestQueueManagerWithJobTracker}}, which has only 3 test cases, all of which together run in under a minute. In the latest build they took only 30 seconds. This was done as part of MAPREDUCE-28. Isn't the solution in MAPREDUCE-28 sufficient? TestQueueManager can take long enough to time out - Key: MAPREDUCE-1807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1807 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Dick King Sometimes TestQueueManager takes such a long time that the JUnit engine times out and declares it a failure. We should fix this, possibly by splitting the file's 19 test cases into two or more manageable test sets. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1692) Remove TestStreamedMerge from the streaming tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1692: -- Release Note: Removed streaming testcase which tested non-existent functionality in Streaming. Remove TestStreamedMerge from the streaming tests - Key: MAPREDUCE-1692 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1692 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Sreekanth Ramakrishnan Assignee: Amareshwari Sriramadasu Priority: Minor Fix For: 0.22.0 Attachments: MAPREDUCE-1692-1.patch, MAPREDUCE-1692-1.patch, patch-1692-ydist.txt, patch-1692.txt Currently the {{TestStreamedMerge}} is never run as a part of the streaming test suite, the code paths which were exercised by the test was removed in HADOOP-1315, so it is better to remove the testcase from the code base. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1219) JobTracker Metrics causes undue load on JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1219: -- Attachment: MR-1219-2.patch Incorporating Amareshwari's comment. JobTracker Metrics causes undue load on JobTracker -- Key: MAPREDUCE-1219 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1219 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Jothi Padmanabhan Assignee: Sreekanth Ramakrishnan Attachments: MAPREDUCE-1219.patch, MR-1219-1.patch, MR-1219-2.patch, patch-1219-ydist.txt JobTrackerMetricsInst.doUpdates updates job-level counters of all running jobs into JobTracker's metrics causing very bad performance and hampers heartbeats. Since Job level metrics are better served by JobHistory, it may be a good idea to remove these from the metrics framework.
[jira] Updated: (MAPREDUCE-1219) JobTracker Metrics causes undue load on JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1219: -- Status: Open (was: Patch Available) JobTracker Metrics causes undue load on JobTracker -- Key: MAPREDUCE-1219 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1219 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Jothi Padmanabhan Assignee: Sreekanth Ramakrishnan Attachments: MAPREDUCE-1219.patch, MR-1219-1.patch, MR-1219-2.patch, patch-1219-ydist.txt JobTrackerMetricsInst.doUpdates updates job-level counters of all running jobs into JobTracker's metrics causing very bad performance and hampers heartbeats. Since Job level metrics are better served by JobHistory, it may be a good idea to remove these from the metrics framework. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1219) JobTracker Metrics causes undue load on JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1219: -- Status: Patch Available (was: Open) JobTracker Metrics causes undue load on JobTracker -- Key: MAPREDUCE-1219 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1219 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Jothi Padmanabhan Assignee: Sreekanth Ramakrishnan Attachments: MAPREDUCE-1219.patch, MR-1219-1.patch, MR-1219-2.patch, patch-1219-ydist.txt JobTrackerMetricsInst.doUpdates updates job-level counters of all running jobs into JobTracker's metrics causing very bad performance and hampers heartbeats. Since Job level metrics are better served by JobHistory, it may be a good idea to remove these from the metrics framework. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1612) job conf file is not accessible from job history web page
[ https://issues.apache.org/jira/browse/MAPREDUCE-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1612: -- Attachment: MR-1619-1.patch Attaching patch for trunk. job conf file is not accessible from job history web page - Key: MAPREDUCE-1612 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1612 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.22.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.22.0 Attachments: jobconf_history_jsp.fix.20S.patch, MR-1619-1.patch Clicking on conf file link from job history web page is causing an NPE if history file(and the job conf file) are stored on DFS. This NPE is from jobconf_history.jsp because jobConf built from path on DFS is not having any properties. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1612) job conf file is not accessible from job history web page
[ https://issues.apache.org/jira/browse/MAPREDUCE-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1612: -- Status: Patch Available (was: Open) job conf file is not accessible from job history web page - Key: MAPREDUCE-1612 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1612 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.22.0 Reporter: Ravi Gummadi Assignee: Ravi Gummadi Fix For: 0.22.0 Attachments: jobconf_history_jsp.fix.20S.patch, MR-1619-1.patch Clicking on conf file link from job history web page is causing an NPE if history file(and the job conf file) are stored on DFS. This NPE is from jobconf_history.jsp because jobConf built from path on DFS is not having any properties. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1219) JobTracker Metrics causes undue load on JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1219: -- Attachment: MR-1219-1.patch Attaching patch removing the unused code in {{JobInProgress}} JobTracker Metrics causes undue load on JobTracker -- Key: MAPREDUCE-1219 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1219 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Jothi Padmanabhan Assignee: Amareshwari Sriramadasu Attachments: MAPREDUCE-1219.patch, MR-1219-1.patch, patch-1219-ydist.txt JobTrackerMetricsInst.doUpdates updates job-level counters of all running jobs into JobTracker's metrics causing very bad performance and hampers heartbeats. Since Job level metrics are better served by JobHistory, it may be a good idea to remove these from the metrics framework. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1219) JobTracker Metrics causes undue load on JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1219: -- Status: Patch Available (was: Open) Running thro' Hudson JobTracker Metrics causes undue load on JobTracker -- Key: MAPREDUCE-1219 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1219 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Jothi Padmanabhan Assignee: Amareshwari Sriramadasu Attachments: MAPREDUCE-1219.patch, MR-1219-1.patch, patch-1219-ydist.txt JobTrackerMetricsInst.doUpdates updates job-level counters of all running jobs into JobTracker's metrics causing very bad performance and hampers heartbeats. Since Job level metrics are better served by JobHistory, it may be a good idea to remove these from the metrics framework. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1219) JobTracker Metrics causes undue load on JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859672#action_12859672 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1219: --- The justification is given in this comment: [ https://issues.apache.org/jira/browse/MAPREDUCE-1219?focusedCommentId=12779940&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12779940 ]. I have just removed the unused code from {{JobInProgress}}. JobTracker Metrics causes undue load on JobTracker -- Key: MAPREDUCE-1219 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1219 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Jothi Padmanabhan Assignee: Amareshwari Sriramadasu Attachments: MAPREDUCE-1219.patch, MR-1219-1.patch, patch-1219-ydist.txt JobTrackerMetricsInst.doUpdates updates job-level counters of all running jobs into JobTracker's metrics causing very bad performance and hampers heartbeats. Since Job level metrics are better served by JobHistory, it may be a good idea to remove these from the metrics framework.
[jira] Commented: (MAPREDUCE-1219) JobTracker Metrics causes undue load on JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858753#action_12858753 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1219: --- In its current state, the JobTracker publishes its own metrics along with the metrics of its running jobs. The list of running jobs can be pretty long, and the metrics update cycle is done every heartbeat. This causes a significant increase in heartbeat processing time. Also, the job-level metrics are nothing other than the counters of the running job. These counters are obtained by locking the job, which does not help performance either. Looking at the information published, shouldn't the JobTracker publish only its own metrics and not include job-level details? Users can also obtain the counters through a different API. So can we remove the job-level metrics, i.e. the counters, from the JobTracker metrics? Thoughts? JobTracker Metrics causes undue load on JobTracker -- Key: MAPREDUCE-1219 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1219 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Jothi Padmanabhan Assignee: Amareshwari Sriramadasu Attachments: MAPREDUCE-1219.patch, patch-1219-ydist.txt JobTrackerMetricsInst.doUpdates updates job-level counters of all running jobs into JobTracker's metrics causing very bad performance and hampers heartbeats. Since Job level metrics are better served by JobHistory, it may be a good idea to remove these from the metrics framework.
[jira] Created: (MAPREDUCE-1692) Remove TestStreamedMerge from the streaming tests
Remove TestStreamedMerge from the streaming tests - Key: MAPREDUCE-1692 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1692 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Sreekanth Ramakrishnan Currently the {{TestStreamedMerge}} is never run as a part of the streaming test suite, the code paths which were exercised by the test was removed in HADOOP-1315, so it is better to remove the testcase from the code base. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-1692) Remove TestStreamedMerge from the streaming tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1692: -- Attachment: MAPREDUCE-1692-1.patch Attaching patch removing the test case. Remove TestStreamedMerge from the streaming tests - Key: MAPREDUCE-1692 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1692 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Sreekanth Ramakrishnan Priority: Minor Attachments: MAPREDUCE-1692-1.patch Currently the {{TestStreamedMerge}} is never run as a part of the streaming test suite, the code paths which were exercised by the test was removed in HADOOP-1315, so it is better to remove the testcase from the code base. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-1692) Remove TestStreamedMerge from the streaming tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1692: -- Attachment: MAPREDUCE-1692-1.patch Attaching patch correcting the build.xml in streaming. Remove TestStreamedMerge from the streaming tests - Key: MAPREDUCE-1692 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1692 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming Reporter: Sreekanth Ramakrishnan Priority: Minor Attachments: MAPREDUCE-1692-1.patch, MAPREDUCE-1692-1.patch Currently the {{TestStreamedMerge}} is never run as a part of the streaming test suite, the code paths which were exercised by the test was removed in HADOOP-1315, so it is better to remove the testcase from the code base. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (MAPREDUCE-1640) Node health feature fails to blacklist a node if the health check script times out in some cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853346#action_12853346 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1640: ---
The problem in gist:
* In bash scripts, every command is spawned using fork + exec, similar to what the system() library call does. The process hierarchy is as follows:
{noformat}
15772 pts/0    S      0:00  \_ /bin/bash ./myscript.sh
15773 pts/0    S      0:00  |   \_ sleep 10
{noformat}
* So when kill 15772 is sent, the signal is not delivered to the child.
* The parent exits, but sleep does not check whether its parent is still alive and continues doing its work, as it is a long-running process.
The problem is similar to the one described and addressed in HADOOP-2721.
To solve this, the node health script should be spawned as {{setsid; exec(node_health_path)}}, and instead of calling Process.destroy() we should do {{kill -pid}}, which signals the whole process group. The remaining problem is that the pid of the process is not passed on to Java, so the command would change to {{setsid; echo $$ ;exec(node_health_path)}}, and we should read the process's input stream to get the process id.
An alternative solution is for the configured node health script to manage its own children :-)
Node health feature fails to blacklist a node if the health check script times out in some cases Key: MAPREDUCE-1640 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1640 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Reporter: Vinod K V Fix For: 0.22.0
Node health check feature fails to blacklist a TT if health check script times out. Below are the values that were set:
- mapred.healthChecker.interval=6
- mapred.healthChecker.script.timeout=6000
And the script was:
{code}
#!/bin/bash
echo start
sleep 10
echo end
{code}
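The {{setsid; echo $$; exec(...)}} idea from the comment above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the code of any attached patch: the class and method names are hypothetical, and it assumes a Linux host with {{setsid}}, {{bash}} and {{kill}} on the PATH. The helper only builds the command lines and parses the pid; actually spawning the processes (for example with {{ProcessBuilder}}) is left to the caller.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Hadoop. Assumes Linux with setsid, bash and kill on the PATH. */
final class HealthScriptCommands {

    /**
     * Wraps the health script so that it runs as the leader of a new session
     * (and therefore of its own process group) and prints that pid first.
     */
    static List<String> launchCommand(String scriptPath) {
        // bash -c '<cmd>' <arg0>: inside <cmd>, $0 expands to <arg0>, i.e. the script path.
        // exec replaces bash with the script, so the pid printed by "echo $$" stays valid.
        return Arrays.asList("setsid", "bash", "-c", "echo $$; exec \"$0\"", scriptPath);
    }

    /** Parses the pid that the wrapper printed as the first line of its output. */
    static long parsePid(String firstOutputLine) {
        return Long.parseLong(firstOutputLine.trim());
    }

    /** Builds a kill command aimed at the whole process group (negative pid). */
    static List<String> killGroupCommand(long pid) {
        // "--" ends option parsing so that "-<pid>" is not mistaken for a signal name.
        return Arrays.asList("kill", "-TERM", "--", "-" + pid);
    }
}
```

On timeout, the caller would run the command returned by {{killGroupCommand}} with the pid read from the first output line, instead of relying on Process.destroy(), so that children such as {{sleep}} receive the signal too.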
[jira] Created: (MAPREDUCE-1377) Port of HDFS-587 to mapreduce project.
Port of HDFS-587 to mapreduce project. -- Key: MAPREDUCE-1377 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1377 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.22.0 Reporter: Sreekanth Ramakrishnan HDFS-587 made changes to the HDFS tests to support generic hadoop parameters to tests. This JIRA is to track the mapreduce part of the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1377) Port of HDFS-587 to mapreduce project.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800530#action_12800530 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1377: --- +1 to Eric's patch; running it through Hudson. Port of HDFS-587 to mapreduce project. -- Key: MAPREDUCE-1377 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1377 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.22.0 Reporter: Sreekanth Ramakrishnan Attachments: jira.HDFS-587.mapreduce.branch-0.22.patch HDFS-587 made changes to the HDFS tests to support generic hadoop parameters to tests. This JIRA is to track the mapreduce part of the same.
[jira] Updated: (MAPREDUCE-1377) Port of HDFS-587 to mapreduce project.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1377: -- Attachment: jira.HDFS-587.mapreduce.branch-0.22.patch Attaching same patch as [HDFS-587|https://issues.apache.org/jira/secure/attachment/12425362/jira.HDFS-587.mapreduce.branch-0.22.patch] mapreduce patch. Port of HDFS-587 to mapreduce project. -- Key: MAPREDUCE-1377 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1377 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.22.0 Reporter: Sreekanth Ramakrishnan Attachments: jira.HDFS-587.mapreduce.branch-0.22.patch HDFS-587 made changes to the HDFS tests to support generic hadoop parameters to tests. This JIRA is to track the mapreduce part of the same. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking
[ https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1342: -- Attachment: mapreduce-1342-1.patch Attaching a patch that removes the need to lock on faultyTrackerInfo by changing the field to a concurrent hash map and not locking on addition and removal. Potential JT deadlock in faulty TT tracking --- Key: MAPREDUCE-1342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.22.0 Reporter: Todd Lipcon Attachments: cycle0.png, mapreduce-1342-1.patch JT$FaultyTrackersInfo.incrementFaults first locks potentiallyFaultyTrackers, and then calls blackListTracker, which calls removeHostCapacity, which locks JT.taskTrackers. On the other hand, JT.blacklistedTaskTrackers() locks taskTrackers, then calls faultyTrackers.isBlacklisted() which goes on to lock potentiallyFaultyTrackers. I haven't produced such a deadlock, but the lock ordering here is inverted and therefore could deadlock. Not sure if this goes back to 0.21 or just in trunk.
[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking
[ https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1342: -- Attachment: mapreduce-1342-2.patch Attaching a new patch after discussion with Amar. Made the map a concurrent map and changed the getters not to lock on the map. This way we remove the lock on the second resource for client APIs that don't lock on {{JobTracker}}. Potential JT deadlock in faulty TT tracking --- Key: MAPREDUCE-1342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.22.0 Reporter: Todd Lipcon Attachments: cycle0.png, mapreduce-1342-1.patch, mapreduce-1342-2.patch JT$FaultyTrackersInfo.incrementFaults first locks potentiallyFaultyTrackers, and then calls blackListTracker, which calls removeHostCapacity, which locks JT.taskTrackers. On the other hand, JT.blacklistedTaskTrackers() locks taskTrackers, then calls faultyTrackers.isBlacklisted() which goes on to lock potentiallyFaultyTrackers. I haven't produced such a deadlock, but the lock ordering here is inverted and therefore could deadlock. Not sure if this goes back to 0.21 or just in trunk.
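The general technique discussed for MAPREDUCE-1342, replacing a map guarded by {{synchronized}} blocks with a concurrent map so that readers never take a second lock, might look roughly like the sketch below. It is not the contents of the attached patches: the class name, method names and the threshold field are hypothetical and only illustrate why a getter on a {{ConcurrentHashMap}} cannot take part in a lock-ordering cycle.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a faulty-tracker registry; not the actual JobTracker code. */
final class FaultyTrackerRegistry {
    // Concurrent map: additions, removals and lookups need no external lock.
    private final ConcurrentMap<String, AtomicInteger> faultsByHost = new ConcurrentHashMap<>();
    private final int blacklistThreshold;

    FaultyTrackerRegistry(int blacklistThreshold) {
        this.blacklistThreshold = blacklistThreshold;
    }

    /** Records one more fault for the host and returns the new count. */
    int incrementFaults(String host) {
        return faultsByHost.computeIfAbsent(host, h -> new AtomicInteger()).incrementAndGet();
    }

    /** Lock-free getter: safe to call while the caller holds some other lock. */
    boolean isBlacklisted(String host) {
        AtomicInteger faults = faultsByHost.get(host);
        return faults != null && faults.get() >= blacklistThreshold;
    }

    /** Forgets all faults recorded for the host. */
    void clearFaults(String host) {
        faultsByHost.remove(host);
    }
}
```

Because {{isBlacklisted}} acquires no monitor, a caller that already holds another lock (such as the one on the tracker list in the issue description) no longer waits on a second lock, which removes one edge of the inverted lock ordering.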
[jira] Commented: (MAPREDUCE-1301) TestDebugScriptWithLinuxTaskController fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791769#action_12791769 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1301: --- The changes look fine to me. +1 to the patch. TestDebugScriptWithLinuxTaskController fails - Key: MAPREDUCE-1301 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1301 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: patch-1301.txt After MAPREDUCE-879, TestDebugScriptWithLinuxTaskController fails with following exception : java.lang.NullPointerException at org.apache.hadoop.mapred.TestTaskTrackerLocalization.getFilePermissionAttrs(TestTaskTrackerLocalization.java:274) at org.apache.hadoop.mapred.TestTaskTrackerLocalization.checkFilePermissions(TestTaskTrackerLocalization.java:294) at org.apache.hadoop.mapred.TestDebugScript.verifyDebugScriptOutput(TestDebugScript.java:162) at org.apache.hadoop.mapred.TestDebugScriptWithLinuxTaskController.testDebugScriptExecutionAsDifferentUser(TestDebugScriptWithLinuxTaskController.java:50)
[jira] Commented: (MAPREDUCE-879) TestTaskTrackerLocalization fails on MAC OS
[ https://issues.apache.org/jira/browse/MAPREDUCE-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789213#action_12789213 ] Sreekanth Ramakrishnan commented on MAPREDUCE-879: -- The failure of the {{testJobShell}} test case seems to be the issue tracked in MAPREDUCE-1275. The gridmix failure is handled in MAPREDUCE-1124. TestTaskTrackerLocalization fails on MAC OS --- Key: MAPREDUCE-879 URL: https://issues.apache.org/jira/browse/MAPREDUCE-879 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.21.0 Environment: Mac OS X 10.5.7 Reporter: Devaraj Das Assignee: Sreekanth Ramakrishnan Priority: Blocker Fix For: 0.21.0 Attachments: mapreduce-879-1.patch, TEST-org.apache.hadoop.mapred.TestTaskTrackerLocalization.txt TestTaskTrackerLocalization failed on an 'ant test' run.
[jira] Commented: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789102#action_12789102 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1084: --- The test case failures seem to be caused by MAPREDUCE-1275. Locally there were no test case failures of any sort. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-5.patch, mapreduce-1084-6.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Status: Open (was: Patch Available) Cancelling patch, due to hudson issue. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Attachment: mapreduce-1084-5.patch Found an issue with the contrib project vaidya, which accidentally re-includes the top-level build.xml; this was breaking the compilation of the contrib project itself. I have removed the change that was put in by MAPREDUCE-676. The core test failure is not related to the patch, as the patch neither changes any Java code nor modifies the existing classpaths that the JUnit runner uses. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-5.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Status: Open (was: Patch Available) Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-5.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Status: Patch Available (was: Open) Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-5.patch, mapreduce-1084-6.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Attachment: mapreduce-1084-6.patch Patch merged with latest trunk. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-5.patch, mapreduce-1084-6.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-879) TestTaskTrackerLocalization fails on MAC OS
[ https://issues.apache.org/jira/browse/MAPREDUCE-879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-879: - Attachment: mapreduce-879-1.patch Attaching a patch fixing the issue. Instead of using stat, the patch uses the following command, which produces output in the same format as the stat -c command used previously: {noformat} ls -l -d path | awk '{print $1":"$3":"$4}' {noformat} The long format of the file listing is the same on Mac and Linux, so we are reasonably safe on all OSes that share this long listing format. Tested the test case on Linux and Mac; it passed on both. TestTaskTrackerLocalization fails on MAC OS --- Key: MAPREDUCE-879 URL: https://issues.apache.org/jira/browse/MAPREDUCE-879 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.21.0 Environment: Mac OS X 10.5.7 Reporter: Devaraj Das Assignee: Vinod K V Priority: Blocker Fix For: 0.21.0 Attachments: mapreduce-879-1.patch, TEST-org.apache.hadoop.mapred.TestTaskTrackerLocalization.txt TestTaskTrackerLocalization failed on an 'ant test' run. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
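The ls/awk pipeline described in the MAPREDUCE-879 comment above can be sketched as follows. This is a minimal Python illustration, not the code from the patch: the function names `format_ls_line` and `permission_string` are hypothetical, and the sketch assumes an `ls` whose long listing puts the mode, owner, and group in columns 1, 3, and 4.

```python
import subprocess


def format_ls_line(line: str) -> str:
    """Return 'mode:owner:group' from one line of `ls -l -d` output.

    Takes whitespace-separated fields 1, 3 and 4 and joins them with ':',
    which is what the awk step in the comment is meant to print.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"unexpected ls output: {line!r}")
    return ":".join((fields[0], fields[2], fields[3]))


def permission_string(path: str) -> str:
    """Run `ls -l -d path` and format its output (hypothetical helper)."""
    result = subprocess.run(
        ["ls", "-l", "-d", path], capture_output=True, text=True, check=True
    )
    return format_ls_line(result.stdout.strip())
```

Keeping the parsing separate from the call to `ls` means the formatting step can be checked against a fixed listing line without touching the file system.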
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Attachment: mapreduce-1084-final.patch Reattaching mapreduce-1084-2.patch as final patch. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Status: Patch Available (was: Open) Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Status: Open (was: Patch Available) Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch, mapreduce-1084-final.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Status: Patch Available (was: Open) Running thro' HUDSON. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Attachment: mapreduce-1084-3.patch Renaming {{jar-mapred-test-fault-inject}} to {{jar-test-fault-inject}}. The reason why {{jar-mapred-test-fault-inject}} was implemented was to follow the same standard as in HDFS-703 Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1062) MRReliability test does not work with retired jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1062: -- Status: Open (was: Patch Available) Cancelling patch and running thro' hudson with latest patch. MRReliability test does not work with retired jobs -- Key: MAPREDUCE-1062 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1062 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1062-1.patch, mapreduce-1062-2.patch, mapreduce-1062-3-ydist.patch, mapreduce-1062-3.patch, mapreduce-1062-4.patch, mapreduce-ydist-20-1.patch Currently the MRReliability test uses the job client's get-all-jobs API, which also returns retired jobs. When the cluster has retired jobs, they are appended at the end of the job list, so the test always picks a completed job and never spawns the KillTask and KillTracker threads. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
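The failure mode described in MAPREDUCE-1062 can be illustrated with a small sketch. It is written in Python rather than against the Hadoop API: the `Job` record, its `complete` flag, and both helper names are hypothetical stand-ins for whatever the job client returns, and the idea that the test takes the last job in the list is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    job_id: str
    complete: bool  # True for finished or retired jobs (hypothetical field)


def pick_last_job(jobs: List[Job]) -> Optional[Job]:
    """Naive selection: take the last entry, which is a retired job
    whenever retired jobs are appended at the end of the list."""
    return jobs[-1] if jobs else None


def pick_last_running_job(jobs: List[Job]) -> Optional[Job]:
    """Skip completed or retired entries and return the last job still running."""
    for job in reversed(jobs):
        if not job.complete:
            return job
    return None


# One running job followed by two retired ones, as in the issue description.
jobs = [Job("job_1", False), Job("job_2", True), Job("job_3", True)]
naive = pick_last_job(jobs)            # a retired job: nothing left to kill
filtered = pick_last_running_job(jobs)  # the job that is still running
```

With the naive selection the caller only ever sees a completed job, which matches the reported symptom of the kill threads never being started.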
[jira] Updated: (MAPREDUCE-1062) MRReliability test does not work with retired jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1062: -- Status: Patch Available (was: Open) Rerunning the patch thro' hudson. MRReliability test does not work with retired jobs -- Key: MAPREDUCE-1062 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1062 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1062-1.patch, mapreduce-1062-2.patch, mapreduce-1062-3-ydist.patch, mapreduce-1062-3.patch, mapreduce-1062-4.patch, mapreduce-ydist-20-1.patch Currently the MRReliability test uses the job client's get-all-jobs API, which also returns retired jobs. When the cluster has retired jobs, they are appended at the end of the job list, so the test always picks a completed job and never spawns the KillTask and KillTracker threads. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787323#action_12787323 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1084: --- I have uploaded both patches, the patch number 3 has the two targets: {{jar-test-fault-inject}} - {{jar-mapred-test-fault-inject}} Patch 4 has just {{jar-test-fault-inject}}. Am I required to upload a new patch for committing which would be same as patch number 3? Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787326#action_12787326 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1084: --- Correction to the above comment: it should read patch 2 and patch 3 instead of 3 and 4. Sorry for the confusion. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch, mapreduce-1084-3.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-913) TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786775#action_12786775 ] Sreekanth Ramakrishnan commented on MAPREDUCE-913: -- Took a look at the patch; the following are my comments: * Can we check that the workDir is non-null in the run-debug script and throw an exception if it is null? That would prevent the launch of task-controller code. * In the test case, can we verify that the correct number of map slots is actually reported back to the JobTracker after the failing job completes? This would test the actual slot management. Wouldn't it be much better to add a check to figure out whether the taskJVM was launched or not, and then run the debug script accordingly? TaskRunner crashes with NPE resulting in held up slots, UNINITIALIZED tasks and hung TaskTracker Key: MAPREDUCE-913 URL: https://issues.apache.org/jira/browse/MAPREDUCE-913 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.20.1 Reporter: Vinod K V Priority: Blocker Fix For: 0.21.0 Attachments: mapreduce-913-1.patch, MAPREDUCE-913-20091119.1.txt, MAPREDUCE-913-20091119.2.txt, MAPREDUCE-913-20091120.1.txt -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Attachment: mapreduce-1084-2.patch Added the target {{jar-test-fault-inject}} which points to {{jar-mapred-test-fault-inject}} same way as it is implemented in HDFS-703. Including the {{src/test/aop/build/aop.xml}} from commons-trunk. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch, mapreduce-1084-2.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan reassigned MAPREDUCE-1084: - Assignee: Sreekanth Ramakrishnan Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1084) Implementing aspects development and fault injeciton framework for MapReduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1084: -- Attachment: mapreduce-1084-1-withoutsvnexternals.patch mapreduce-1084-1.patch Attaching the patch implementing the fault injection in the mapreduce project. There are two patches: one with svn externals and one without. The svn externals patch, when applied to a workspace, does not create the appropriate folder structure with links, even though the property and folder are added to version control. Implementing aspects development and fault injeciton framework for MapReduce Key: MAPREDUCE-1084 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084 Project: Hadoop Map/Reduce Issue Type: Improvement Components: build, test Reporter: Konstantin Boudnik Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1084-1-withoutsvnexternals.patch, mapreduce-1084-1.patch Similar to HDFS-435 and HADOOP-6204 this JIRA will track the introduction of injection framework for MapReduce. After HADOOP-6204 is in place this particular modification should be very trivial and would take importing (via svn:external) of src/test/build and some tweaking of the build.xml file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-763) Capacity scheduler should clean up reservations if it runs tasks on nodes other than where it has made reservations
[ https://issues.apache.org/jira/browse/MAPREDUCE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan reassigned MAPREDUCE-763: Assignee: rahul k singh (was: Sreekanth Ramakrishnan) Capacity scheduler should clean up reservations if it runs tasks on nodes other than where it has made reservations --- Key: MAPREDUCE-763 URL: https://issues.apache.org/jira/browse/MAPREDUCE-763 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched Affects Versions: 0.21.0 Reporter: Hemanth Yamijala Assignee: rahul k singh Fix For: 0.21.0 Currently capacity scheduler makes a reservation on nodes for high memory jobs that cannot currently run at the time. It could happen that in the meantime other tasktrackers become free to run the tasks of this job. Ideally in the next heartbeat from the reserved TTs the reservation should be removed. Otherwise it could unnecessarily block capacity for a while (until the TT has enough slots free to run a task of this job). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1062) MRReliability test does not work with retired jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1062: -- Status: Patch Available (was: Open) Re running thro' Hudson MRReliability test does not work with retired jobs -- Key: MAPREDUCE-1062 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1062 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1062-1.patch, mapreduce-1062-2.patch, mapreduce-1062-3-ydist.patch, mapreduce-1062-3.patch, mapreduce-ydist-20-1.patch Currently the MRReliability test uses the job client's get-all-jobs API, which also returns retired jobs. When the cluster has retired jobs, they are appended at the end of the job list, so the test always picks a completed job and never spawns the KillTask and KillTracker threads. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1062) MRReliability test does not work with retired jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-1062: -- Status: Open (was: Patch Available) MRReliability test does not work with retired jobs -- Key: MAPREDUCE-1062 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1062 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-1062-1.patch, mapreduce-1062-2.patch, mapreduce-1062-3-ydist.patch, mapreduce-1062-3.patch, mapreduce-ydist-20-1.patch Currently the MRReliability test uses the job client's get-all-jobs API, which also returns retired jobs. When the cluster has retired jobs, they are appended at the end of the job list, so the test always picks a completed job and never spawns the KillTask and KillTracker threads. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1239) Mapreduce test build is broken after HADOOP-5107
[ https://issues.apache.org/jira/browse/MAPREDUCE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782390#action_12782390 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1239: --- Checked the streaming and fair-scheduler test cases. Test cases in fairscheduler fail with a "too many open files" message; apart from that, the streaming tests pass fully. Mapreduce test build is broken after HADOOP-5107 Key: MAPREDUCE-1239 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1239 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.21.0, 0.22.0 Reporter: Vinod K V Assignee: Giridharan Kesavan Priority: Blocker Fix For: 0.21.0 Attachments: mapred-1239.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1239) Mapreduce test build is broken after HADOOP-5107
[ https://issues.apache.org/jira/browse/MAPREDUCE-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782742#action_12782742 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1239: --- The issue on Giri's box could have been due to old mapred-queues.xml i.e. MAPREDUCE-1241 , but what I observed when fairscheduler test was run that the {{TestFairScheduler}} was timing out with following exception. {noformat} [junit] 09/11/25 16:34:08 INFO mapred.JobTracker: JobTracker up at: 32801 [junit] 09/11/25 16:34:08 INFO mapred.JobTracker: JobTracker webserver: 44348 [junit] 09/11/25 16:34:08 INFO mapred.JobTracker: Cleaning up the system directory [junit] 09/11/25 16:34:08 INFO mapred.JobTracker: problem cleaning system directory: file:/tmp/hadoop-sreerama/mapred/system [junit] java.io.IOException: Cannot run program chmod: java.io.IOException: error=24, Too many open files [junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:474) [junit] at org.apache.hadoop.util.Shell.runCommand(Shell.java:188) [junit] at org.apache.hadoop.util.Shell.run(Shell.java:170) [junit] at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:363) [junit] at org.apache.hadoop.util.Shell.execCommand(Shell.java:449) [junit] at org.apache.hadoop.util.Shell.execCommand(Shell.java:432) [junit] at org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:540) [junit] at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:532) [junit] at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:281) [junit] at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:295) [junit] at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1477) [junit] at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1306) [junit] at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1299) [junit] at org.apache.hadoop.mapred.UtilsForTests.getJobTracker(UtilsForTests.java:712) [junit] at 
org.apache.hadoop.mapred.TestFairScheduler.testPoolAssignment(TestFairScheduler.java:2546) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [junit] at java.lang.reflect.Method.invoke(Method.java:616) [junit] at junit.framework.TestCase.runTest(TestCase.java:168) [junit] at junit.framework.TestCase.runBare(TestCase.java:134) [junit] at junit.framework.TestResult$1.protect(TestResult.java:110) [junit] at junit.framework.TestResult.runProtected(TestResult.java:128) [junit] at junit.framework.TestResult.run(TestResult.java:113) [junit] at junit.framework.TestCase.run(TestCase.java:124) [junit] at junit.framework.TestSuite.runTest(TestSuite.java:232) [junit] at junit.framework.TestSuite.run(TestSuite.java:227) [junit] at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:79) [junit] at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768) [junit] Caused by: java.io.IOException: java.io.IOException: error=24, Too many open files [junit] at java.lang.UNIXProcess.init(UNIXProcess.java:164) [junit] at java.lang.ProcessImpl.start(ProcessImpl.java:81) [junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:467) [junit] ... 
31 more {noformat} Mapreduce test build is broken after HADOOP-5107 Key: MAPREDUCE-1239 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1239 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.21.0, 0.22.0 Reporter: Vinod K V Assignee: Giridharan Kesavan Priority: Blocker Fix For: 0.21.0 Attachments: mapred-1239.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
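On Linux and Mac OS X, error=24 in the trace above is the EMFILE errno ("Too many open files"), reported when a process has reached its limit on open file descriptors. As a rough diagnostic, and assuming a Unix-like host, the limit for the current process can be inspected as sketched below. The helper name `open_file_limits` is hypothetical, and this is not part of any patch on the issue.

```python
import errno
import os
import resource


def open_file_limits() -> tuple:
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print(f"EMFILE errno: {errno.EMFILE} ({os.strerror(errno.EMFILE)})")
print(f"open file limit: soft={soft}, hard={hard}")
```

A low soft limit combined with a test that starts many JobTracker instances in one JVM would be one plausible way to hit this error, though the thread does not confirm the cause.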
[jira] Moved: (MAPREDUCE-1238) mapred metrics shows negative count of waiting maps and reduces
[ https://issues.apache.org/jira/browse/MAPREDUCE-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan moved HADOOP-5863 to MAPREDUCE-1238: --- Component/s: (was: metrics) jobtracker Fix Version/s: (was: 0.20.2) Key: MAPREDUCE-1238 (was: HADOOP-5863) Project: Hadoop Map/Reduce (was: Hadoop Common) mapred metrics shows negative count of waiting maps and reduces Key: MAPREDUCE-1238 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1238 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Ramya R Negative waiting_maps and waiting_reduces count is observed in the mapred metrics -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-28) TestQueueManager takes too long and times out some times
[ https://issues.apache.org/jira/browse/MAPREDUCE-28?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-28: Status: Open (was: Patch Available) TestQueueManager takes too long and times out some times Key: MAPREDUCE-28 URL: https://issues.apache.org/jira/browse/MAPREDUCE-28 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker, test Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Assignee: V.V.Chaitanya Krishna Fix For: 0.21.0 Attachments: MAPREDUCE-28-1.txt, MAPREDUCE-28-2.txt, MAPREDUCE-28-3.txt, MAPREDUCE-28-4.txt, MAPREDUCE-28-5.txt, MAPREDUCE-28-6.txt, MAPREDUCE-28-7.txt, MAPREDUCE-28-8.patch TestQueueManager takes a long time to run and sometimes times out. See the failure at http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3875/testReport/. Looking at the console output, the test was timed out before it finished. On my machine, the test takes about 5 minutes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-514) Check for invalid queues in capacity scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan resolved MAPREDUCE-514. -- Resolution: Won't Fix Check for invalid queues in capacity scheduler -- Key: MAPREDUCE-514 URL: https://issues.apache.org/jira/browse/MAPREDUCE-514 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Attachments: HADOOP-4079-1.patch, HADOOP-4079-2.patch, HADOOP-4079-3.patch, HADOOP-4079-4.patch The {{ResourceManagerConf}} class that is being moved to the capacity scheduler of HADOOP-3445 needs to check for a queue name that is not configured in the {{resource-manager-conf.xml}} file in its APIs and fail if it is not available. This feature was originally available, but due to subsequent changes in HADOOP-3698, queues are no longer being configured as part of the mentioned configuration file. Hence we need a different mechanism to check for valid queues. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-514) Check for invalid queues in capacity scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12778741#action_12778741 ] Sreekanth Ramakrishnan commented on MAPREDUCE-514: -- This issue no longer applies after the commit of MAPREDUCE-860: the queue information is no longer maintained in {{capacity-scheduler.xml}}; rather, the properties are maintained in {{mapred-queues.xml}}. So closing this issue. If required for an older version, please feel free to re-open it. Check for invalid queues in capacity scheduler -- Key: MAPREDUCE-514 URL: https://issues.apache.org/jira/browse/MAPREDUCE-514 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Attachments: HADOOP-4079-1.patch, HADOOP-4079-2.patch, HADOOP-4079-3.patch, HADOOP-4079-4.patch The {{ResourceManagerConf}} class that is being moved to the capacity scheduler of HADOOP-3445 needs to check for a queue name that is not configured in the {{resource-manager-conf.xml}} file in its APIs and fail if it is not available. This feature was originally available, but due to subsequent changes in HADOOP-3698, queues are no longer being configured as part of the mentioned configuration file. Hence we need a different mechanism to check for valid queues. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-732) node health check script should not log UNHEALTHY status for every heartbeat in INFO mode
[ https://issues.apache.org/jira/browse/MAPREDUCE-732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-732: - Release Note: Changed the log level of the blacklisted-reason message in the JobTracker log from INFO to DEBUG. node health check script should not log UNHEALTHY status for every heartbeat in INFO mode --- Key: MAPREDUCE-732 URL: https://issues.apache.org/jira/browse/MAPREDUCE-732 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.21.0 Reporter: Ramya R Assignee: Sreekanth Ramakrishnan Priority: Minor Fix For: 0.21.0 Attachments: MAPRED-732-ydist.patch, mapreduce-732-1.patch, MAPREDUCE-732-2.patch, mapreduce-732.patch Currently, when a TT is blacklisted by the node health check script, a message such as the following is logged for every heartbeat. {noformat} date time INFO org.apache.hadoop.mapred.JobTracker: Adding blacklisted reason for tracker : blacklisted TT Reason for blacklisting is : NODE_UNHEALTHY {noformat} Due to this, the JT logs fill up rapidly, clogging the logdirs. Hence this message should be logged in DEBUG mode instead of INFO mode. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1075) getQueue(String queue) in JobTracker would return NPE for invalid queue name
[ https://issues.apache.org/jira/browse/MAPREDUCE-1075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12769087#action_12769087 ] Sreekanth Ramakrishnan commented on MAPREDUCE-1075: --- Took a look at the patch. We should handle the {{IOException}} at the client end and print only the message on the CLI. The test case accesses the {{JobTracker}} directly; why don't we use the client APIs to check the same thing? getQueue(String queue) in JobTracker would return NPE for invalid queue name Key: MAPREDUCE-1075 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1075 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: V.V.Chaitanya Krishna Assignee: V.V.Chaitanya Krishna Fix For: 0.21.0 Attachments: MAPREDUCE-1075-1.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-964) Inaccurate values in jobSummary logs
[ https://issues.apache.org/jira/browse/MAPREDUCE-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-964: - Attachment: mapreduce-964-ydist.patch Attaching the Yahoo! distribution patch for the issue, incorporating Hemanth's comments. Inaccurate values in jobSummary logs Key: MAPREDUCE-964 URL: https://issues.apache.org/jira/browse/MAPREDUCE-964 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Rajiv Chittajallu Assignee: Sreekanth Ramakrishnan Priority: Critical Attachments: mapreduce-964-1.patch, mapreduce-964-ydist.patch For some jobs the mapSlotSeconds is incorrect. negative value 09/09/01 18:31:44 INFO mapred.JobInProgress$JobSummary: jobId=job_200908270718_4568,submitTime=1251823543976,launchTime=1251823554310,finishTime=1251829904565, numMaps=7965,numSlotsPerMap=1,numReduces=40,numSlotsPerReduce=1,user=wile,queue=runner,status=SUCCEEDED, mapSlotSeconds=-2503133523,reduceSlotsSeconds=186536,clusterMapCapacity=11262,clusterReduceCapacity=3754 or too high 09/09/02 23:59:57 INFO mapred.JobInProgress$JobSummary: jobId=job_200908270718_5861,submitTime=1251935672924,launchTime=1251935687698,finishTime=1251935997949, numMaps=1026,numSlotsPerMap=1,numReduces=10,numSlotsPerReduce=1,user=dfsload,queue=gridops,status=SUCCEEDED, mapSlotSeconds=1251949742,reduceSlotsSeconds=537,clusterMapCapacity=11262,clusterReduceCapacity=3754 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-964) Inaccurate values in jobSummary logs
[ https://issues.apache.org/jira/browse/MAPREDUCE-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-964: - Attachment: mapreduce-964-2.patch Attaching a patch incorporating Hemanth's comments. Inaccurate values in jobSummary logs Key: MAPREDUCE-964 URL: https://issues.apache.org/jira/browse/MAPREDUCE-964 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: Rajiv Chittajallu Assignee: Sreekanth Ramakrishnan Priority: Critical Attachments: mapreduce-964-1.patch, mapreduce-964-2.patch, mapreduce-964-ydist.patch For some jobs the mapSlotSeconds is incorrect. negative value 09/09/01 18:31:44 INFO mapred.JobInProgress$JobSummary: jobId=job_200908270718_4568,submitTime=1251823543976,launchTime=1251823554310,finishTime=1251829904565, numMaps=7965,numSlotsPerMap=1,numReduces=40,numSlotsPerReduce=1,user=wile,queue=runner,status=SUCCEEDED, mapSlotSeconds=-2503133523,reduceSlotsSeconds=186536,clusterMapCapacity=11262,clusterReduceCapacity=3754 or too high 09/09/02 23:59:57 INFO mapred.JobInProgress$JobSummary: jobId=job_200908270718_5861,submitTime=1251935672924,launchTime=1251935687698,finishTime=1251935997949, numMaps=1026,numSlotsPerMap=1,numReduces=10,numSlotsPerReduce=1,user=dfsload,queue=gridops,status=SUCCEEDED, mapSlotSeconds=1251949742,reduceSlotsSeconds=537,clusterMapCapacity=11262,clusterReduceCapacity=3754 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (MAPREDUCE-1002) After MAPREDUCE-862, command line queue-list doesn't print any queues
[ https://issues.apache.org/jira/browse/MAPREDUCE-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan reassigned MAPREDUCE-1002: - Assignee: V.V.Chaitanya Krishna After MAPREDUCE-862, command line queue-list doesn't print any queues - Key: MAPREDUCE-1002 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1002 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Reporter: Vinod K V Assignee: V.V.Chaitanya Krishna Fix For: 0.21.0 Web-ui correctly prints the queues, it is the command line that is not showing any queues. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-964) Inaccurate values in jobSummary logs
[ https://issues.apache.org/jira/browse/MAPREDUCE-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12757008#action_12757008 ] Sreekanth Ramakrishnan commented on MAPREDUCE-964: -- The negative values in the JobSummary result from the finishTime field not being set in the taskStatus. This happens especially when a task is killed just as it finishes, resulting in a race condition. The attempt mentioned below had actually finished, and a kill task action was received for the same attempt at the same time. I modified the code to add a debug statement; the following is an example of a task which upset the metering: {noformat} 2009-09-18 06:18:32,001 INFO org.apache.hadoop.mapred.JobInProgress: TaskDebug attemptId : attempt_200909180346_0004_m_000575_0 slots : SLOTS_MILLIS_MAPS tip.numslots is: 1 difference to add : -1253254705577 status start : 1253254705577 status end time : 0 {noformat} TT logs : {noformat} 2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200909180346_0004_m_000575_0 task's state:UNASSIGNED 2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200909180346_0004_m_000575_0 which needs 1 slots 2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_200909180346_0004_m_000575_0 which needs 1 slots 2009-09-18 06:18:02,750 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_200909180346_0004_m_1883468461 given task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:09,034 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 0.18553433% /my_reliability_test_input/part-00032:335544320+67108864 2009-09-18 06:18:12,040 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 0.25399202% 
xxx/my_reliability_test_input/part-00032:335544320+67108864 2009-09-18 06:18:15,317 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 0.3378629% xxx/my_reliability_test_input/part-00032:335544320+67108864 2009-09-18 06:18:15,319 INFO org.apache.hadoop.mapred.TaskTracker: Received KillTaskAction for task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:15,319 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:20,413 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0 done; removing files. 2009-09-18 06:18:20,415 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200909180346_0004_m_000575_0 not found in cache 2009-09-18 06:18:25,511 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0 done; removing files. 2009-09-18 06:18:25,515 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200909180346_0004_m_000575_0 task's state:FAILED_UNCLEAN 2009-09-18 06:18:25,516 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200909180346_0004_m_000575_0 which needs 1 slots 2009-09-18 06:18:25,516 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_200909180346_0004_m_000575_0 which needs 1 slots 2009-09-18 06:18:26,651 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_200909180346_0004_m_-1834354161 given task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:26,888 INFO org.apache.hadoop.mapred.TaskTracker: Received KillTaskAction for task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:26,888 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:31,986 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 0.0% 2009-09-18 06:18:31,986 INFO org.apache.hadoop.mapred.TaskTracker: 
attempt_200909180346_0004_m_000575_0 Ignoring status-update since runState: FAILED 2009-09-18 06:18:31,989 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0 done; removing files. 2009-09-18 06:18:31,990 WARN org.apache.hadoop.ipc.Server: IPC Server Responder, call statusUpdate(attempt_200909180346_0004_m_000575_0, org.apache.hadoop.mapred.maptasksta...@4e69b84c) from 127.0.0.1:51146: output error 2009-09-18 06:18:31,995 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200909180346_0004_m_000575_0 not found in cache 2009-09-18 06:18:34,992 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0 done; removing files. {noformat} Inaccurate values in jobSummary logs Key: MAPREDUCE-964 URL: https://issues.apache.org/jira/browse/MAPREDUCE-964 Project: Hadoop
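The TaskDebug line in the comment above shows the arithmetic behind the negative value: with an unset finish time (0), `finish - start` becomes `0 - 1253254705577`. A minimal Java sketch of that calculation follows, with one possible guard. The class and method names are hypothetical and are not taken from the Hadoop code base, and the guard is only an assumption about how such attempts could be skipped, not a description of what the attached patches do.

```java
// Hypothetical sketch: SlotMeteringSketch and slotMillis are illustrative names,
// not part of Hadoop. A finish time of 0 is assumed to mean "never set".
final class SlotMeteringSketch {

    /** Slot-milliseconds contributed by one task attempt. */
    static long slotMillis(long startTime, long finishTime, int numSlots) {
        if (startTime <= 0 || finishTime < startTime) {
            // Missing or inconsistent timestamps: contribute nothing
            // rather than a large negative number.
            return 0L;
        }
        return (finishTime - startTime) * numSlots;
    }

    public static void main(String[] args) {
        long start = 1253254705577L; // "status start" from the TaskDebug line
        long end = 0L;               // "status end time" from the TaskDebug line
        int slots = 1;               // "tip.numslots"

        // Unguarded arithmetic reproduces "difference to add : -1253254705577".
        System.out.println("unguarded = " + (end - start) * slots);
        System.out.println("guarded   = " + slotMillis(start, end, slots));
    }
}
```

Dropping the attempt from the total is only one option; setting the finish time when the kill is processed would keep the attempt's real run time in the summary.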
[jira] Commented: (MAPREDUCE-862) Modify UI to support a hierarchy of queues
[ https://issues.apache.org/jira/browse/MAPREDUCE-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12754869#action_12754869 ] Sreekanth Ramakrishnan commented on MAPREDUCE-862: -- Took a look at the patch: * Add a link in the cluster status table redirecting to the new queue information page. * Move the queue information to a new JSP. Redirect to this page if we are not able to connect to the YUI servers. Hemanth's offline comments: * Add a recursive option in the CLI for queue -info queue-name * Add a new field in queue -info queue-name which lists only the top-level child queues. Modify UI to support a hierarchy of queues -- Key: MAPREDUCE-862 URL: https://issues.apache.org/jira/browse/MAPREDUCE-862 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Hemanth Yamijala Assignee: V.V.Chaitanya Krishna Attachments: clustersummarymodification.png, detailspage.png, initialscreen.png, MAPREDUCE-862-1.patch, subqueue.png MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce framework. This JIRA is for defining changes to the UI related to queues. This includes the hadoop queue CLI and the web UI on the JobTracker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-945) Test programs support only default queue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-945: - Status: Open (was: Patch Available) Test programs support only default queue. - Key: MAPREDUCE-945 URL: https://issues.apache.org/jira/browse/MAPREDUCE-945 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Suman Sehgal Attachments: mapreduce-945-1.patch, mapreduce-945-2.patch None of the test programs seem to support the concept of queues. These programs look for the default queue only, even if some other queue is specified to run them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-945) Test programs support only default queue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752539#action_12752539 ] Sreekanth Ramakrishnan commented on MAPREDUCE-945: -- All contrib and core tests passed locally. Output from ant test-patch: {noformat} [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 6 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] {noformat} Test programs support only default queue. - Key: MAPREDUCE-945 URL: https://issues.apache.org/jira/browse/MAPREDUCE-945 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Suman Sehgal Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-945-1.patch, mapreduce-945-2.patch, mapreduce-945-internal-3.8.patch.txt None of the test programs seem to support the concept of queues. These programs look for the default queue only, even if some other queue is specified to run them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-945) Test programs support only default queue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-945: - Attachment: mapreduce-945-internal-3.8.patch.txt Attaching Y! distribution patch. Test programs support only default queue. - Key: MAPREDUCE-945 URL: https://issues.apache.org/jira/browse/MAPREDUCE-945 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Suman Sehgal Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-945-1.patch, mapreduce-945-2.patch, mapreduce-945-internal-3.8.patch.txt None of the test programs seem to support the concept of queues. These programs look for the default queue only, even if some other queue is specified to run them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-28) TestQueueManager takes too long and times out some times
[ https://issues.apache.org/jira/browse/MAPREDUCE-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752327#action_12752327 ] Sreekanth Ramakrishnan commented on MAPREDUCE-28: - After discussion with Rahul and looking at the test cases which were written for MAPREDUCE-861, the path forward would be to test just the semantic meaning of the configured ACLs in {{TestQueueManager}}; the state and ACL refresh is already taken care of by the test case introduced in {{MAPREDUCE-861}}. TestQueueManager takes too long and times out some times Key: MAPREDUCE-28 URL: https://issues.apache.org/jira/browse/MAPREDUCE-28 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Amareshwari Sriramadasu TestQueueManager takes a long time to run and sometimes times out. See the failure at http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3875/testReport/. Looking at the console output, the test timed out before it finished. On my machine, the test takes about 5 minutes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-28) TestQueueManager takes too long and times out some times
[ https://issues.apache.org/jira/browse/MAPREDUCE-28?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12750919#action_12750919 ] Sreekanth Ramakrishnan commented on MAPREDUCE-28: - Currently, job submission in the JobTracker performs the following steps: 1. Create a new JobInProgress object. 2. Verify that the queue to which the job is submitted actually exists. 3. Check if the queue is running. 4. Check if the user has access for running the job. 5. Check the memory requirement. All of the JobTracker methods involved are delegation calls to the appropriate QueueManager method, so we can change the following test cases to call the appropriate QueueManager method directly instead of starting up a MiniMRCluster: - testAllEnabledACLForJobSubmission() - testAllDisabledACLForJobSubmission() - testUserDisabledACLForJobSubmission() - testDisabledACLForNonDefaultQueue() - testSubmissionToInvalidQueue() - testEnabledACLForNonDefaultQueue() - testUserEnabledACLForJobSubmission() - testGroupsEnabledACLForJobSubmission() - testAllEnabledACLForJobKill() - testAllDisabledACLForJobKill() - testOwnerAllowedForJobKill() - testUserDisabledACLForJobKill() - testUserEnabledACLForJobKill() - testUserDisabledForJobPriorityChange() The test case testStateRefresh() can also be made a unit test instead of starting up a cluster, the same way as we test ACL refresh. We can then add a new integration test case to test the following conditions: 1. ACLs for a user in a queue where he is denied/given access to submit and manage a job 2. State of the queue. TestQueueManager takes too long and times out some times Key: MAPREDUCE-28 URL: https://issues.apache.org/jira/browse/MAPREDUCE-28 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Amareshwari Sriramadasu TestQueueManager takes a long time to run and sometimes times out. See the failure at http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3875/testReport/. 
Looking at the console output, the test timed out before it finished. On my machine, the test takes about 5 minutes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
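The comment above proposes checking queue existence, queue state and ACLs by calling the queue manager directly rather than through a running cluster. The Java sketch below illustrates that style of test using a small stand-in class. `QueueAclSketch`, its `Op` enum and all of its methods are hypothetical and written only for this illustration; they are not the Hadoop `QueueManager` API.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a queue manager; not the Hadoop QueueManager class.
final class QueueAclSketch {
    enum Op { SUBMIT_JOB, ADMINISTER_JOBS }

    private final Map<String, Map<Op, Set<String>>> acls = new HashMap<>();
    private final Set<String> runningQueues = new HashSet<>();

    void addQueue(String queue, boolean running) {
        acls.put(queue, new EnumMap<Op, Set<String>>(Op.class));
        if (running) {
            runningQueues.add(queue);
        }
    }

    /** Grants an operation to a user; "*" is assumed to mean every user. */
    void allow(String queue, Op op, String user) {
        acls.get(queue).computeIfAbsent(op, k -> new HashSet<>()).add(user);
    }

    boolean hasAccess(String queue, Op op, String user) {
        Map<Op, Set<String>> perOp = acls.get(queue);
        if (perOp == null) {
            return false; // unknown queue
        }
        Set<String> users = perOp.getOrDefault(op, Collections.emptySet());
        return users.contains("*") || users.contains(user);
    }

    /** Mirrors steps 2 to 4 of the list above: exists, is running, user has access. */
    boolean canSubmit(String queue, String user) {
        return acls.containsKey(queue)
            && runningQueues.contains(queue)
            && hasAccess(queue, Op.SUBMIT_JOB, user);
    }

    public static void main(String[] args) {
        QueueAclSketch qm = new QueueAclSketch();
        qm.addQueue("default", true);
        qm.allow("default", Op.SUBMIT_JOB, "alice");
        System.out.println("alice on default: " + qm.canSubmit("default", "alice"));
        System.out.println("bob on default:   " + qm.canSubmit("default", "bob"));
        System.out.println("alice on missing: " + qm.canSubmit("nosuchqueue", "alice"));
    }
}
```

A test in this style needs no cluster start-up, which is where most of the run time of the original test went; an integration test would still be needed to show that job submission actually consults these checks.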
[jira] Updated: (MAPREDUCE-945) Test programs support only default queue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-945: - Attachment: mapreduce-945-1.patch Test programs support only default queue. - Key: MAPREDUCE-945 URL: https://issues.apache.org/jira/browse/MAPREDUCE-945 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Suman Sehgal Attachments: mapreduce-945-1.patch None of the test programs seem to support the concept of queues. These programs look for the default queue only, even if some other queue is specified to run them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-521) After JobTracker restart Capacity Schduler does not schedules pending tasks from already running tasks.
[ https://issues.apache.org/jira/browse/MAPREDUCE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12750385#action_12750385 ] Sreekanth Ramakrishnan commented on MAPREDUCE-521: -- Closing this as invalid, as the problem no longer exists now that MAPREDUCE-873 is committed. After JobTracker restart Capacity Schduler does not schedules pending tasks from already running tasks. Key: MAPREDUCE-521 URL: https://issues.apache.org/jira/browse/MAPREDUCE-521 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Karam Singh Assignee: rahul k singh Attachments: hadoop-5739-20-alt.patch, hadoop-5739-3.patch, hadoop-5739-4.patch, hadoop-5739-5.patch, hadoop-5739-6-Version20.patch, hadoop-5739-6.patch, hadoop-5739-7.patch, hadoop-5739-latest.patch, hadoop-5739.patch, hadoop-5739.patch After a JobTracker restart, the Capacity Scheduler does not schedule pending tasks from already running jobs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Moved: (MAPREDUCE-945) Test programs support only default queue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan moved HADOOP-5536 to MAPREDUCE-945: -- Component/s: (was: test) test Affects Version/s: (was: 0.20.0) Key: MAPREDUCE-945 (was: HADOOP-5536) Project: Hadoop Map/Reduce (was: Hadoop Common) Test programs support only default queue. - Key: MAPREDUCE-945 URL: https://issues.apache.org/jira/browse/MAPREDUCE-945 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Suman Sehgal None of the test programs seem to support the concept of queues. These programs look for the default queue only, even if some other queue is specified to run them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-945) Test programs support only default queue.
[ https://issues.apache.org/jira/browse/MAPREDUCE-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12750281#action_12750281 ] Sreekanth Ramakrishnan commented on MAPREDUCE-945: -- After the project split, the issue has to be split into two parts. Created a new HDFS JIRA: HDFS-587 Test programs support only default queue. - Key: MAPREDUCE-945 URL: https://issues.apache.org/jira/browse/MAPREDUCE-945 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Suman Sehgal None of the test programs seem to support the concept of queues. These programs look for the default queue only, even if some other queue is specified to run them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.
[ https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-861: - Attachment: MAPREDUCE-861-2.patch Uploading the patch on behalf of Rahul. Modify queue configuration format and parsing to support a hierarchy of queues. --- Key: MAPREDUCE-861 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Hemanth Yamijala Assignee: rahul k singh Attachments: MAPREDUCE-861-1.patch, MAPREDUCE-861-2.patch MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce framework. This JIRA is for defining changes to the configuration related to queues. The current format for defining a queue and its properties is as follows: mapred.queue.queue-name.property-name. For example, mapred.queue.queue-name.acl-submit-job. The reason for using this verbose format was to be able to reuse the Configuration parser in Hadoop. However, administrators currently using the queue configuration have already indicated a very strong desire for a more manageable format. Since this becomes more unwieldy with hierarchical queues, the time may be good to introduce a new format for representing queue configuration. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
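To make the flat format described above concrete, here is a minimal Java sketch that groups keys of the form mapred.queue.queue-name.property-name by queue. It is only an illustration: the class and method names are hypothetical, it works on a plain `Map` rather than Hadoop's `Configuration`, and it assumes that property names contain no dots (everything between the prefix and the last dot is taken as the queue name).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the parsing code used by Hadoop.
final class FlatQueueConfigSketch {
    private static final String PREFIX = "mapred.queue.";

    /** Groups "mapred.queue.<queue>.<property>" entries into queue -> (property -> value). */
    static Map<String, Map<String, String>> groupByQueue(Map<String, String> conf) {
        Map<String, Map<String, String>> queues = new TreeMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            String key = e.getKey();
            if (!key.startsWith(PREFIX)) {
                continue; // not a queue property
            }
            String rest = key.substring(PREFIX.length());
            int dot = rest.lastIndexOf('.');
            if (dot <= 0 || dot == rest.length() - 1) {
                continue; // no queue name or no property name
            }
            String queue = rest.substring(0, dot);
            String property = rest.substring(dot + 1);
            queues.computeIfAbsent(queue, q -> new TreeMap<>()).put(property, e.getValue());
        }
        return queues;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapred.queue.default.acl-submit-job", "alice,bob");
        conf.put("mapred.queue.default.acl-administer-jobs", "admin");
        conf.put("mapred.queue.research.acl-submit-job", "*");
        conf.put("some.other.key", "ignored");
        System.out.println(groupByQueue(conf));
    }
}
```

Every queue repeats the full prefix for each of its properties, and nothing in the key structure can express a parent/child relation between queues, which is the unwieldiness the issue description refers to.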
[jira] Commented: (MAPREDUCE-777) A method for finding and tracking jobs from the new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12749374#action_12749374 ] Sreekanth Ramakrishnan commented on MAPREDUCE-777: -- With respect to {noformat} public class QueueInfo { String getName() { return null; } String getSchedulingInfo() throws IOException { return null; } Job[] getJobs(int maxJobs) throws IOException { return null; } QueueState getState() throws IOException { return null; } } {noformat} Currently the class {{org.apache.hadoop.mapred.JobQueueInfo}} is a client-only view of the information pertaining to a queue and is not used in the framework for any other purpose. Why don't we reuse it instead of creating a new class? Also, in the framework a queue has so far been nothing but a tag associated with a job; schedulers need not honor the queue and can store all jobs in a single queue rather than in separate queues. Are we planning to change that? Next, sending the list of jobs on every client request might not be required. There are currently two queue commands: {{queue -list}}, which prints the list of queues and their associated scheduling information, and {{queue -info queuename [-listjobs]}}, where listing jobs is optional. With the proposed API we might end up sending the list of jobs every time, even though the client does not request it. Finally, MAPREDUCE-853 is introducing a hierarchy of queues, and we should also try to handle those scenarios in this JIRA. A method for finding and tracking jobs from the new API --- Key: MAPREDUCE-777 URL: https://issues.apache.org/jira/browse/MAPREDUCE-777 Project: Hadoop Map/Reduce Issue Type: New Feature Components: client Reporter: Owen O'Malley Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: m-777.patch, patch-777-1.txt, patch-777-2.txt, patch-777.txt We need to create a replacement interface for the JobClient API in the new interface. 
In particular, the user needs to be able to query and track jobs that were launched by other processes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-834) When TaskTracker config use old memory management values its memory monitoring is diabled.
[ https://issues.apache.org/jira/browse/MAPREDUCE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-834: - Release Note: The issue fixes the backward compatibility issue which caused memory monitoring not to be started when TaskTracker was started with old configuration. When TaskTracker config use old memory management values its memory monitoring is diabled. -- Key: MAPREDUCE-834 URL: https://issues.apache.org/jira/browse/MAPREDUCE-834 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Karam Singh Assignee: Sreekanth Ramakrishnan Fix For: 0.20.1 Attachments: mapred-834-20.patch, mapreduce-834-1.patch, mapreduce-834-2.patch, mapreduce-834-3.patch, mapreduce-834-4.patch, mapreduce-834-ydist.patch TaskTracker memory config values -: mapred.tasktracker.vmem.reserved=8589934592 mapred.task.default.maxvmem=2147483648 mapred.task.limit.maxvmem=4294967296 mapred.tasktracker.pmem.reserved=2147483648 TaskTracker start as -: 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.vmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.pmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.default.maxvmem is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.limit.maxvmem is no longer used 2009-08-05 12:39:03,308 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_name 2009-08-05 12:39:03,309 INFO org.apache.hadoop.mapred.TaskTracker: Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@19be4777 2009-08-05 12:39:03,311 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-682) Reserved tasktrackers should be removed when a node is globally blacklisted
[ https://issues.apache.org/jira/browse/MAPREDUCE-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-682: - Release Note: Fixes the issue with respect to high ram job reservation not being cleared when the task tracker is black listed. Reserved tasktrackers should be removed when a node is globally blacklisted --- Key: MAPREDUCE-682 URL: https://issues.apache.org/jira/browse/MAPREDUCE-682 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.21.0 Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Fix For: 0.21.0 Attachments: mapreduce-682-1.patch, mapreduce-682-2.patch, mapreduce-682-ydist.patch When support was added to reserve tasktrackers for high RAM jobs per MAPREDUCE-516, we missed removing reservations on tasktrackers that are globally blacklisted. This is not a major concern, just that the reservation might cause the job to finish a little later than if the reservation is removed when the blacklisting happens. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-802) Simplify the job updated event notification between Jobtracker and schedulers
[ https://issues.apache.org/jira/browse/MAPREDUCE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-802: - Attachment: eventmodel-3.patch Attaching new patch merging changes in the trunk. Simplify the job updated event notification between Jobtracker and schedulers - Key: MAPREDUCE-802 URL: https://issues.apache.org/jira/browse/MAPREDUCE-802 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Attachments: eventmodel-1.patch, eventmodel-2.patch, eventmodel-3.patch HADOOP-4053 and HADOOP-4149 added events so that updates to the state or properties of a job, such as its run state or priority, are notified to the scheduler. We've seen some issues with this framework, such as the following: - Events are not raised correctly at all places. If a new code path is added to kill a job, raising events is missed out. - Events are raised with incorrect event data. For example, typically the start time value is missed out. The resulting contract break between jobtracker and schedulers has led to problems in the capacity scheduler where jobs remain stuck in the queue without ever being removed, and so on. It has proven complicated to get this right in the framework and fixes have typically still left dangling cases. Or new code paths introduce new bugs. This JIRA is about trying to simplify the interaction model so that it is more robust and works well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-768) Configuration information should generate dump in a standard format.
[ https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-768: - Status: Open (was: Patch Available) Configuration information should generate dump in a standard format. Key: MAPREDUCE-768 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: rahul k singh Assignee: V.V.Chaitanya Krishna Attachments: jobtracker_configurationdump.txt, MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768-3.patch, MAPREDUCE-768-4.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-5.patch, MAPREDUCE-768-6.patch, MAPREDUCE-768.patch We need to generate the configuration dump in a standard format . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.
[ https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12745884#action_12745884 ] Sreekanth Ramakrishnan commented on MAPREDUCE-768: --
Took a look at the patch:
* Extract the printing of usage into a new method.
* Change the usage string to JobTracker [-dumpConfiguration]
* Change the current if else condition in {{JobTracker}} to do the following:
{code}
if args.length == 0
  start jobtracker
else if args.length == 1 and args[0] == -dumpConfiguration
  dump configuration
else
  print usage
{code}
Configuration information should generate dump in a standard format. Key: MAPREDUCE-768 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: rahul k singh Attachments: MAPREDUCE-768-1.patch, MAPREDUCE-768.patch We need to generate the configuration dump in a standard format . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
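The argument handling suggested in the review comment above could look roughly like the following Java sketch. It is illustrative only and is not taken from the patch: the class {{JobTrackerArgs}}, the {{Action}} enum and the {{decide}} method are hypothetical names, and the flag spelling follows the usage string quoted in the comment.

```java
/**
 * Hypothetical helper (not part of Hadoop) that maps command-line
 * arguments to the three outcomes named in the review comment.
 */
final class JobTrackerArgs {
  /** The three possible outcomes of parsing the arguments. */
  enum Action { START, DUMP_CONFIGURATION, PRINT_USAGE }

  /** Flag spelling assumed from the usage string "JobTracker [-dumpConfiguration]". */
  static final String DUMP_FLAG = "-dumpConfiguration";

  private JobTrackerArgs() {}

  static Action decide(String[] args) {
    if (args.length == 0) {
      // No arguments: start the jobtracker as usual.
      return Action.START;
    }
    if (args.length == 1 && DUMP_FLAG.equals(args[0])) {
      // Exactly one argument and it is the dump flag.
      return Action.DUMP_CONFIGURATION;
    }
    // Anything else is a usage error.
    return Action.PRINT_USAGE;
  }
}
```

Keeping the decision in one small method leaves the usage printing in its own method, as the first review point asks, and makes the three branches easy to unit test.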
[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.
[ https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12745900#action_12745900 ] Sreekanth Ramakrishnan commented on MAPREDUCE-768: -- The changes in the patch look fine to me. +1 to patch. Configuration information should generate dump in a standard format. Key: MAPREDUCE-768 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: rahul k singh Attachments: MAPREDUCE-768-1.patch, MAPREDUCE-768-2.patch, MAPREDUCE-768.patch We need to generate the configuration dump in a standard format . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-886) After 4491, when task-controller exit with some error message, LinuxTaskController only ExitCodeException but does not prints the exit code of task-controller
[ https://issues.apache.org/jira/browse/MAPREDUCE-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-886: - Attachment: mapreduce-886-1.patch Adding exit code to the logging method, logging exit code only if the process has exited with non-zero exit code. After 4491, when task-controller exit with some error message, LinuxTaskController only ExitCodeException but does not prints the exit code of task-controller -- Key: MAPREDUCE-886 URL: https://issues.apache.org/jira/browse/MAPREDUCE-886 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 0.21.0 Reporter: Karam Singh Attachments: mapreduce-886-1.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-834) When TaskTracker config use old memory management values its memory monitoring is disabled.
[ https://issues.apache.org/jira/browse/MAPREDUCE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12745384#action_12745384 ] Sreekanth Ramakrishnan commented on MAPREDUCE-834: -- The patch does not change any part on {{JobTracker}} or streaming api's. The issue with {{TestRecoveryManager}} timing out is reported on MAPREDUCE-880 and the streaming test case failures are also a known reported issue on hudson. When TaskTracker config use old memory management values its memory monitoring is diabled. -- Key: MAPREDUCE-834 URL: https://issues.apache.org/jira/browse/MAPREDUCE-834 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Karam Singh Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-834-1.patch, mapreduce-834-2.patch, mapreduce-834-3.patch, mapreduce-834-4.patch TaskTracker memory config values -: mapred.tasktracker.vmem.reserved=8589934592 mapred.task.default.maxvmem=2147483648 mapred.task.limit.maxvmem=4294967296 mapred.tasktracker.pmem.reserved=2147483648 TaskTracker start as -: 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.vmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.pmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.default.maxvmem is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.limit.maxvmem is no longer used 2009-08-05 12:39:03,308 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_name 2009-08-05 12:39:03,309 INFO org.apache.hadoop.mapred.TaskTracker: Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@19be4777 2009-08-05 12:39:03,311 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. 
TaskMemoryManager is disabled. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-834) When TaskTracker config use old memory management values its memory monitoring is disabled.
[ https://issues.apache.org/jira/browse/MAPREDUCE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-834: - Attachment: mapred-834-20.patch Attaching patch for branch 20. When TaskTracker config use old memory management values its memory monitoring is diabled. -- Key: MAPREDUCE-834 URL: https://issues.apache.org/jira/browse/MAPREDUCE-834 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Karam Singh Assignee: Sreekanth Ramakrishnan Attachments: mapred-834-20.patch, mapreduce-834-1.patch, mapreduce-834-2.patch, mapreduce-834-3.patch, mapreduce-834-4.patch TaskTracker memory config values -: mapred.tasktracker.vmem.reserved=8589934592 mapred.task.default.maxvmem=2147483648 mapred.task.limit.maxvmem=4294967296 mapred.tasktracker.pmem.reserved=2147483648 TaskTracker start as -: 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.vmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.pmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.default.maxvmem is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.limit.maxvmem is no longer used 2009-08-05 12:39:03,308 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_name 2009-08-05 12:39:03,309 INFO org.apache.hadoop.mapred.TaskTracker: Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@19be4777 2009-08-05 12:39:03,311 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-834) When TaskTracker config use old memory management values its memory monitoring is disabled.
[ https://issues.apache.org/jira/browse/MAPREDUCE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-834: - Attachment: mapreduce-834-2.patch Attaching patch fixing using the correct key and converting the same to MB while allotting total memory alloted to the tasks. When TaskTracker config use old memory management values its memory monitoring is diabled. -- Key: MAPREDUCE-834 URL: https://issues.apache.org/jira/browse/MAPREDUCE-834 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Karam Singh Attachments: mapreduce-834-1.patch, mapreduce-834-2.patch TaskTracker memory config values -: mapred.tasktracker.vmem.reserved=8589934592 mapred.task.default.maxvmem=2147483648 mapred.task.limit.maxvmem=4294967296 mapred.tasktracker.pmem.reserved=2147483648 TaskTracker start as -: 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.vmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.tasktracker.pmem.reserved is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.default.maxvmem is no longer used 2009-08-05 12:39:03,308 WARN org.apache.hadoop.mapred.TaskTracker: The variable mapred.task.limit.maxvmem is no longer used 2009-08-05 12:39:03,308 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_name 2009-08-05 12:39:03,309 INFO org.apache.hadoop.mapred.TaskTracker: Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@19be4777 2009-08-05 12:39:03,311 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
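The unit conversion mentioned in the attachment note above can be illustrated with a small Java sketch. It is not the patch itself: the class {{MemoryUnits}} and the method {{bytesToMb}} are hypothetical names, and the sketch assumes the old values are expressed in bytes and that a negative value means the limit is disabled, as the -1 in the log suggests.

```java
/** Hypothetical helper (not part of Hadoop) for converting memory settings. */
final class MemoryUnits {
  /** Assumed sentinel for "memory management disabled". */
  static final long DISABLED = -1L;

  private static final long BYTES_PER_MB = 1024L * 1024L;

  private MemoryUnits() {}

  /** Converts a byte count to whole megabytes; negative input stays disabled. */
  static long bytesToMb(long bytes) {
    if (bytes < 0) {
      return DISABLED;
    }
    return bytes / BYTES_PER_MB;
  }
}
```

With the values from the report, 8589934592 bytes corresponds to 8192 MB and 2147483648 bytes to 2048 MB.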
[jira] Updated: (MAPREDUCE-862) Modify UI to support a hierarchy of queues
[ https://issues.apache.org/jira/browse/MAPREDUCE-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-862: - Attachment: initialscreen.png detailspage.png clustersummarymodification.png Attaching screens of how the UI would look for modified queue design. Cluster summary would be modified, to introduce a new column which will have a number of Queue, which will be linked to modified queue details page, which is described in initialscreen.png. From initalscreen.png we can click on the queue hierarchy, which would have two pages, for {{ContainerQueues}} we would not have a job list and for {{JobQueue}} we have a job list apart from scheduling information. Modify UI to support a hierarchy of queues -- Key: MAPREDUCE-862 URL: https://issues.apache.org/jira/browse/MAPREDUCE-862 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Hemanth Yamijala Attachments: clustersummarymodification.png, detailspage.png, initialscreen.png, subqueue.png MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce framework. This JIRA is for defining changes to the UI related to queues. This includes the hadoop queue CLI and the web UI on the JobTracker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-862) Modify UI to support a hierarchy of queues
[ https://issues.apache.org/jira/browse/MAPREDUCE-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-862: - Attachment: subqueue.png Modify UI to support a hierarchy of queues -- Key: MAPREDUCE-862 URL: https://issues.apache.org/jira/browse/MAPREDUCE-862 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Hemanth Yamijala Attachments: clustersummarymodification.png, detailspage.png, initialscreen.png, subqueue.png MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce framework. This JIRA is for defining changes to the UI related to queues. This includes the hadoop queue CLI and the web UI on the JobTracker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-802) Simplify the job updated event notification between Jobtracker and schedulers
[ https://issues.apache.org/jira/browse/MAPREDUCE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-802: - Attachment: eventmodel-2.patch Attaching patch merging with MAPREDUCE-805 changes in the mapreduce trunk. Simplify the job updated event notification between Jobtracker and schedulers - Key: MAPREDUCE-802 URL: https://issues.apache.org/jira/browse/MAPREDUCE-802 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Attachments: eventmodel-1.patch, eventmodel-2.patch HADOOP-4053 and HADOOP-4149 added events to take care of updates to the state / property of a job like the run state / priority of a job notified to the scheduler. We've seen some issues with this framework, such as the following: - Events are not raised correctly at all places. If a new code path is added to kill a job, raising events is missed out. - Events are raised with incorrect event data. For e.g. typically start time value is missed out. The resulting contract break between jobtracker and schedulers has lead to problems in the capacity scheduler where jobs remain stuck in the queue without being ever removed and so on. It has proven complicated to get this right in the framework and fixes have typically still left dangling cases. Or new code paths introduce new bugs. This JIRA is about trying to simplify the interaction model so that it is more robust and works well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-478) separate jvm param for mapper and reducer
[ https://issues.apache.org/jira/browse/MAPREDUCE-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742681#action_12742681 ] Sreekanth Ramakrishnan commented on MAPREDUCE-478: -- Also on second thought, in my opinion [HADOOP-6105|http://issues.apache.org/jira/browse/HADOOP-6105] actually helps the issue which is mentioned here, it provides you automatic facility to split old key into two new keys. separate jvm param for mapper and reducer - Key: MAPREDUCE-478 URL: https://issues.apache.org/jira/browse/MAPREDUCE-478 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Koji Noguchi Assignee: Arun C Murthy Priority: Minor Fix For: 0.21.0 Attachments: HADOOP-5684_0_20090420.patch, MAPREDUCE-478_0_20090804.patch, MAPREDUCE-478_0_20090804_yhadoop20.patch, MAPREDUCE-478_1_20090806.patch, MAPREDUCE-478_1_20090806_yhadoop20.patch Memory footprint of mapper and reducer can differ. It would be nice if we can pass different jvm param (mapred.child.java.opts) for mappers and reducers. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-768) Configuration information should generate dump in a standard format.
[ https://issues.apache.org/jira/browse/MAPREDUCE-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12741670#action_12741670 ] Sreekanth Ramakrishnan commented on MAPREDUCE-768: -- The resource field of a configuration property is the last resource from which the property's value was loaded. The motivation for this field is to let administrators see whether they have accidentally overridden a property they did not mean to. Configuration information should generate dump in a standard format. Key: MAPREDUCE-768 URL: https://issues.apache.org/jira/browse/MAPREDUCE-768 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: rahul k singh We need to generate the configuration dump in a standard format . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12741191#action_12741191 ] Sreekanth Ramakrishnan commented on MAPREDUCE-824: -- Started taking look at the patch, following are comments: QueueSchedulingContext: * Can a comment be added for the each of the fields of the QueueSchedulingContext, stating what each is meant to do? AbstractQueue * Change cummulative to cumulative. * Childrens should be children * Can we move the queue scheduling information string into QueueSchedulingContext? As we have for the TaskSchedulingContext? * Can we take away state variables prevReduceClusterCapacity and prevMapClusterCapacity so there is no state maintained? JobQueue:- * Why are we having the jobStateChanged method in JobQueue, shouldnt it be in JobQueueManager which is JobInProgressListener and shouldn't it be handling the job state change event and handling and management of movement of jobs? JobInitalizationPoller: *Do we need to change the log statements? (Queue to AbstractQueue? Or Should it be JobQueue?) * What would QueueManager.getQueues() give to {{JobInitializationPoller}} in {{CapacityScheduler.start()}} Support a hierarchy of queues in the capacity scheduler --- Key: MAPREDUCE-824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/capacity-sched Reporter: Hemanth Yamijala Attachments: HADOOP-824-1.patch Currently in Capacity Scheduler, cluster capacity is divided among the queues based on the queue capacity. These queues typically represent an organization and the capacity of the queue represents the capacity the organization is entitled to. Most organizations are large and need to divide their capacity among sub-organizations they have. Or they may want to divide the capacity based on a category or type of jobs they run. 
This JIRA covers the requirements and other details to provide the above feature. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12741192#action_12741192 ] Sreekanth Ramakrishnan commented on MAPREDUCE-824: -- To clarify, the above comment was: {quote} What would QueueManager.getQueues() give to JobInitializationPoller in CapacityScheduler.start() {quote} The {{JobInitializationPoller}} acts only on {{JobQueue}}, not on {{ContainerQueue}}, which is configured by the {{QueueManager}}. I agree that a future patch would address configuration-related changes, but at least in the interim the patch should see to it that {{JobInitializationPoller}} gets only {{JobQueue}}. Support a hierarchy of queues in the capacity scheduler --- Key: MAPREDUCE-824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/capacity-sched Reporter: Hemanth Yamijala Attachments: HADOOP-824-1.patch Currently in Capacity Scheduler, cluster capacity is divided among the queues based on the queue capacity. These queues typically represent an organization and the capacity of the queue represents the capacity the organization is entitled to. Most organizations are large and need to divide their capacity among sub-organizations they have. Or they may want to divide the capacity based on a category or type of jobs they run. This JIRA covers the requirements and other details to provide the above feature. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-802) Simplify the job updated event notification between Jobtracker and schedulers
[ https://issues.apache.org/jira/browse/MAPREDUCE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-802: - Attachment: eventmodel-1.patch Attaching the patch which makes changes in the event model as described in the [comment|https://issues.apache.org/jira/browse/MAPREDUCE-802?focusedCommentId=12738226page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12738226] I have introduced {{JobSchedulingInfoIndex}} for removal based on the old {{JobSchedulingInfo}} as I thought the update of the jobs are happening with {{JobTracker}} lock. Simplify the job updated event notification between Jobtracker and schedulers - Key: MAPREDUCE-802 URL: https://issues.apache.org/jira/browse/MAPREDUCE-802 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Attachments: eventmodel-1.patch HADOOP-4053 and HADOOP-4149 added events to take care of updates to the state / property of a job like the run state / priority of a job notified to the scheduler. We've seen some issues with this framework, such as the following: - Events are not raised correctly at all places. If a new code path is added to kill a job, raising events is missed out. - Events are raised with incorrect event data. For e.g. typically start time value is missed out. The resulting contract break between jobtracker and schedulers has lead to problems in the capacity scheduler where jobs remain stuck in the queue without being ever removed and so on. It has proven complicated to get this right in the framework and fixes have typically still left dangling cases. Or new code paths introduce new bugs. This JIRA is about trying to simplify the interaction model so that it is more robust and works well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics
[ https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-779: - Status: Patch Available (was: Open) Add node health failures into JobTrackerStatistics -- Key: MAPREDUCE-779 URL: https://issues.apache.org/jira/browse/MAPREDUCE-779 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-779-1.patch, mapreduce-779-2.patch, mapreduce-779-3.patch, mapreduce-779-4.patch Add the node health failure counts into {{JobTrackerStatistics}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics
[ https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-779: - Attachment: mapreduce-779-4.patch Had added that method assuming that TaskTrackerStat is not created until first task is run/failed in a lazy manner, now rechecked that the stat object is created when tracker is added. So I have removed the method. Renamed the heading in JSP. Also added {{TestTaskTrackerBlacklisting}} to commit-tests list. Add node health failures into JobTrackerStatistics -- Key: MAPREDUCE-779 URL: https://issues.apache.org/jira/browse/MAPREDUCE-779 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: mapreduce-779-1.patch, mapreduce-779-2.patch, mapreduce-779-3.patch, mapreduce-779-4.patch Add the node health failure counts into {{JobTrackerStatistics}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-802) Simplify the job updated event notification between Jobtracker and schedulers
[ https://issues.apache.org/jira/browse/MAPREDUCE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12738226#action_12738226 ] Sreekanth Ramakrishnan commented on MAPREDUCE-802: --
Currently, the problems that arise in systems relying on job events can be classified into two categories:
# Not all code paths raise status change events. The reason is that the state change is performed in {{JobInProgress}}, which does not have a handle to the list of {{JobInProgressListener}} objects managed by the {{JobTracker}}. So the components that need the state change to remove or update their internal structures for a {{JobInProgress}} object are left out of sync.
# Listeners rely on the {{oldStatus}} field, and the members of that structure, being set correctly by the {{JobTracker}} before the listeners are called. A notable example is the start time change described in MAPREDUCE-45.
To solve the problems listed above, the following is a proposal:
* For case 1, whenever {{JobInProgress}} changes its state, we route the associated event to {{JobTracker}}. This will ensure that any part of the code that changes the {{JobStatus}} actually results in events being raised.
* For case 2, we remove the {{oldStatus}} field from {{JobStatusChangeEvent}}, as it is not always correct. This would be an incompatible change, since the old status is actually used in two schedulers: {{JobQueueJobInProgressListener}} for the default scheduler and {{JobQueueManager}} for the capacity scheduler. Both schedulers would now have to maintain their own link from old status to {{JobInProgress}}.
The proposed changes would change the current pseudo code for raising events from the following:
{noformat}
JobStatus oldStatus = job.getstatus.clone
make changes to jobs status.
JobStatus newStatus = job.getstatus.clone
create event with both old and new
inform listeners
{noformat}
to the following:
{noformat}
make changes to job
create JobChanged event
inform listeners
{noformat}
So schedulers would have to maintain, on their own, an association with the scheduling information they previously used to populate their internal structures, instead of relying on the {{JobTracker}} to send correct information. Currently, the default scheduler {{JobQueueTaskScheduler}} maintains the ordered list of jobs using a {{TreeMap<JobSchedulingInfo,JobInProgress>}}; during an update operation, the key of the map was constructed using the _oldStatus_ field of the {{JobStatusChangeEvent}}. With the proposed change, as _oldStatus_ is removed, the default scheduler would have to maintain its own association between a job and its scheduling info, i.e. a {{Map<JobID,JobSchedulingInfo>}} in which the value for a JobID is the current {{JobSchedulingInfo}} that was used to insert the job into the scheduler's {{TreeMap}}. When {{jobUpdated()}} is called, the old {{JobSchedulingInfo}} is removed from the {{TreeMap}} using the value from the {{Map}}, and then both the {{Map<JobID,JobSchedulingInfo>}} and the {{TreeMap<JobSchedulingInfo,JobInProgress>}} are updated with the most recent {{JobSchedulingInfo}}.
Any comments on the above proposal and the changes it would bring to the framework?
Simplify the job updated event notification between Jobtracker and schedulers - Key: MAPREDUCE-802 URL: https://issues.apache.org/jira/browse/MAPREDUCE-802 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobtracker Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan HADOOP-4053 and HADOOP-4149 added events to take care of updates to the state / property of a job like the run state / priority of a job notified to the scheduler. We've seen some issues with this framework, such as the following:
- Events are not raised correctly at all places. If a new code path is added to kill a job, raising events is missed out.
- Events are raised with incorrect event data. For e.g. typically start time value is missed out. The resulting contract break between jobtracker and schedulers has lead to problems in the capacity scheduler where jobs remain stuck in the queue without being ever removed and so on. It has proven complicated to get this right in the framework and fixes have typically still left dangling cases. Or new code paths introduce new bugs. This JIRA is about trying to simplify the interaction model so that it is more robust and works well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
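The bookkeeping proposed in the comment above, where the scheduler keeps its own map from job to the scheduling key it last inserted into the ordered map, can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the patch: {{SchedulingKey}} and {{JobOrderIndex}} are hypothetical stand-ins for {{JobSchedulingInfo}} and the scheduler's internal structures, job IDs are plain strings, and ordering by priority, then start time, then ID is assumed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for JobSchedulingInfo: the fields the queue is ordered by. */
final class SchedulingKey {
  final int priority;   // assumption: a lower value is scheduled first
  final long startTime;
  final String jobId;

  SchedulingKey(int priority, long startTime, String jobId) {
    this.priority = priority;
    this.startTime = startTime;
    this.jobId = jobId;
  }
}

/** Hypothetical scheduler-side index that no longer needs an oldStatus from the event. */
final class JobOrderIndex {
  private static final Comparator<SchedulingKey> ORDER = (a, b) -> {
    int c = Integer.compare(a.priority, b.priority);
    if (c != 0) return c;
    c = Long.compare(a.startTime, b.startTime);
    return c != 0 ? c : a.jobId.compareTo(b.jobId);
  };

  // Job ID -> key currently stored in the ordered map.
  private final Map<String, SchedulingKey> currentKey = new HashMap<>();
  // Ordered view used for scheduling decisions.
  private final TreeMap<SchedulingKey, String> ordered = new TreeMap<>(ORDER);

  void jobAdded(String jobId, int priority, long startTime) {
    SchedulingKey key = new SchedulingKey(priority, startTime, jobId);
    currentKey.put(jobId, key);
    ordered.put(key, jobId);
  }

  /** Removes the entry under the remembered old key, then re-inserts with fresh values. */
  void jobUpdated(String jobId, int priority, long startTime) {
    SchedulingKey old = currentKey.get(jobId);
    if (old != null) {
      ordered.remove(old);
    }
    jobAdded(jobId, priority, startTime);
  }

  void jobRemoved(String jobId) {
    SchedulingKey old = currentKey.remove(jobId);
    if (old != null) {
      ordered.remove(old);
    }
  }

  List<String> jobsInOrder() {
    return new ArrayList<>(ordered.values());
  }
}
```

The point of the second map is that the old key comes from the scheduler's own record rather than from event data, so a missing or stale start time in the event cannot leave an entry stranded in the ordered map.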
[jira] Commented: (MAPREDUCE-766) Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report.
[ https://issues.apache.org/jira/browse/MAPREDUCE-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737456#action_12737456 ] Sreekanth Ramakrishnan commented on MAPREDUCE-766: -- All tests passed locally, both core and contrib. Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report. - Key: MAPREDUCE-766 URL: https://issues.apache.org/jira/browse/MAPREDUCE-766 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: blacklist3.png, mapreduce-766-1.patch, mapreduce-766-2.patch, mapreduce-766-3.patch, mapreduce-766-4.patch, mapreduce-766-5.patch, mapreduce-766-6.patch Currently, the -list-blacklisted-trackers in the mapred job option list only tracker name. We should enhance it to display as hostname, reason for blacklisting and blacklist report. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-766) Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report.
[ https://issues.apache.org/jira/browse/MAPREDUCE-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-766: - Attachment: mapreduce-766-ydist.patch patch for Yahoo! distribution Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report. - Key: MAPREDUCE-766 URL: https://issues.apache.org/jira/browse/MAPREDUCE-766 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Fix For: 0.21.0 Attachments: blacklist3.png, mapreduce-766-1.patch, mapreduce-766-2.patch, mapreduce-766-3.patch, mapreduce-766-4.patch, mapreduce-766-5.patch, mapreduce-766-6.patch, mapreduce-766-ydist.patch Currently, the -list-blacklisted-trackers in the mapred job option list only tracker name. We should enhance it to display as hostname, reason for blacklisting and blacklist report. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-766) Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report.
[ https://issues.apache.org/jira/browse/MAPREDUCE-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-766: - Attachment: mapreduce-766-6.patch Attaching patch removing the {{System.out.println()}} statement. Running test patch and tests. Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report. - Key: MAPREDUCE-766 URL: https://issues.apache.org/jira/browse/MAPREDUCE-766 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: blacklist3.png, mapreduce-766-1.patch, mapreduce-766-2.patch, mapreduce-766-3.patch, mapreduce-766-4.patch, mapreduce-766-5.patch, mapreduce-766-6.patch Currently, the -list-blacklisted-trackers in the mapred job option list only tracker name. We should enhance it to display as hostname, reason for blacklisting and blacklist report. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-766) Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report.
[ https://issues.apache.org/jira/browse/MAPREDUCE-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-766: - Status: Patch Available (was: Open) Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report. - Key: MAPREDUCE-766 URL: https://issues.apache.org/jira/browse/MAPREDUCE-766 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: blacklist3.png, mapreduce-766-1.patch, mapreduce-766-2.patch, mapreduce-766-3.patch, mapreduce-766-4.patch, mapreduce-766-5.patch, mapreduce-766-6.patch Currently, the -list-blacklisted-trackers in the mapred job option list only tracker name. We should enhance it to display as hostname, reason for blacklisting and blacklist report. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-766) Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report.
[ https://issues.apache.org/jira/browse/MAPREDUCE-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737050#action_12737050 ] Sreekanth Ramakrishnan commented on MAPREDUCE-766: -- output from ant test-patch {noformat} [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. {noformat} Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report. - Key: MAPREDUCE-766 URL: https://issues.apache.org/jira/browse/MAPREDUCE-766 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: blacklist3.png, mapreduce-766-1.patch, mapreduce-766-2.patch, mapreduce-766-3.patch, mapreduce-766-4.patch, mapreduce-766-5.patch, mapreduce-766-6.patch Currently, the -list-blacklisted-trackers in the mapred job option list only tracker name. We should enhance it to display as hostname, reason for blacklisting and blacklist report. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-539) Implement a config validator tool for the capacity scheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan resolved MAPREDUCE-539. -- Resolution: Duplicate Will be fixed as part of MAPREDUCE-768 Implement a config validator tool for the capacity scheduler Key: MAPREDUCE-539 URL: https://issues.apache.org/jira/browse/MAPREDUCE-539 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Attachments: HADOOP-4809-1.patch The capacity scheduler sanity checks configuration when it starts and halts if there are any problems found. For ease of deployment, it would help to have a simple utility that will validate the configuration before the capacity scheduler can be started, and report errors / warnings to the user about (possible) misconfigurations. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-12) Tasks execed by the task controller shouldn't inherit tasktracker groups
[ https://issues.apache.org/jira/browse/MAPREDUCE-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan resolved MAPREDUCE-12. - Resolution: Fixed Fixed as part of HADOOP-5420 Tasks execed by the task controller shouldn't inherit tasktracker groups Key: MAPREDUCE-12 URL: https://issues.apache.org/jira/browse/MAPREDUCE-12 Project: Hadoop Map/Reduce Issue Type: Bug Environment: hadoop 0.20 + patches, Linux Task controller Reporter: Rajiv Chittajallu Assignee: Sreekanth Ramakrishnan Attachments: hadoop-5686-1.patch Mapred task processes seem to inherit the group list from the TaskTracker daemon instead of the task owner. tom 26633 15736 0 21:33 ? 00:00:02 /usr/bin/java ... org.apache.hadoop.mapred.Child 127.0.0.1 51207 .. mapred 15736 1 2 Apr08 ? 03:54:59 /usr/bin/java ... org.apache.hadoop.mapred.TaskTracker hadoop1:~$ id mapred uid=50589(mapred) gid=100(users) groups=100(users),20001(hadoop) hadoop1:~$ fgrep Groups /proc/26633/status Groups: 100 20001 hadoop1:~$ id tom uid=47765(tom) gid=100(users) groups=100(users),10764(ninjas) org.apache.hadoop.mapred.LinuxTaskController should set the user's supplementary group list. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-513) Prior code fix in Capacity Scheduler prevents speculative execution in jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan resolved MAPREDUCE-513. -- Resolution: Fixed The TestQueueCapacities have been fixed in HADOOP-5869 Prior code fix in Capacity Scheduler prevents speculative execution in jobs --- Key: MAPREDUCE-513 URL: https://issues.apache.org/jira/browse/MAPREDUCE-513 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Vivek Ratan Assignee: Sreekanth Ramakrishnan Attachments: 4981.1.patch, 4981.2.patch, HADOOP-4981-1.patch, HADOOP-4981-2.patch, HADOOP-4981-3.patch, HADOOP-4981-4.patch, HADOOP-4981-5-br20.patch, HADOOP-4981-5.patch As part of the code fix for HADOOP-4035, the Capacity Scheduler obtains a task from JobInProgress (calling obtainNewMapTask() or obtainNewReduceTask()) only if the number of pending tasks for a job is greater than zero (see the if-block in TaskSchedulingMgr.getTaskFromJob()). So, if a job has no pending tasks and only has running tasks, it will never be given a slot, and will never have a chance to run a speculative task. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
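The guard described in the issue above can be illustrated with a minimal sketch. Everything here is hypothetical: {{JobCounts}} is a stand-in for the counts a job exposes, not the real {{JobInProgress}}, and the relaxed check is only one plausible way to let a job with running tasks be considered for speculation, not the change that was committed.

```java
class SpeculationGuardSketch {

    /** Hypothetical stand-in for a job's task counts (not the real JobInProgress). */
    static final class JobCounts {
        final int pending;
        final int running;

        JobCounts(int pending, int running) {
            this.pending = pending;
            this.running = running;
        }
    }

    /** Guard as described in the issue: a job is asked for a task only if it has pending tasks. */
    static boolean consideredWithPendingOnlyGuard(JobCounts job) {
        return job.pending > 0;
    }

    /**
     * Assumed relaxation for illustration: a job with running tasks is also
     * asked, so it gets a chance to hand out a speculative task.
     */
    static boolean consideredWithRelaxedGuard(JobCounts job) {
        return job.pending > 0 || job.running > 0;
    }

    public static void main(String[] args) {
        // A job with nothing pending but three tasks still running.
        JobCounts job = new JobCounts(0, 3);
        System.out.println("pending-only guard: " + consideredWithPendingOnlyGuard(job));
        System.out.println("relaxed guard:      " + consideredWithRelaxedGuard(job));
    }
}
```

With the pending-only guard, the job in {{main}} is skipped even though it still has running tasks, which is the situation the issue reports.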
[jira] Updated: (MAPREDUCE-766) Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report.
[ https://issues.apache.org/jira/browse/MAPREDUCE-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-766: - Attachment: mapreduce-766-5.patch Attaching latest patch with changes suggested offline by Hemanth. ClusterStatus.java: Modified documentation, removed a constructor, made the set methods package private. JobTracker.java: Added a check for sb.length() > 0 before replacing the final ','; also renamed an API to getFaultReport and replaced its occurrences, including in the jsp and the test case. Fixed the test case to check that the correct replaced string is being used in the BlacklistInfo instance. Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report. - Key: MAPREDUCE-766 URL: https://issues.apache.org/jira/browse/MAPREDUCE-766 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Sreekanth Ramakrishnan Assignee: Sreekanth Ramakrishnan Attachments: blacklist3.png, mapreduce-766-1.patch, mapreduce-766-2.patch, mapreduce-766-3.patch, mapreduce-766-4.patch, mapreduce-766-5.patch Currently, the -list-blacklisted-trackers in the mapred job option list only tracker name. We should enhance it to display as hostname, reason for blacklisting and blacklist report. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
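A minimal sketch of why a length check is needed before replacing a trailing ',' in a {{StringBuilder}}. The class and method names are hypothetical and the code is not taken from the patch; it only assumes that a comma-separated string is built up and the final separator is then dropped.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class TrailingCommaSketch {

    /**
     * Hypothetical helper: joins the given reasons with ',' and then replaces
     * the trailing separator with nothing.
     */
    static String joinReasons(List<String> reasons) {
        StringBuilder sb = new StringBuilder();
        for (String reason : reasons) {
            sb.append(reason).append(',');
        }
        // Without this guard an empty builder would call replace(-1, 0, "")
        // and throw StringIndexOutOfBoundsException.
        if (sb.length() > 0) {
            sb.replace(sb.length() - 1, sb.length(), "");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinReasons(Arrays.asList("disk full", "too many failures")));
        System.out.println("[" + joinReasons(Collections.<String>emptyList()) + "]");
    }
}
```

The guard only matters in the empty case: when no reasons are appended, there is no trailing separator to replace.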
[jira] Updated: (MAPREDUCE-682) Reserved tasktrackers should be removed when a node is globally blacklisted
[ https://issues.apache.org/jira/browse/MAPREDUCE-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreekanth Ramakrishnan updated MAPREDUCE-682: - Attachment: mapreduce-682-ydist.patch Attaching patch for Yahoo! distribution. Reserved tasktrackers should be removed when a node is globally blacklisted --- Key: MAPREDUCE-682 URL: https://issues.apache.org/jira/browse/MAPREDUCE-682 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.21.0 Reporter: Hemanth Yamijala Assignee: Sreekanth Ramakrishnan Fix For: 0.21.0 Attachments: mapreduce-682-1.patch, mapreduce-682-2.patch, mapreduce-682-ydist.patch When support was added to reserve tasktrackers for high RAM jobs per MAPREDUCE-516, we missed removing reservations on tasktrackers that are globally blacklisted. This is not a major concern, just that the reservation might cause the job to finish a little later than if the reservation is removed when the blacklisting happens. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.