[jira] Created: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
ConcurrentModificationException in JobInProgress

Key: MAPREDUCE-1372
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.20.1
Reporter: Amareshwari Sriramadasu
Priority: Critical
Fix For: 0.21.0

We have seen the following ConcurrentModificationException in one of our clusters:

{noformat}
java.io.IOException: java.util.ConcurrentModificationException
  at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
  at java.util.HashMap$KeyIterator.next(HashMap.java:828)
  at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018)
  at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077)
  at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796)
  at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589)
  at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677)
  at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348)
  at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397)
  at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349)
  at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976)
  at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
{noformat}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1316) JobTracker holds stale references to retired jobs via unreported tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799115#action_12799115 ]

Arun C Murthy commented on MAPREDUCE-1316:

I took a quick look, the patch looks reasonable - I'll spend some more time on it tmrw.

JobTracker holds stale references to retired jobs via unreported tasks

Key: MAPREDUCE-1316
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1316
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Reporter: Amar Kamat
Assignee: Amar Kamat
Priority: Blocker
Attachments: mapreduce-1316-v1.11.patch, mapreduce-1316-v1.7.patch

JobTracker fails to remove _unreported_ tasks' mappings from _taskToTIPMap_ if the job finishes and retires. _Unreported tasks_ refers to tasks that were scheduled but for which the tasktracker did not report back with the task status. In such cases a stale reference is held to TaskInProgress (and thus JobInProgress) long after the job is gone, leading to a memory leak.
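The leak described in MAPREDUCE-1316 can be illustrated with a minimal sketch. This is not the attached patch: the class `TaskRegistry`, its methods, and the use of plain strings in place of TaskInProgress and JobInProgress are all hypothetical. It only shows the general idea that retiring a job should drop every mapping owned by that job, including tasks that never reported back.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for the JobTracker's task bookkeeping; not Hadoop code.
class TaskRegistry {
    // task attempt id -> owning job id (a real map would hold TaskInProgress values)
    private final Map<String, String> taskToJob = new HashMap<String, String>();

    synchronized void schedule(String taskId, String jobId) {
        taskToJob.put(taskId, jobId);
    }

    // Removes every mapping that belongs to the retiring job, whether or not
    // a tracker ever reported a status for the task. Returns the number removed.
    synchronized int retireJob(String jobId) {
        int removed = 0;
        Iterator<Map.Entry<String, String>> it = taskToJob.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue().equals(jobId)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    synchronized int size() {
        return taskToJob.size();
    }
}
```

A linear scan on retirement is the simplest form; a per-job index of scheduled task ids would avoid the scan at the cost of a second structure to keep consistent.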
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799118#action_12799118 ]

Amareshwari Sriramadasu commented on MAPREDUCE-1372:

The code corresponding to the exception:

{code}
2012:Collection<Node> nodesAtMaxLevel = jobtracker.getNodesAtMaxLevel();
2013:
2014:// get the node parent at max level
2015:Node nodeParentAtMaxLevel =
2016:  (node == null) ? null : JobTracker.getParentNode(node, maxLevel - 1);
2017:
2018:for (Node parent : nodesAtMaxLevel) {
2019:
{code}

Additions to nodesAtMaxLevel happen from two methods: JobTracker.addNewTracker() (with the JobTracker lock) and JobInProgress.createCache (with the JobInProgress lock). The solution is to make JobTracker.nodesAtMaxLevel a SynchronizedMap and add synchronization for the iterations. Thoughts?
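The proposal in this comment (a synchronized collection plus explicit synchronization around iteration) can be sketched as follows. This is an illustration under assumptions, not the eventual fix: `NodeRegistry` and its methods are hypothetical, strings stand in for Node objects, and a Set is used because the stack trace shows a HashMap key iterator, which is what a HashSet iterates with. The one point taken from the JDK itself is that a `Collections.synchronizedSet` wrapper protects individual calls only, so iteration must hold the wrapper's monitor.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the JobTracker's node bookkeeping; not Hadoop code.
class NodeRegistry {
    // The wrapper makes single calls such as add() thread-safe.
    private final Set<String> nodesAtMaxLevel =
        Collections.synchronizedSet(new HashSet<String>());

    void addNode(String node) {
        nodesAtMaxLevel.add(node);
    }

    // Iteration spans many calls, so the whole loop must hold the wrapper's
    // monitor; otherwise a concurrent add() can still trigger a
    // ConcurrentModificationException.
    int countNodesWithPrefix(String prefix) {
        int matches = 0;
        synchronized (nodesAtMaxLevel) {
            for (String node : nodesAtMaxLevel) {
                if (node.startsWith(prefix)) {
                    matches++;
                }
            }
        }
        return matches;
    }
}
```

Holding the monitor during the loop adds a third lock to the JobTracker and JobInProgress locks already in play, so the order in which the three are taken has to be the same on every code path.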
[jira] Updated: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-1372:

Priority: Blocker (was: Critical)
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799121#action_12799121 ]

Arun C Murthy commented on MAPREDUCE-1372:

Sigh, this is a *bad* one. Unfortunately, the call hierarchy of JobTracker.addHostToNodeMapping locks JIP in one code path and JT in another, while JobInProgress.findNewMapTask locks both JT and JIP (in that order) ...
[jira] Commented: (MAPREDUCE-1327) Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE
[ https://issues.apache.org/jira/browse/MAPREDUCE-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799123#action_12799123 ]

Hadoop QA commented on MAPREDUCE-1327:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12429970/MAPREDUCE-1327.3.patch against trunk revision 898019.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/376/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/376/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/376/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/376/console

This message is automatically generated.

Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE

Key: MAPREDUCE-1327
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1327
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/sqoop
Affects Versions: 0.22.0
Reporter: Leonid Furman
Fix For: 0.22.0
Attachments: MAPREDUCE-1327.3.patch, MAPREDUCE-1327.patch
Original Estimate: 96h
Remaining Estimate: 96h

When an Oracle table contains the columns TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE, Sqoop fails to map values for those columns to valid Java data types, resulting in the following exception:

ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
  at org.apache.hadoop.sqoop.orm.ClassWriter.generateFields(ClassWriter.java:253)
  at org.apache.hadoop.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:701)
  at org.apache.hadoop.sqoop.orm.ClassWriter.generate(ClassWriter.java:597)
  at org.apache.hadoop.sqoop.Sqoop.generateORM(Sqoop.java:75)
  at org.apache.hadoop.sqoop.Sqoop.importTable(Sqoop.java:87)
  at org.apache.hadoop.sqoop.Sqoop.run(Sqoop.java:175)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
  at org.apache.hadoop.sqoop.Sqoop.main(Sqoop.java:201)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

I have modified the code for Hadoop and Sqoop so this bug is fixed on my machine. Please let me know if you would like me to generate the patch and upload it to this ticket.
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799126#action_12799126 ]

Arun C Murthy commented on MAPREDUCE-1372:

I guess making nodesAtMaxLevel a synchronized Set and then locking nodesAtMaxLevel in findNewTask is an option, since the locking order will be JT, JIP, nodesAtMaxLevel in one path and JIP, nodesAtMaxLevel in the other - but really ugly.

Another option is to make nodesAtMaxLevel a synchronized ArrayList and then use ArrayList.get rather than iterate over it in JIP.findNewTask, i.e.

{noformat}
JIP.findNewTask:

int size = JobTracker.getNumNodesAtMaxLevel();
for (int i = 0; i < size; ++i) {
  Node n = JobTracker.getNthNodeAtMaxLevel(i);
  // ...
  // do something with n
  // ...
}
{noformat}

However, we'll need to ensure JobTracker.nodesAtMaxLevel doesn't contain dups - something which is currently ensured by having nodesAtMaxLevel be a Set.
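The second option in this comment (indexed access instead of an iterator, with duplicates rejected on insert) might look roughly like the sketch below. The class `NodeList` and its methods are hypothetical, strings stand in for Node objects, and the sketch assumes entries are only ever appended and never removed, which is what keeps an index obtained from `size()` valid for a later `get(i)`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical append-only node list; not Hadoop code.
class NodeList {
    private final List<String> nodes = new ArrayList<String>();

    // Check and add happen under one lock, so two threads cannot both insert
    // the same node. Returns true only if the node was actually added.
    synchronized boolean addIfAbsent(String node) {
        if (nodes.contains(node)) {
            return false;
        }
        nodes.add(node);
        return true;
    }

    synchronized int size() {
        return nodes.size();
    }

    synchronized String get(int index) {
        return nodes.get(index);
    }

    // Index-based traversal: no iterator exists, so a concurrent append
    // cannot cause a ConcurrentModificationException. Nodes appended after
    // size() was read are simply not visited in this pass.
    static int countVisited(NodeList list) {
        int visited = 0;
        int size = list.size();
        for (int i = 0; i < size; ++i) {
            if (list.get(i) != null) {
                visited++;
            }
        }
        return visited;
    }
}
```

The `contains` check is linear in the list size. An alternative not raised in the thread is `java.util.concurrent.CopyOnWriteArraySet`, which rejects duplicates and iterates over a snapshot.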
[jira] Commented: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking
[ https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799127#action_12799127 ]

Hadoop QA commented on MAPREDUCE-1342:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12429993/patch-1342-1.txt against trunk revision 898019.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/263/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/263/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/263/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/263/console

Potential JT deadlock in faulty TT tracking

Key: MAPREDUCE-1342
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.20.1
Reporter: Todd Lipcon
Assignee: Amareshwari Sriramadasu
Fix For: 0.21.0
Attachments: cycle0.png, mapreduce-1342-1.patch, mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-ydist.txt, patch-1342.txt

JT$FaultyTrackersInfo.incrementFaults first locks potentiallyFaultyTrackers, and then calls blackListTracker, which calls removeHostCapacity, which locks JT.taskTrackers.

On the other hand, JT.blacklistedTaskTrackers() locks taskTrackers, then calls faultyTrackers.isBlacklisted(), which goes on to lock potentiallyFaultyTrackers.

I haven't produced such a deadlock, but the lock ordering here is inverted and therefore could deadlock. Not sure if this goes back to 0.21 or just in trunk.
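The inversion described in MAPREDUCE-1342 goes away once every path takes the two locks in the same order. The sketch below illustrates that general rule only; it is not the attached patch, and the class `TrackerState`, its fields, and its counters are hypothetical stand-ins for the JobTracker structures named in the issue.

```java
// Hypothetical stand-in for the two JobTracker structures; not Hadoop code.
class TrackerState {
    private final Object taskTrackers = new Object();
    private final Object potentiallyFaultyTrackers = new Object();
    private int faults = 0;
    private int hostCapacity = 100;

    // Path 1: records a fault and reduces capacity. It takes taskTrackers
    // first even though it mainly touches the fault data, so that it agrees
    // with the order used by path 2.
    void incrementFaults() {
        synchronized (taskTrackers) {
            synchronized (potentiallyFaultyTrackers) {
                faults++;
                hostCapacity--;
            }
        }
    }

    // Path 2: reads the fault data while holding taskTrackers.
    // Same order as path 1: taskTrackers, then potentiallyFaultyTrackers.
    int faultCount() {
        synchronized (taskTrackers) {
            synchronized (potentiallyFaultyTrackers) {
                return faults;
            }
        }
    }

    int remainingCapacity() {
        synchronized (taskTrackers) {
            return hostCapacity;
        }
    }
}
```

With a single agreed order, neither thread can hold one lock while waiting for the other in the opposite direction. Guarding both structures with one outer lock achieves the same thing with less concurrency.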
[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking
[ https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:

Attachment: patch-1342-2.txt

Patch with Arun's comments incorporated. Now taskTrackers and potentiallyFaultyTrackers are always locked while holding the JobTracker lock. The newly synchronized methods are called from testcases or from already synchronized methods.
[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking
[ https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:

Status: Patch Available (was: Open)
[jira] Commented: (MAPREDUCE-1302) TrackerDistributedCacheManager can delete file asynchronously
[ https://issues.apache.org/jira/browse/MAPREDUCE-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799137#action_12799137 ]

Amareshwari Sriramadasu commented on MAPREDUCE-1302:

Since Hudson does not run LinuxTaskController tests, can you run TestTrackerDistributedCacheManagerWithLinuxTaskController and make sure it passes?

TrackerDistributedCacheManager can delete file asynchronously

Key: MAPREDUCE-1302
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1302
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: tasktracker
Affects Versions: 0.20.2, 0.21.0, 0.22.0
Reporter: Zheng Shao
Assignee: Zheng Shao
Attachments: MAPREDUCE-1302.0.patch, MAPREDUCE-1302.1.patch, MAPREDUCE-1302.2.patch, MAPREDUCE-1302.3.patch, MAPREDUCE-1302.4.patch, MAPREDUCE-1302.5.patch

With the help of AsyncDiskService from MAPREDUCE-1213, we should be able to delete files from distributed cache asynchronously. That will help make task initialization faster, because task initialization calls the code that localizes files into the cache and may delete some other files. The deletion can slow down the task initialization speed.
[jira] Updated: (MAPREDUCE-1316) JobTracker holds stale references to retired jobs via unreported tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated MAPREDUCE-1316:

Attachment: mapreduce-1316-v1.13-branch20-yahoo.patch

Attaching a patch for Yahoo!'s distribution of Hadoop incorporating review comments from Jothi and Hemanth. test-patch and ant tests (except TestNameNodeMetrics, TestJobHistory, TestKillSubProcesses and TestReduceFetch) passed. All the failed tests passed when I re-ran them.
[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking
[ https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1342:

Attachment: patch-1342-2-ydist.txt

Patch for Yahoo! distribution
[jira] Commented: (MAPREDUCE-1366) Tests should not timeout if TaskTracker/JobTracker crashes in MiniMRCluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799188#action_12799188 ]

Steve Loughran commented on MAPREDUCE-1366:

I would only set the fatalError value if it is currently null, so that the earliest fault gets retained. A setFatalError() method could do this. Also, this may be an opportunity to give MiniMRCluster and MiniDFSCluster a common base class rather than continue to duplicate code.

Tests should not timeout if TaskTracker/JobTracker crashes in MiniMRCluster

Key: MAPREDUCE-1366
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1366
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Reporter: Amareshwari Sriramadasu
Fix For: 0.22.0
Attachments: M1366-0.patch

Currently tests time out if there is any problem bringing up JobTracker or TaskTracker in MiniMRCluster. Instead, tests should fail saying JT/TT crashed. See the test timeout on MAPREDUCE-1365.
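The setFatalError() suggestion in this comment amounts to a first-write-wins setter. A minimal sketch follows, assuming a holder class named `FatalErrorHolder` (hypothetical, as is the getter); only the method name setFatalError() and the field name fatalError come from the comment.

```java
// Hypothetical holder for the first fatal error seen by a mini cluster; not Hadoop code.
class FatalErrorHolder {
    // Stays null until the first failure is recorded.
    private Throwable fatalError;

    // First write wins: later failures are ignored so the earliest fault,
    // which is usually the root cause, is the one a test reports.
    synchronized void setFatalError(Throwable error) {
        if (fatalError == null) {
            fatalError = error;
        }
    }

    synchronized Throwable getFatalError() {
        return fatalError;
    }
}
```

A test harness could poll `getFatalError()` while waiting for the cluster to come up and fail immediately with that cause instead of waiting for a timeout.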
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799247#action_12799247 ]

Koji Noguchi commented on MAPREDUCE-1372:

When we hit this, that task never gets scheduled and the job stays stuck forever.
[jira] Commented: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799271#action_12799271 ]

Vinod K V commented on MAPREDUCE-1218:

bq. I have found that there is a mistake in the previous patch. Previously I made MemoryCalculatorPlugin extends ResourceCalculatorPlugin and LinuxMemoryCalculatorPlugin extends LinuxResourceCalculatorPlugin. But LinuxResourceCalculatorPlugin does not extend MemoryCalculatorPlugin.

Good catch!

Does {{DummyMemoryCalculatorPlugin}} still need to be deprecated? Other than that, +1 for the patch. Once that is done, and after Hudson blesses it, can you ask someone to commit this? Thanks for being patient!

Collecting cpu and memory usage for TaskTrackers

Key: MAPREDUCE-1218
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Affects Versions: 0.22.0
Environment: linux
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.22.0
Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218-v5.patch, MAPREDUCE-1218-v6.patch, MAPREDUCE-1218.patch

The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information.
[jira] Commented: (MAPREDUCE-815) Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799288#action_12799288 ] Doug Cutting commented on MAPREDUCE-815: Aaron, this sounds good. A few questions: - If, in the InputFormat, we populated the key rather than the value, then one would not even need to specify InverseMapper: by default, MapReduce would simply partition and sort Avro data. Making values optional in both input and output seems more consistent, but does break compatibility with TextInputFormat. Thoughts? - In the OutputFormat, should we check if values are non-null or just drop them? Just dropping them may cause some confusion, but is probably useful in many cases, so I guess we err towards utility? Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization -- Key: MAPREDUCE-815 URL: https://issues.apache.org/jira/browse/MAPREDUCE-815 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Ravi Gummadi Assignee: Aaron Kimball MapReduce needs AvroInputFormat similar to other InputFormats like TextInputFormat to be able to use avro serialization in hadoop. Similarly AvroOutputFormat is needed.
[jira] Commented: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799402#action_12799402 ] Scott Chen commented on MAPREDUCE-1218: --- @Dhruba, I have updated the patch. Let me know if this one works. Collecting cpu and memory usage for TaskTrackers Key: MAPREDUCE-1218 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 0.22.0 Environment: linux Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218-v5.patch, MAPREDUCE-1218-v6.1.patch, MAPREDUCE-1218-v6.patch, MAPREDUCE-1218.patch The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information.
[jira] Updated: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1218: -- Attachment: MAPREDUCE-1218-v6.1.patch Collecting cpu and memory usage for TaskTrackers Key: MAPREDUCE-1218 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 0.22.0 Environment: linux Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218-v5.patch, MAPREDUCE-1218-v6.1.patch, MAPREDUCE-1218-v6.patch, MAPREDUCE-1218.patch The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1366) Tests should not timeout if TaskTracker/JobTracker crashes in MiniMRCluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1366: - Status: Open (was: Patch Available) Failures appear related to MAPREDUCE-1275. Will try again. bq. When I tried to test the patch, I realized that the test timeout on MAPREDUCE-1365 is because of MAPREDUCE-1371. *nod* Yes, you're right. I hadn't tested that. The test timeout wasn't my motivation; the spurious failure in MAPREDUCE-64, which would be easier to diagnose, was. bq. I would only set the fatalError value if it is not null, so that the earliest fault gets retained. A setFatalError() method could do this. I don't see what you mean. Each tracker retains its cause of death; it's not shared between them and each tracker should only set this once. Are you suggesting making the error global and retaining only the first fault across all trackers? bq. Also, this may be an opportunity to give the MiniMRCluster and MiniDFSCluster a common base class rather than continue to duplicate code. Refactoring the Mini*Clusters is out of scope for this issue. This is just making the cause of test failures related to HADOOP-4744 clearer. Tests should not timeout if TaskTracker/JobTracker crashes in MiniMRCluster --- Key: MAPREDUCE-1366 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1366 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: M1366-0.patch Currently tests time out if there is any problem bringing up JobTracker or TaskTracker in MiniMRCluster. Instead tests should fail saying JT/TT crashed. See test timeout on MAPREDUCE-1365
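A minimal sketch of the setFatalError() idea quoted above, assuming the intent is that a tracker records a fault only while none has been recorded yet, so the earliest one is retained. The class name FatalErrorHolder and the method bodies are illustrative assumptions; they are not taken from M1366-0.patch or from MiniMRCluster.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical per-tracker holder for the first fatal error seen.
final class FatalErrorHolder {
    private final AtomicReference<Throwable> fatalError =
        new AtomicReference<Throwable>();

    // Stores the fault only if no fault has been stored before.
    // Returns true when this call is the one that stored it.
    boolean setFatalError(Throwable fault) {
        if (fault == null) {
            throw new IllegalArgumentException("fault must not be null");
        }
        return fatalError.compareAndSet(null, fault);
    }

    // Returns the earliest recorded fault, or null if none was recorded.
    Throwable getFatalError() {
        return fatalError.get();
    }
}
```

With one holder per tracker, as the reply describes, each tracker keeps its own cause of death and later faults on the same tracker do not overwrite it.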
[jira] Updated: (MAPREDUCE-1366) Tests should not timeout if TaskTracker/JobTracker crashes in MiniMRCluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1366: - Status: Patch Available (was: Open) Tests should not timeout if TaskTracker/JobTracker crashes in MiniMRCluster --- Key: MAPREDUCE-1366 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1366 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: M1366-0.patch Currently tests timeout if there is any problem bringing up JobTracker or TaskTracker in MiniMRCluster. Instead tests should fail saying JT/TT crashed. See test timeout on MAPREDUCE-1365 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799410#action_12799410 ] Arun C Murthy commented on MAPREDUCE-1372: -- I don't believe findbugs ever found this. ConcurrentModificationException in JobInProgress Key: MAPREDUCE-1372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 We have seen the following ConcurrentModificationException in one of our clusters {noformat} java.io.IOException: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018) at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077) at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348) at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397) at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-815) Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799422#action_12799422 ] Aaron Kimball commented on MAPREDUCE-815: - Doug: * I agree; I'll make this be the key. The value will be the byte offset. * My current implementation gives a log message at level WARN the first time a non-null value is received; it then ignores the value and continues operating. Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization -- Key: MAPREDUCE-815 URL: https://issues.apache.org/jira/browse/MAPREDUCE-815 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Ravi Gummadi Assignee: Aaron Kimball MapReduce needs AvroInputFormat similar to other InputFormats like TextInputFormat to be able to use avro serialization in hadoop. Similarly AvroOutputFormat is needed.
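The warn-once behaviour described in the comment above could look roughly like the following. This is only a sketch under stated assumptions: KeyOnlyWriterSketch is a hypothetical stand-in, not the AvroOutputFormat record writer, and a counter takes the place of the real WARN log call so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical key-only writer: values are dropped, and the first
// non-null value triggers a single warning.
final class KeyOnlyWriterSketch<K, V> {
    private final List<K> written = new ArrayList<K>();
    private boolean warned;
    private int warnings;

    void write(K key, V value) {
        if (value != null && !warned) {
            warned = true;
            warnings++; // a real writer would log at level WARN here
        }
        written.add(key); // the value is ignored either way
    }

    int warningCount() {
        return warnings;
    }

    List<K> writtenKeys() {
        return written;
    }
}
```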
[jira] Commented: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking
[ https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799431#action_12799431 ] Todd Lipcon commented on MAPREDUCE-1342: Now that some more methods have been made synchronized, I'd like to rerun jcarder another time, just to make sure we didn't introduce a new deadlock while fixing this one. I should have a chance to do so in the next day or two. Potential JT deadlock in faulty TT tracking --- Key: MAPREDUCE-1342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Todd Lipcon Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: cycle0.png, mapreduce-1342-1.patch, mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, patch-1342-2.txt, patch-1342-ydist.txt, patch-1342.txt JT$FaultyTrackersInfo.incrementFaults first locks potentiallyFaultyTrackers, and then calls blackListTracker, which calls removeHostCapacity, which locks JT.taskTrackers. On the other hand, JT.blacklistedTaskTrackers() locks taskTrackers, then calls faultyTrackers.isBlacklisted(), which goes on to lock potentiallyFaultyTrackers. I haven't produced such a deadlock, but the lock ordering here is inverted and therefore could deadlock. Not sure if this goes back to 0.21 or is just in trunk.
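The inverted lock ordering described in the issue can be avoided by having every code path take the two locks in the same order. The sketch below illustrates that general technique only; it is not the attached patch (which, per the comment, made more methods synchronized). The field names mirror the description, while the class LockOrderSketch and the method bodies are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Both paths lock taskTrackers first and potentiallyFaultyTrackers second,
// so no thread can hold one lock while waiting for the other in reverse.
final class LockOrderSketch {
    private final Map<String, Integer> taskTrackers = new HashMap<String, Integer>();
    private final Set<String> potentiallyFaultyTrackers = new HashSet<String>();

    void addTracker(String host, int slots) {
        synchronized (taskTrackers) {
            taskTrackers.put(host, slots);
        }
    }

    // Fault path: previously the faulty-tracker lock was taken first.
    void incrementFaults(String host) {
        synchronized (taskTrackers) {
            synchronized (potentiallyFaultyTrackers) {
                potentiallyFaultyTrackers.add(host);
                taskTrackers.remove(host); // stand-in for removeHostCapacity
            }
        }
    }

    // Query path: same order as the fault path.
    int blacklistedTaskTrackers() {
        synchronized (taskTrackers) {
            synchronized (potentiallyFaultyTrackers) {
                return potentiallyFaultyTrackers.size();
            }
        }
    }
}
```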
[jira] Commented: (MAPREDUCE-1367) LocalJobRunner should support parallel mapper execution
[ https://issues.apache.org/jira/browse/MAPREDUCE-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799448#action_12799448 ] Todd Lipcon commented on MAPREDUCE-1367: - MapTaskRunnable: make most of the instance variables final? - it should be clearer that mapOutputFiles is an out param for getMapTaskRunnables, i.e. that it's expected to be initially empty. Alternatively, it looks like you may be able to remove this variable entirely by instead storing the MapOutputFile in the MapRunnable instance, and then iterating directly over the MapRunnables in the reducer. Does that make sense? However, the way you've done it is less invasive to the reduce side, so if you don't see the benefit, feel free to ignore this suggestion. - Does this handle the degenerate case of 0-map jobs? It sounds ridiculous, but I recall previous JIRAs for this situation, since occasionally people have a cron job that periodically processes a given directory. If the directory is empty, it may generate a job with no input splits and thus no tasks. Aside from that, looks good to me. LocalJobRunner should support parallel mapper execution --- Key: MAPREDUCE-1367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1367 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1367.2.patch, MAPREDUCE-1367.3.patch, MAPREDUCE-1367.4.patch, MAPREDUCE-1367.patch The LocalJobRunner currently supports only a single execution thread. Given the prevalence of multi-core CPUs, it makes sense to allow users to run multiple tasks in parallel for improved performance on small (local-only) jobs.
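To illustrate the 0-map concern raised above: if the pool size is derived from the number of tasks, an empty job needs an explicit guard, because Executors.newFixedThreadPool rejects a size of zero. The following is a sketch under assumptions, not the LocalJobRunner code; ParallelMapSketch and runMaps are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelMapSketch {
    // Runs every map task on a bounded pool and returns how many completed.
    static int runMaps(List<? extends Runnable> mapTasks, int maxThreads) {
        if (mapTasks.isEmpty()) {
            return 0; // degenerate 0-map job: nothing to schedule
        }
        int threads = Math.max(1, Math.min(maxThreads, mapTasks.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<Future<?>>();
            for (Runnable task : mapTasks) {
                pending.add(pool.submit(task));
            }
            int completed = 0;
            for (Future<?> f : pending) {
                f.get(); // rethrows a task failure as ExecutionException
                completed++;
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for map tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("map task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```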
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799449#action_12799449 ] Arun C Murthy commented on MAPREDUCE-1372: -- Ok, being awake while debugging synchronization errors is (kinda) useful, so here goes: The primary cause for this bug is that we are resolving nodes (adding them to nodesAtMaxLevel) during job initialization - during which we cannot lock up the JobTracker. A reasonable fix is to queue the resolutions and resolve them under the JobTracker lock in a separate thread... this has the added benefit that we resolve more of the hosts simultaneously, thereby decreasing the number of forks we make via the usual ScriptBasedMapping. This actually needs a bit more work: we'll need the JIP to call 'wait' on itself while the hosts are being resolved, and then the thread doing the resolutions will have to signal the JIP to continue. This is necessary since the JIP needs all nodes to be resolved to build up all its cache tables for scheduling. Sigh. Thoughts? 
ConcurrentModificationException in JobInProgress Key: MAPREDUCE-1372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 We have seen the following ConcurrentModificationException in one of our clusters {noformat} java.io.IOException: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018) at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077) at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348) at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397) at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
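The proposal in the comment above (queue the resolutions, resolve them in a separate thread under the tracker lock, and have the initializing job wait until signalled) might be sketched as follows. Everything here is an assumption made for illustration: ResolutionQueueSketch, Request, trackerLock and the "/default-rack/" placeholder are hypothetical, and the job waits on a per-request monitor rather than on the JobInProgress object itself.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class ResolutionQueueSketch {
    // One queued request: the hosts to resolve plus a completion flag.
    static final class Request {
        final List<String> hosts;
        private boolean done;

        Request(Collection<String> hosts) {
            this.hosts = new ArrayList<String>(hosts);
        }

        synchronized void await() {
            try {
                while (!done) {
                    wait(); // loop guards against spurious wakeups
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }

        synchronized void markDone() {
            done = true;
            notifyAll();
        }
    }

    private final Object trackerLock = new Object(); // stands in for the JobTracker lock
    private final Set<String> nodesAtMaxLevel = new HashSet<String>();
    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<Request>();

    // Called by the job-initialization thread; blocks until resolved.
    void resolveAndWait(Collection<String> hosts) {
        Request request = new Request(hosts);
        queue.add(request);
        request.await();
    }

    // One pass of the resolver thread: batch whatever is queued,
    // resolve it under the lock, then signal the waiting jobs.
    void resolveNextBatch() {
        List<Request> batch = new ArrayList<Request>();
        try {
            batch.add(queue.take());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        queue.drainTo(batch);
        synchronized (trackerLock) {
            for (Request request : batch) {
                for (String host : request.hosts) {
                    nodesAtMaxLevel.add("/default-rack/" + host); // placeholder for the topology lookup
                }
            }
        }
        for (Request request : batch) {
            request.markDone();
        }
    }

    int resolvedCount() {
        synchronized (trackerLock) {
            return nodesAtMaxLevel.size();
        }
    }
}
```

Batching with drainTo reflects the stated side benefit: several jobs' hosts can be resolved in one pass instead of one lookup per job.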
[jira] Updated: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1372: - Attachment: M1372-0.patch It's a hackish fix, but using the [keySet|http://java.sun.com/javase/6/docs/api/java/util/concurrent/ConcurrentHashMap.html#keySet()] in a ConcurrentHashMap allows the collection to be iterated over while another thread is modifying it. ConcurrentModificationException in JobInProgress Key: MAPREDUCE-1372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 Attachments: M1372-0.patch We have seen the following ConcurrentModificationException in one of our clusters {noformat} java.io.IOException: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018) at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077) at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348) at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397) at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
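The difference the comment above relies on can be shown in a few lines. This is a single-threaded sketch, with the hypothetical class WeaklyConsistentIteration, that modifies the map in the middle of an iteration; it trips the same fail-fast check in HashMap that a second thread would, while ConcurrentHashMap's weakly consistent iterator tolerates the change (and may or may not show the new key).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class WeaklyConsistentIteration {
    // Iterates over the key set and adds one new key during the first step.
    // Returns the number of keys visited.
    static int iterateWhileModifying(Map<String, Boolean> nodes) {
        int visited = 0;
        boolean added = false;
        for (String node : nodes.keySet()) {
            visited++;
            if (!added) {
                nodes.put("late-node", Boolean.TRUE);
                added = true;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, Boolean> nodes = new ConcurrentHashMap<String, Boolean>();
        nodes.put("a", Boolean.TRUE);
        nodes.put("b", Boolean.TRUE);
        // Completes without ConcurrentModificationException.
        System.out.println("visited " + iterateWhileModifying(nodes) + " keys");
    }
}
```

Passing a java.util.HashMap with the same two keys to iterateWhileModifying throws ConcurrentModificationException on the second step, which is the failure in the stack trace.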
[jira] Updated: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated MAPREDUCE-1218: Status: Patch Available (was: Open) Collecting cpu and memory usage for TaskTrackers Key: MAPREDUCE-1218 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 0.22.0 Environment: linux Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218-v5.patch, MAPREDUCE-1218-v6.1.patch, MAPREDUCE-1218-v6.patch, MAPREDUCE-1218.patch The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1213) TaskTrackers restart is very slow because it deletes distributed cache directory synchronously
[ https://issues.apache.org/jira/browse/MAPREDUCE-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated MAPREDUCE-1213: -- Attachment: MAPREDUCE-1213.branch-0.20.patch Patch for 0.20. TaskTrackers restart is very slow because it deletes distributed cache directory synchronously -- Key: MAPREDUCE-1213 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1213 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.1 Reporter: dhruba borthakur Assignee: Zheng Shao Fix For: 0.22.0 Attachments: MAPREDUCE-1213.1.patch, MAPREDUCE-1213.2.patch, MAPREDUCE-1213.3.patch, MAPREDUCE-1213.4.patch, MAPREDUCE-1213.branch-0.20.patch We are seeing that when we restart a tasktracker, it tries to recursively delete all the files in the distributed cache. It invokes FileUtil.fullyDelete(), which is very slow. This means that the TaskTracker cannot join the cluster for an extended period of time (up to 2 hours for us). The problem is acute if the distributed cache holds a few thousand files.
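One common way to take a recursive delete off the startup path is to rename the directory out of the way, which is cheap, and delete the renamed copy on a background thread. The sketch below assumes that approach purely for illustration; it is not taken from the attached patches, and AsyncDeleteSketch, deleteInBackground and the ".toBeDeleted." suffix are hypothetical names.

```java
import java.io.File;

final class AsyncDeleteSketch {
    // Frees the original path immediately by renaming the directory,
    // then deletes the renamed tree on a daemon thread.
    static Thread deleteInBackground(File dir) {
        final File doomed = new File(dir.getParentFile(),
            dir.getName() + ".toBeDeleted." + System.nanoTime());
        if (!dir.renameTo(doomed)) {
            throw new IllegalStateException("could not rename " + dir);
        }
        Thread worker = new Thread(new Runnable() {
            public void run() {
                fullyDelete(doomed);
            }
        }, "async-delete");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    // Simple recursive delete; listFiles() returns null for plain files.
    static boolean fullyDelete(File file) {
        File[] children = file.listFiles();
        if (children != null) {
            for (File child : children) {
                fullyDelete(child);
            }
        }
        return file.delete();
    }
}
```

The caller can recreate the original directory right after deleteInBackground returns, so startup does not wait for the old tree to be removed.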
[jira] Updated: (MAPREDUCE-181) Secure job submission
[ https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated MAPREDUCE-181: -- Attachment: 181.20.s.3.patch The patch for the yahoo 0.20 branch (not to be committed) Secure job submission -- Key: MAPREDUCE-181 URL: https://issues.apache.org/jira/browse/MAPREDUCE-181 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Amar Kamat Assignee: Devaraj Das Fix For: 0.22.0 Attachments: 181-1.patch, 181-2.patch, 181-3.patch, 181-3.patch, 181-4.patch, 181-5.1.patch, 181-5.1.patch, 181-6.patch, 181-8.patch, 181.20.s.3.patch, hadoop-3578-branch-20-example-2.patch, hadoop-3578-branch-20-example.patch, HADOOP-3578-v2.6.patch, HADOOP-3578-v2.7.patch, MAPRED-181-v3.32.patch, MAPRED-181-v3.8.patch Currently the jobclient accesses the {{mapred.system.dir}} to add job details. Hence the {{mapred.system.dir}} has the permissions of {{rwx-wx-wx}}. This could be a security loophole where the job files might get overwritten/tampered after the job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-679) XML-based metrics as JSP servlet for JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799463#action_12799463 ] Amir Youssefi commented on MAPREDUCE-679: - What's the level of effort to back-port this to 0.20? Are there other JIRAs that this patch depends on? XML-based metrics as JSP servlet for JobTracker --- Key: MAPREDUCE-679 URL: https://issues.apache.org/jira/browse/MAPREDUCE-679 Project: Hadoop Map/Reduce Issue Type: New Feature Components: jobtracker Reporter: Aaron Kimball Assignee: Aaron Kimball Fix For: 0.21.0 Attachments: example-jobtracker-completed-job.xml, example-jobtracker-running-job.xml, MAPREDUCE-679.2.patch, MAPREDUCE-679.3.patch, MAPREDUCE-679.4.patch, MAPREDUCE-679.5.patch, MAPREDUCE-679.6.patch, MAPREDUCE-679.7.patch, MAPREDUCE-679.patch In HADOOP-4559, a general REST API for reporting metrics was proposed but work seems to have stalled. In the interim, we have a simple XML translation of the existing JobTracker status page which provides the same metrics (including the tables of running/completed/failed jobs) as the human-readable page. This is a relatively lightweight addition to provide some machine-understandable metrics reporting.
[jira] Updated: (MAPREDUCE-1327) Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE
[ https://issues.apache.org/jira/browse/MAPREDUCE-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonid Furman updated MAPREDUCE-1327: - Attachment: MAPREDUCE-1327.4.patch Aaron, The new patch MAPREDUCE-1327.4.patch is available. I modified the unit test OracleManagerTest.java so that it takes the timezone offset into account when comparing the test results. Thanks! Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE --- Key: MAPREDUCE-1327 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1327 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/sqoop Affects Versions: 0.22.0 Reporter: Leonid Furman Fix For: 0.22.0 Attachments: MAPREDUCE-1327.3.patch, MAPREDUCE-1327.4.patch, MAPREDUCE-1327.patch Original Estimate: 96h Remaining Estimate: 96h When an Oracle table contains columns of type TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE, Sqoop fails to map values for those columns to valid Java data types, resulting in the following exception: ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.sqoop.orm.ClassWriter.generateFields(ClassWriter.java:253) at org.apache.hadoop.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:701) at org.apache.hadoop.sqoop.orm.ClassWriter.generate(ClassWriter.java:597) at org.apache.hadoop.sqoop.Sqoop.generateORM(Sqoop.java:75) at org.apache.hadoop.sqoop.Sqoop.importTable(Sqoop.java:87) at org.apache.hadoop.sqoop.Sqoop.run(Sqoop.java:175) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.sqoop.Sqoop.main(Sqoop.java:201) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) I have modified the code for Hadoop and Sqoop so this bug is fixed on my machine. Please let me know if you would like me to generate the patch and upload it to this ticket.
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799477#action_12799477 ] Todd Lipcon commented on MAPREDUCE-1372: Could nodesAtMaxLevel be made a CopyOnWriteArraySet? This is less efficient for mutations, but mutations are rare once the cache has been set up, and it avoids the CME possibility since iterators are safe and fast. ConcurrentModificationException in JobInProgress Key: MAPREDUCE-1372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 Attachments: M1372-0.patch We have seen the following ConcurrentModificationException in one of our clusters {noformat} java.io.IOException: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018) at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077) at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348) at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397) at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
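A short sketch of the CopyOnWriteArraySet suggestion above, using the hypothetical class SnapshotIteration. Each iterator works on the snapshot of the backing array taken when it was created, so additions made during the loop neither throw nor appear in that loop; the cost is a full array copy on every mutation, which matches the comment's point that mutations should be rare.

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

final class SnapshotIteration {
    // Adds one new element per visited element while iterating.
    // Returns the number of elements the iteration actually saw.
    static int iterateWhileModifying(Set<String> nodes) {
        int visited = 0;
        for (String node : nodes) {
            visited++;
            nodes.add("late-" + node); // copies the backing array
        }
        return visited;
    }

    public static void main(String[] args) {
        Set<String> nodes = new CopyOnWriteArraySet<String>();
        nodes.add("a");
        nodes.add("b");
        int visited = iterateWhileModifying(nodes);
        // The loop sees only the two original elements; the set now holds four.
        System.out.println(visited + " visited, " + nodes.size() + " stored");
    }
}
```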
[jira] Updated: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1218: -- Release Note: This patch allows the TaskTracker to report its currently available memory and CPU usage to the JobTracker through the heartbeat. The information can be used for scheduling and monitoring in the JobTracker. Hadoop Flags: [Incompatible change] Collecting cpu and memory usage for TaskTrackers Key: MAPREDUCE-1218 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 0.22.0 Environment: linux Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218-v5.patch, MAPREDUCE-1218-v6.1.patch, MAPREDUCE-1218-v6.patch, MAPREDUCE-1218.patch The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information.
[jira] Commented: (MAPREDUCE-1367) LocalJobRunner should support parallel mapper execution
[ https://issues.apache.org/jira/browse/MAPREDUCE-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799484#action_12799484 ] Hadoop QA commented on MAPREDUCE-1367: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12430043/MAPREDUCE-1367.4.patch against trunk revision 898486. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/378/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/378/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/378/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/378/console This message is automatically generated. LocalJobRunner should support parallel mapper execution --- Key: MAPREDUCE-1367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1367 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1367.2.patch, MAPREDUCE-1367.3.patch, MAPREDUCE-1367.4.patch, MAPREDUCE-1367.patch The LocalJobRunner currently supports only a single execution thread. 
Given the prevalence of multi-core CPUs, it makes sense to allow users to run multiple tasks in parallel for improved performance on small (local-only) jobs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1218: -- Status: Open (was: Patch Available) Collecting cpu and memory usage for TaskTrackers Key: MAPREDUCE-1218 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 0.22.0 Environment: linux Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218-v5.patch, MAPREDUCE-1218-v6.1.patch, MAPREDUCE-1218-v6.2.patch, MAPREDUCE-1218-v6.patch, MAPREDUCE-1218.patch The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1218: -- Attachment: MAPREDUCE-1218-v6.2.patch Changed the version number of InterTrackerProtocol to 30L, because this patch makes the heartbeat incompatible with the old version. Collecting cpu and memory usage for TaskTrackers Key: MAPREDUCE-1218 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 0.22.0 Environment: linux Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218-v5.patch, MAPREDUCE-1218-v6.1.patch, MAPREDUCE-1218-v6.2.patch, MAPREDUCE-1218-v6.patch, MAPREDUCE-1218.patch The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1218) Collecting cpu and memory usage for TaskTrackers
[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-1218: -- Release Note: This patch allows the TaskTracker to report its currently available memory and CPU usage to the JobTracker through the heartbeat. The information can be used for scheduling and monitoring in the JobTracker. This patch changes the version of InterTrackerProtocol. (was: This patch allows the TaskTracker to report its currently available memory and CPU usage to the JobTracker through the heartbeat. The information can be used for scheduling and monitoring in the JobTracker.) Status: Patch Available (was: Open) Collecting cpu and memory usage for TaskTrackers Key: MAPREDUCE-1218 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 0.22.0 Environment: linux Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.22.0 Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218-v5.patch, MAPREDUCE-1218-v6.1.patch, MAPREDUCE-1218-v6.2.patch, MAPREDUCE-1218-v6.patch, MAPREDUCE-1218.patch The information can be used for resource aware scheduling. Note that this is related to MAPREDUCE-220. There the per task resource information is collected. This one collects the per machine information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1302) TrackerDistributedCacheManager can delete file asynchronously
[ https://issues.apache.org/jira/browse/MAPREDUCE-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated MAPREDUCE-1302: -- Attachment: MAPREDUCE-1302.branch-0.20.on.top.of.MAPREDUCE-1213.patch This patch is for branch-0.20 (on top of MAPREDUCE-1213.branch-0.20.patch) TrackerDistributedCacheManager can delete file asynchronously - Key: MAPREDUCE-1302 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1302 Project: Hadoop Map/Reduce Issue Type: Improvement Components: tasktracker Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Zheng Shao Assignee: Zheng Shao Fix For: 0.22.0 Attachments: MAPREDUCE-1302.0.patch, MAPREDUCE-1302.1.patch, MAPREDUCE-1302.2.patch, MAPREDUCE-1302.3.patch, MAPREDUCE-1302.4.patch, MAPREDUCE-1302.5.patch, MAPREDUCE-1302.branch-0.20.on.top.of.MAPREDUCE-1213.patch With the help of AsyncDiskService from MAPREDUCE-1213, we should be able to delete files from distributed cache asynchronously. That will help make task initialization faster, because task initialization calls the code that localizes files into the cache and may delete some other files. The deletion can slow down the task initialization speed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
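The idea described in this issue can be sketched roughly as follows: rename the stale cache entry out of the way (a cheap operation) and leave the recursive delete to a background thread, so the caller returns quickly. This is an illustration only and is not the AsyncDiskService API from MAPREDUCE-1213; the class name `AsyncCacheDeleter`, the `toBeDeleted` directory name, and the single-thread executor are all assumptions made for the sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical helper: not part of Hadoop.
final class AsyncCacheDeleter {
  private final Path trashDir;
  private final ExecutorService pool = Executors.newSingleThreadExecutor();

  AsyncCacheDeleter(Path cacheRoot) throws IOException {
    // Assumed layout: a "toBeDeleted" directory on the same volume as the cache.
    this.trashDir = Files.createDirectories(cacheRoot.resolve("toBeDeleted"));
  }

  /** Renames the entry right away, then deletes it on the background thread. */
  Future<?> deleteAsync(Path entry) throws IOException {
    Path doomed = trashDir.resolve(entry.getFileName() + "." + System.nanoTime());
    Files.move(entry, doomed, StandardCopyOption.ATOMIC_MOVE);
    return pool.submit(() -> deleteRecursively(doomed));
  }

  private static void deleteRecursively(Path root) {
    try (Stream<Path> walk = Files.walk(root)) {
      // Reverse order visits children before their parent directories.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
    } catch (IOException e) {
      // Sketch only: real code would log the failure and retry later.
    }
  }

  void shutdown() {
    pool.shutdown();
  }
}
```

The caller only pays for the rename; waiting on the returned `Future` is optional and mainly useful in tests.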
[jira] Commented: (MAPREDUCE-1366) Tests should not timeout if TaskTracker/JobTracker crashes in MiniMRCluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799499#action_12799499 ] Hadoop QA commented on MAPREDUCE-1366: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12429994/M1366-0.patch against trunk revision 898486. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/267/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/267/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/267/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/267/console This message is automatically generated. Tests should not timeout if TaskTracker/JobTracker crashes in MiniMRCluster --- Key: MAPREDUCE-1366 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1366 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: M1366-0.patch Currently tests timeout if there is any problem bringing up JobTracker or TaskTracker in MiniMRCluster. Instead tests should fail saying JT/TT crashed. 
See test timeout on MAPREDUCE-1365 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799507#action_12799507 ] Chris Douglas commented on MAPREDUCE-1372: -- bq. Could nodesAtMaxLevel be made a CopyOnWriteArraySet? The cache includes hosts for remote clusters, as well; it may be fairly large and not necessarily stable. The O(n) cost of adding entries is also not encouraging, particularly since it creates a copy of the array as it's searching for a match on every insertion. Why there isn't an equivalent ConcurrentHashSet in the platform- as with HashSet/HashMap- is a mystery to me. Arun's suggestion sounds like the correct fix; decreasing the number of forks at the JobTracker is always a good idea. Verifying its correctness may be harder than changing the data structure, though. ConcurrentModificationException in JobInProgress Key: MAPREDUCE-1372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 Attachments: M1372-0.patch We have seen the following ConcurrentModificationException in one of our clusters {noformat} java.io.IOException: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018) at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077) at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677) at
org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348) at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397) at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799509#action_12799509 ] Todd Lipcon commented on MAPREDUCE-1372: Maybe I misunderstood Arun's idea, but it sounded to me like it added a lot of complexity. This JIRA indicates this bug exists in 0.20.1 - do we anticipate fixing it for branch-20, or only for 21? If for branch-20, I think the COWArraySet is safer, no? bq. Why there isn't an equivalent ConcurrentHashSet in the platform- as with HashSet/HashMap- is a mystery to me. What about ConcurrentSkipListSet? It's logarithmic and has weakly consistent iteration. ConcurrentModificationException in JobInProgress Key: MAPREDUCE-1372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 Attachments: M1372-0.patch We have seen the following ConcurrentModificationException in one of our clusters {noformat} java.io.IOException: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018) at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077) at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348) at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397) at 
org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
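The trade-off discussed in these comments can be illustrated with a small, self-contained sketch. It modifies a set while iterating over it: a plain HashSet fails fast with ConcurrentModificationException, whereas CopyOnWriteArraySet (snapshot iteration), ConcurrentSkipListSet (weakly consistent iteration) and a set view over ConcurrentHashMap built with `Collections.newSetFromMap` all tolerate the change. The class `ConcurrentSetDemo` and the host names are hypothetical, and nothing here is taken from the attached patch.

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical demo class: not part of Hadoop.
final class ConcurrentSetDemo {

  /** A hash-based concurrent set, built from ConcurrentHashMap (available since Java 6). */
  static Set<String> concurrentHashSet() {
    return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
  }

  /** Adds an element while iterating; returns true if the iteration survived. */
  static boolean survivesModification(Set<String> hosts) {
    hosts.add("host-a");
    hosts.add("host-b");
    try {
      for (String h : hosts) {
        // Structural change during iteration, as another thread might make.
        hosts.add("late-joiner");
      }
      return true;
    } catch (ConcurrentModificationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("HashSet: " + survivesModification(new HashSet<String>()));
    System.out.println("CopyOnWriteArraySet: " + survivesModification(new CopyOnWriteArraySet<String>()));
    System.out.println("ConcurrentSkipListSet: " + survivesModification(new ConcurrentSkipListSet<String>()));
    System.out.println("ConcurrentHashMap-backed set: " + survivesModification(concurrentHashSet()));
  }
}
```

Surviving iteration is only part of the picture: the copy-on-write set still copies its array on every successful add, and the skip-list set requires comparable elements, so the choice depends on how often the set is written and how large it gets.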
[jira] Updated: (MAPREDUCE-1338) need security keys storage solution
[ https://issues.apache.org/jira/browse/MAPREDUCE-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Shkolnik updated MAPREDUCE-1338: -- Attachment: MAPREDUCE-1338-4.patch addressed comments by Devaraj. need security keys storage solution --- Key: MAPREDUCE-1338 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1338 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Boris Shkolnik Assignee: Boris Shkolnik Attachments: HADOOP-6325.patch, MAPREDUCE-1338-2.patch, MAPREDUCE-1338-4.patch, MAPREDUCE-1338.patch set, get, store, load security keys key alias - byte[] key value - byte[] store/load from DataInput/Output stream -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
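A minimal sketch of the interface outlined in the description (set, get, store, load; byte[] aliases and byte[] key values; serialization through DataInput/DataOutput) might look like the following. The class name `KeyStoreSketch`, the length-prefixed wire format, and the use of ByteBuffer as the map key are assumptions for illustration and are not taken from the attached patches.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class: illustrates the described operations only.
final class KeyStoreSketch {
  // byte[] has identity-based equals/hashCode, so aliases are wrapped in ByteBuffer,
  // which compares by content.
  private final Map<ByteBuffer, byte[]> keys = new LinkedHashMap<ByteBuffer, byte[]>();

  void set(byte[] alias, byte[] key) {
    keys.put(ByteBuffer.wrap(alias.clone()), key.clone());
  }

  byte[] get(byte[] alias) {
    byte[] key = keys.get(ByteBuffer.wrap(alias));
    return key == null ? null : key.clone();
  }

  /** Writes the entry count, then each alias and key as length-prefixed bytes. */
  void store(DataOutput out) throws IOException {
    out.writeInt(keys.size());
    for (Map.Entry<ByteBuffer, byte[]> e : keys.entrySet()) {
      byte[] alias = e.getKey().array();
      out.writeInt(alias.length);
      out.write(alias);
      out.writeInt(e.getValue().length);
      out.write(e.getValue());
    }
  }

  /** Replaces the current contents with the entries read from the stream. */
  void load(DataInput in) throws IOException {
    keys.clear();
    int count = in.readInt();
    for (int i = 0; i < count; i++) {
      byte[] alias = new byte[in.readInt()];
      in.readFully(alias);
      byte[] key = new byte[in.readInt()];
      in.readFully(key);
      keys.put(ByteBuffer.wrap(alias), key);
    }
  }
}
```

Defensive copies on `set` and `get` keep callers from mutating stored key material by accident; a real implementation would also need to consider how the serialized bytes are protected.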
[jira] Updated: (MAPREDUCE-1367) LocalJobRunner should support parallel mapper execution
[ https://issues.apache.org/jira/browse/MAPREDUCE-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-1367: - Status: Open (was: Patch Available) LocalJobRunner should support parallel mapper execution --- Key: MAPREDUCE-1367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1367 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1367.2.patch, MAPREDUCE-1367.3.patch, MAPREDUCE-1367.4.patch, MAPREDUCE-1367.5.patch, MAPREDUCE-1367.patch The LocalJobRunner currently supports only a single execution thread. Given the prevalence of multi-core CPUs, it makes sense to allow users to run multiple tasks in parallel for improved performance on small (local-only) jobs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1367) LocalJobRunner should support parallel mapper execution
[ https://issues.apache.org/jira/browse/MAPREDUCE-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-1367: - Status: Patch Available (was: Open) LocalJobRunner should support parallel mapper execution --- Key: MAPREDUCE-1367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1367 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1367.2.patch, MAPREDUCE-1367.3.patch, MAPREDUCE-1367.4.patch, MAPREDUCE-1367.5.patch, MAPREDUCE-1367.patch The LocalJobRunner currently supports only a single execution thread. Given the prevalence of multi-core CPUs, it makes sense to allow users to run multiple tasks in parallel for improved performance on small (local-only) jobs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1367) LocalJobRunner should support parallel mapper execution
[ https://issues.apache.org/jira/browse/MAPREDUCE-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Kimball updated MAPREDUCE-1367: - Attachment: MAPREDUCE-1367.5.patch Todd, good catch: it didn't handle the 0-mapper case. It does now. Testcase included. LocalJobRunner should support parallel mapper execution --- Key: MAPREDUCE-1367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1367 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Aaron Kimball Assignee: Aaron Kimball Attachments: MAPREDUCE-1367.2.patch, MAPREDUCE-1367.3.patch, MAPREDUCE-1367.4.patch, MAPREDUCE-1367.5.patch, MAPREDUCE-1367.patch The LocalJobRunner currently supports only a single execution thread. Given the prevalence of multi-core CPUs, it makes sense to allow users to run multiple tasks in parallel for improved performance on small (local-only) jobs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
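As a rough illustration of running map tasks in parallel while handling a job with zero mappers, the sketch below uses a fixed thread pool. One way a 0-mapper corner case can bite is that `Executors.newFixedThreadPool(0)` throws IllegalArgumentException, so the sketch returns early for an empty task list. The class `ParallelMapSketch`, the method `runMaps`, and the `maxThreads` parameter are hypothetical; this is not the code from the attached patches, and the issue does not say what the actual 0-mapper failure was.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: not the LocalJobRunner implementation.
final class ParallelMapSketch {

  /** Runs all map tasks on up to maxThreads threads and returns results in task order. */
  static <T> List<T> runMaps(List<Callable<T>> mapTasks, int maxThreads) throws Exception {
    List<T> results = new ArrayList<T>();
    if (mapTasks.isEmpty()) {
      return results; // nothing to run; avoids creating a zero-sized pool
    }
    int threads = Math.max(1, Math.min(maxThreads, mapTasks.size()));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // invokeAll blocks until every task has completed.
      for (Future<T> f : pool.invokeAll(mapTasks)) {
        results.add(f.get()); // a failed task surfaces as ExecutionException
      }
    } finally {
      pool.shutdown();
    }
    return results;
  }
}
```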
[jira] Commented: (MAPREDUCE-1327) Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE
[ https://issues.apache.org/jira/browse/MAPREDUCE-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799534#action_12799534 ] Aaron Kimball commented on MAPREDUCE-1327: -- Thanks for the update. This patch looks good. All sqoop tests pass on my machine, so +1 from me, pending a +1 from Hudson. Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE --- Key: MAPREDUCE-1327 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1327 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/sqoop Affects Versions: 0.22.0 Reporter: Leonid Furman Fix For: 0.22.0 Attachments: MAPREDUCE-1327.3.patch, MAPREDUCE-1327.4.patch, MAPREDUCE-1327.5.patch, MAPREDUCE-1327.patch Original Estimate: 96h Remaining Estimate: 96h When Oracle table contains the columns TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE, Sqoop fails to map values for those columns to valid Java data types, resulting in the following exception: ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.sqoop.orm.ClassWriter.generateFields(ClassWriter.java:253) at org.apache.hadoop.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:701) at org.apache.hadoop.sqoop.orm.ClassWriter.generate(ClassWriter.java:597) at org.apache.hadoop.sqoop.Sqoop.generateORM(Sqoop.java:75) at org.apache.hadoop.sqoop.Sqoop.importTable(Sqoop.java:87) at org.apache.hadoop.sqoop.Sqoop.run(Sqoop.java:175) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.sqoop.Sqoop.main(Sqoop.java:201) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) I have modified the code for Hadoop and Sqoop so this bug is fixed on my machine. 
Please let me know if you would like me to generate the patch and upload it to this ticket. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1327) Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE
[ https://issues.apache.org/jira/browse/MAPREDUCE-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799539#action_12799539 ] Leonid Furman commented on MAPREDUCE-1327: -- Thanks Aaron! It's been a great experience! I also wanted to ask if this patch will be applied to Cloudera source repository any time soon. The reason I am asking is because my HDFS cluster is running on the current Hadoop release version - 0.20.0, which doesn't support Oracle. Therefore, if I build hadoop from trunk and run Sqoop, it will not work. But Cloudera's latest release supports Oracle, and when this patch MAPREDUCE-1327 is applied to Cloudera, installed on both the namenode and HDFS cluster, Sqoop should work as expected. Thank you in advance. Oracle database import via sqoop fails when a table contains the column types such as TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE --- Key: MAPREDUCE-1327 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1327 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/sqoop Affects Versions: 0.22.0 Reporter: Leonid Furman Fix For: 0.22.0 Attachments: MAPREDUCE-1327.3.patch, MAPREDUCE-1327.4.patch, MAPREDUCE-1327.5.patch, MAPREDUCE-1327.patch Original Estimate: 96h Remaining Estimate: 96h When Oracle table contains the columns TIMESTAMP(6) WITH LOCAL TIME ZONE and TIMESTAMP(6) WITH TIME ZONE, Sqoop fails to map values for those columns to valid Java data types, resulting in the following exception: ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.sqoop.orm.ClassWriter.generateFields(ClassWriter.java:253) at org.apache.hadoop.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:701) at org.apache.hadoop.sqoop.orm.ClassWriter.generate(ClassWriter.java:597) at org.apache.hadoop.sqoop.Sqoop.generateORM(Sqoop.java:75) at org.apache.hadoop.sqoop.Sqoop.importTable(Sqoop.java:87) at 
org.apache.hadoop.sqoop.Sqoop.run(Sqoop.java:175) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.sqoop.Sqoop.main(Sqoop.java:201) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) I have modified the code for Hadoop and Sqoop so this bug is fixed on my machine. Please let me know if you would like me to generate the patch and upload it to this ticket. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1342) Potential JT deadlock in faulty TT tracking
[ https://issues.apache.org/jira/browse/MAPREDUCE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated MAPREDUCE-1342: --- Attachment: patch-1342-2.txt Attaching the patch again, as Hudson picked up the wrong patch. Potential JT deadlock in faulty TT tracking --- Key: MAPREDUCE-1342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1342 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Todd Lipcon Assignee: Amareshwari Sriramadasu Fix For: 0.21.0 Attachments: cycle0.png, mapreduce-1342-1.patch, mapreduce-1342-2.patch, patch-1342-1.txt, patch-1342-2-ydist.txt, patch-1342-2.txt, patch-1342-2.txt, patch-1342-ydist.txt, patch-1342.txt JT$FaultyTrackersInfo.incrementFaults first locks potentiallyFaultyTrackers, and then calls blackListTracker, which calls removeHostCapacity, which locks JT.taskTrackers. On the other hand, JT.blacklistedTaskTrackers() locks taskTrackers, then calls faultyTrackers.isBlacklisted(), which goes on to lock potentiallyFaultyTrackers. I haven't produced such a deadlock, but the lock ordering here is inverted and therefore could deadlock. Not sure if this goes back to 0.21 or is just in trunk. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
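The lock-order inversion described above goes away if every code path acquires the two monitors in the same order. The sketch below reuses the names taskTrackers and potentiallyFaultyTrackers from the description purely as lock objects; the class `LockOrderSketch`, its counters, and the blacklisting threshold are hypothetical, and the attached patches may resolve the problem differently.

```java
// Hypothetical class: shows a consistent lock order, not the JobTracker code.
final class LockOrderSketch {
  private final Object taskTrackers = new Object();
  private final Object potentiallyFaultyTrackers = new Object();
  private int faults;
  private int blacklisted;

  // The risky pattern would be: this method locks potentiallyFaultyTrackers and then
  // taskTrackers, while blacklistedTaskTrackers() locks them in the opposite order.
  // Here both paths take taskTrackers first, then potentiallyFaultyTrackers.
  void incrementFaults() {
    synchronized (taskTrackers) {
      synchronized (potentiallyFaultyTrackers) {
        faults++;
        if (faults % 4 == 0) { // assumed threshold, for illustration only
          blacklisted++;
        }
      }
    }
  }

  int blacklistedTaskTrackers() {
    synchronized (taskTrackers) {
      synchronized (potentiallyFaultyTrackers) {
        return blacklisted;
      }
    }
  }
}
```

With a single global order there is no cycle in the lock graph, so two threads running these methods concurrently cannot each hold one lock while waiting for the other.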
[jira] Updated: (MAPREDUCE-1214) Add support for counters in Hadoop Local Mode
[ https://issues.apache.org/jira/browse/MAPREDUCE-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Modi updated MAPREDUCE-1214: -- Attachment: MAPREDUCE-1214.2.patch Hi Jeff, here is another patch based on the suggestions in your patch. It does not change any APIs. The only pending issue is the number of bytes written: the value reported is very high. Add support for counters in Hadoop Local Mode - Key: MAPREDUCE-1214 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1214 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Ankit Modi Attachments: MAPREDUCE-1214.2.patch, MAPREDUCE-1214.patch Currently there is no support for counters (Records and Bytes written) in Hadoop Local Mode. Pig needs to provide counters to the user when running in Hadoop Local Mode. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1372) ConcurrentModificationException in JobInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799612#action_12799612 ] Amareshwari Sriramadasu commented on MAPREDUCE-1372: bq. This actually needs a bit more work, we'll need the JIP to call 'wait' on itself while the hosts are being resolved and then the thread doing the resolutions will have to signal the JIP to continue. Node resolution through heartbeat (JT.addNewTracker) would also need to wait for the thread, right? Earlier, node resolution was done by a thread, but that was removed by HADOOP-3780 and HADOOP-3620. ConcurrentModificationException in JobInProgress Key: MAPREDUCE-1372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.1 Reporter: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 Attachments: M1372-0.patch We have seen the following ConcurrentModificationException in one of our clusters {noformat} java.io.IOException: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$KeyIterator.next(HashMap.java:828) at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018) at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077) at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348) at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397) at
org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
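The wait-and-signal coordination mentioned in the quoted comment can be sketched with Java's built-in monitor methods. For clarity the sketch uses a separate gate object rather than having the job object wait on itself, which is a deliberate simplification; the class `ResolutionGate` and its method names are hypothetical.

```java
// Hypothetical class: a one-shot gate that waiters block on until resolution finishes.
final class ResolutionGate {
  private boolean resolved;

  /** Blocks the caller until markResolved() has been called. */
  synchronized void awaitResolved() throws InterruptedException {
    while (!resolved) { // loop guards against spurious wakeups
      wait();
    }
  }

  /** Called by the resolving thread once the hosts have been resolved. */
  synchronized void markResolved() {
    resolved = true;
    notifyAll();
  }

  synchronized boolean isResolved() {
    return resolved;
  }
}
```

Any path that reads the resolved data, including a heartbeat-driven one, would have to pass through the same gate, which is the concern raised in the comment above.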
[jira] Assigned: (MAPREDUCE-1374) Reduce memory footprint of FileSplit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao reassigned MAPREDUCE-1374: - Assignee: Zheng Shao Reduce memory footprint of FileSplit Key: MAPREDUCE-1374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1374 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.20.1, 0.21.0, 0.22.0 Reporter: Zheng Shao Assignee: Zheng Shao We can have many FileInput objects in memory, depending on the number of mappers. It will save tons of memory on JobTracker and JobClient if we intern those Strings for host names.
{code}
FileInputFormat.java:
      for (NodeInfo host: hostList) {
        // Strip out the port number from the host name
-        retVal[index++] = host.node.getName().split(":")[0];
+        retVal[index++] = host.node.getName().split(":")[0].intern();
        if (index == replicationFactor) {
          done = true;
          break;
        }
      }
{code}
More on String.intern(): http://www.javaworld.com/javaworld/javaqa/2003-12/01-qa-1212-intern.html Changing the class of {{file}} from {{Path}} to {{String}} will also save a lot of memory. {{Path}} contains a {{java.net.URI}} which internally contains ~10 String fields. This will also be a huge saving.
{code}
private Path file;
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
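To illustrate why interning helps here: each call to `split` produces fresh String objects, so thousands of splits that name the same host would otherwise each hold their own copy of the host name. `String.intern()` returns one canonical instance per distinct value. The class `InternSketch` and the method `stripPort` are hypothetical names for this sketch, and the host names are made up.

```java
// Hypothetical demo class: shows the effect of String.intern() on host names.
final class InternSketch {

  /** Strips the port from "host:port" and returns the canonical (interned) host name. */
  static String stripPort(String nodeName) {
    return nodeName.split(":")[0].intern();
  }

  public static void main(String[] args) {
    String a = stripPort("datanode17.example.com:50010");
    String b = stripPort("datanode17.example.com:50011");
    // Both calls yield the same object, so only one copy of the name is retained.
    System.out.println(a == b);
  }
}
```

The saving scales with the number of splits that share a host, which is why it matters for jobs with very many mappers.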
[jira] Commented: (MAPREDUCE-1374) Reduce memory footprint of FileSplit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799632#action_12799632 ] Zheng Shao commented on MAPREDUCE-1374: --- This experiment was done on hadoop-0.20. It shows JobClient memory usage when submitting a map-reduce job with around 200K mappers.
jmap before using this patch (OOM before getting to the same stage as the second example):
{code}
 num     #instances         #bytes  class name
----------------------------------------------
   1:        188870       18107344  [C
   2:        242616        9704640  java.lang.String
   3:         42850        6543408  constMethodKlass
   4:         73218        5271696  org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit
   5:         42850        5151504  methodKlass
   6:          3570        4693192  constantPoolKlass
   7:         72077        3647360  symbolKlass
   8:         73307        3518736  org.apache.hadoop.mapred.FileSplit
   9:         75424        3075008  [Ljava.lang.String;
  10:          3570        2818968  instanceKlassKlass
  11:          2741        2524096  constantPoolCacheKlass
 ...
  14:         10069        1449936  java.net.URI
 ...
  23:         10065         241560  org.apache.hadoop.fs.Path
{code}
jmap after this patch:
{code}
 num     #instances         #bytes  class name
----------------------------------------------
   1:        199014       14329008  org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit
   2:        201801        9818856  [Ljava.lang.String;
   3:        199684        9584832  org.apache.hadoop.mapred.FileSplit
   4:         56594        8211632  [C
   5:         42851        6543872  constMethodKlass
   6:         42851        5151624  methodKlass
   7:          3570        4693616  constantPoolKlass
   8:         72091        3648368  symbolKlass
   9:          3570        2818968  instanceKlassKlass
  10:          2517        2675256  [Ljava.lang.Object;
  11:          4763        2531104  [I
  12:          2741        2524320  constantPoolCacheKlass
  13:         62275        2491000  java.lang.String
 ...
  31:           456          65664  java.net.URI
 ...
  69:           452          10848  org.apache.hadoop.fs.Path
{code}
String:FileSplit ratio:
before this patch: 3.3 : 1
after this patch: 0.3 : 1
We reduced the number of String objects per FileSplit by about 10 times!
Reduce memory footprint of FileSplit Key: MAPREDUCE-1374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1374 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.20.1, 0.21.0, 0.22.0 Reporter: Zheng Shao Assignee: Zheng Shao We can have many FileInput objects in memory, depending on the number of mappers. It will save tons of memory on JobTracker and JobClient if we intern those Strings for host names.
{code}
FileInputFormat.java:
      for (NodeInfo host: hostList) {
        // Strip out the port number from the host name
-        retVal[index++] = host.node.getName().split(":")[0];
+        retVal[index++] = host.node.getName().split(":")[0].intern();
        if (index == replicationFactor) {
          done = true;
          break;
        }
      }
{code}
More on String.intern(): http://www.javaworld.com/javaworld/javaqa/2003-12/01-qa-1212-intern.html Changing the class of {{file}} from {{Path}} to {{String}} will also save a lot of memory. {{Path}} contains a {{java.net.URI}} which internally contains ~10 String fields. This will also be a huge saving.
{code}
private Path file;
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1374) Reduce memory footprint of FileSplit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated MAPREDUCE-1374: -- Attachment: MAPREDUCE-1374.1.patch I am not sure whether I should create a new String[] in the constructor and then change the elements. Since file is private, this should be compatible with any other derived classes. Reduce memory footprint of FileSplit Key: MAPREDUCE-1374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1374 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.20.1, 0.21.0, 0.22.0 Reporter: Zheng Shao Assignee: Zheng Shao Attachments: MAPREDUCE-1374.1.patch
[jira] Updated: (MAPREDUCE-1374) Reduce memory footprint of FileSplit
[ https://issues.apache.org/jira/browse/MAPREDUCE-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated MAPREDUCE-1374: -- Fix Version/s: 0.22.0 0.21.0 Status: Patch Available (was: Open) Reduce memory footprint of FileSplit Key: MAPREDUCE-1374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1374 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 0.20.1, 0.21.0, 0.22.0 Reporter: Zheng Shao Assignee: Zheng Shao Fix For: 0.21.0, 0.22.0 Attachments: MAPREDUCE-1374.1.patch, MAPREDUCE-1374.2.patch