[jira] [Commented] (MAPREDUCE-5848) MapReduce counts forcibly preempted containers as FAILED
[ https://issues.apache.org/jira/browse/MAPREDUCE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975008#comment-13975008 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-5848:

We should also expose a task-attempt-diagnostic which says the container got preempted, that will help a bit.

bq. When a task attempt receives SIGTERM from the NM it causes the FileSystem to close via the shutdown hook and often causes exceptions within the task.

I haven't seen this in YarnChild. Is FileSystem itself installing this?

> MapReduce counts forcibly preempted containers as FAILED
>
> Key: MAPREDUCE-5848
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5848
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.1.0-beta
> Reporter: Carlo Curino
> Assignee: Subramaniam Krishnan
> Attachments: YARN-1958.patch
>
> The MapReduce AM is considering a forcibly preempted container as FAILED, while I think it should be considered as KILLED (i.e., not count against the maximum number of failures).

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5848) MapReduce counts forcibly preempted containers as FAILED
[ https://issues.apache.org/jira/browse/MAPREDUCE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974973#comment-13974973 ]

Carlo Curino commented on MAPREDUCE-5848:

Jason, thanks for moving this to MR; I mislabeled it.

Regarding your comment, I was suspecting that something like that was going on (hence my doubting that the patch was enough). On the positive side, the AM should know that the container was on the short-list to be killed from the preemption messages it previously received, so maybe it could count a failure of a container "doomed" by preemption as a kill? Or simply postpone the decision on FAIL/KILL. Not sure...

I don't have time to look at this now, but if you provide a fix I am happy to help you validate it (for different reasons we are testing preemption under rather extreme scenarios).
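The idea in the comment above, counting a failure of a container that was already on the preemption short-list as a kill rather than a failure, can be sketched roughly as follows. This is only an illustration in Python, not the MR AM's code: `Outcome`, `classify_attempt`, `failures_counted`, and the `doomed` set are hypothetical names, and it assumes the AM keeps the ids of containers named in earlier preemption messages.

```python
from enum import Enum


class Outcome(Enum):
    FAILED = "FAILED"  # counts against the maximum number of failures
    KILLED = "KILLED"  # does not count against it


def classify_attempt(container_id, exited_cleanly, doomed):
    """Classify a finished container (hypothetical helper).

    doomed: set of container ids that earlier preemption messages
    put on the short-list to be killed (an assumption of this sketch).
    Returns None for a clean exit, otherwise an Outcome.
    """
    if exited_cleanly:
        return None
    if container_id in doomed:
        # The container was expected to be preempted, so treat its
        # failure as a kill.
        return Outcome.KILLED
    return Outcome.FAILED


def failures_counted(events, doomed):
    """Count only the outcomes that should hit the failure limit."""
    return sum(
        1
        for container_id, exited_cleanly in events
        if classify_attempt(container_id, exited_cleanly, doomed) is Outcome.FAILED
    )
```

With `doomed = {"c1"}`, an unclean exit of `c1` is classified as KILLED while an unclean exit of any other container still counts as FAILED. Postponing the FAIL/KILL decision, the other option raised in the comment, is not modeled here.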
[jira] [Commented] (MAPREDUCE-4937) MR AM handles an oversized split metainfo file poorly
[ https://issues.apache.org/jira/browse/MAPREDUCE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974884#comment-13974884 ]

Hudson commented on MAPREDUCE-4937:

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1762 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1762/])
MAPREDUCE-4937. MR AM handles an oversized split metainfo file poorly. Contributed by Eric Payne (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1588559)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/event/JobEventType.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java

> MR AM handles an oversized split metainfo file poorly
>
> Key: MAPREDUCE-4937
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4937
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.0.2-alpha, 0.23.5
> Reporter: Jason Lowe
> Assignee: Eric Payne
> Fix For: 3.0.0, 2.5.0
> Attachments: MAPREDUCE-4937.MRAMHandlOversizeSplits.txt, MAPREDUCE-4937.MRAMHandlOversizeSplits.txt, MAPREDUCE-4937.MRAMHandlOversizeSplits.txt
>
> When a job runs with a split metainfo file that's larger than it has been configured to handle, it just crashes. This leaves the user with a less-than-ideal debug session since there are no useful diagnostic messages sent to the client for this failure. In addition, it crashes before registering/unregistering with the RM and crashes without generating history, so the proxy URL is not very useful and there's no archived configuration to check to see what setting the AM was using when it encountered the error.
> The AM should handle this error case more gracefully and treat the failure as it does any other failed job, with a proper unregistration from the RM and with history.
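The behaviour the issue asks for, failing the job with a diagnostic instead of crashing, might look roughly like the sketch below. It is written in Python for brevity and does not reflect the committed patch: `SplitMetaInfoTooLarge`, `check_split_metainfo`, `start_job`, and the fields of the returned report are all hypothetical, chosen only to mirror the points in the description (diagnostics, unregistration from the RM, history).

```python
class SplitMetaInfoTooLarge(Exception):
    """Raised when the split metainfo file exceeds the configured limit."""


def check_split_metainfo(size_bytes, max_size_bytes):
    """Reject a split metainfo file larger than the configured maximum."""
    if size_bytes > max_size_bytes:
        raise SplitMetaInfoTooLarge(
            f"Split metadata size {size_bytes} exceeds the configured "
            f"maximum {max_size_bytes}"
        )


def start_job(size_bytes, max_size_bytes):
    """Start a job, turning an oversized file into an ordinary job failure.

    The returned dict is a stand-in for the job's final report; every
    field name is an assumption of this sketch.
    """
    report = {
        "state": "RUNNING",
        "diagnostics": "",
        "unregistered_from_rm": False,
        "history_written": False,
    }
    try:
        check_split_metainfo(size_bytes, max_size_bytes)
    except SplitMetaInfoTooLarge as err:
        # Fail like any other failed job: keep the message for the
        # client, unregister from the RM, and write history.
        report.update(
            state="FAILED",
            diagnostics=str(err),
            unregistered_from_rm=True,
            history_written=True,
        )
    return report
```

For example, `start_job(2000, 1000)` returns a FAILED report whose diagnostics name both sizes, while `start_job(500, 1000)` leaves the job RUNNING.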
[jira] [Commented] (MAPREDUCE-5642) TestMiniMRChildTask fails on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974882#comment-13974882 ]

Hudson commented on MAPREDUCE-5642:

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1762 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1762/])
MAPREDUCE-5642. TestMiniMRChildTask fails on Windows. Contributed by Chuan Liu. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1588605)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java

> TestMiniMRChildTask fails on Windows
>
> Key: MAPREDUCE-5642
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5642
> Project: Hadoop Map/Reduce
> Issue Type: Test
> Components: test
> Affects Versions: 3.0.0, 2.4.0
> Reporter: Chuan Liu
> Assignee: Chuan Liu
> Priority: Minor
> Fix For: 3.0.0, 2.5.0
> Attachments: MAPREDUCE-5642.patch
>
> The test fails on Windows as a regression from MAPREDUCE-5451. In MAPREDUCE-5451, we set the default config of "mapreduce.admin.user.env" to "PATH=%PATH%;%HADOOP_COMMON_HOME%\\bin" on Windows. In the test, we set "PATH=%PATH%;tmp" for "mapreduce.map.env" and "mapreduce.map.env". Because of the change in MAPREDUCE-5451, PATH will now be set twice, and the value we get in the child tasks no longer matches the previously expected value.
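The effect described in the issue, PATH being extended twice once an admin default is in place, can be reproduced with a small sketch. It is plain Python rather than Hadoop code: `apply_env` is a hypothetical helper, the base values are made up, and the sketch assumes the admin setting is applied before the task setting and that each `%NAME%` reference is expanded against the environment built so far.

```python
import re


def apply_env(env, settings):
    """Apply 'NAME=value' settings in order, expanding Windows-style
    %NAME% references against the environment built so far."""
    result = dict(env)
    for setting in settings:
        name, _, value = setting.partition("=")
        # Unknown variables expand to an empty string in this sketch.
        expanded = re.sub(
            r"%(\w+)%",
            lambda m: result.get(m.group(1), ""),
            value,
        )
        result[name] = expanded
    return result


# Illustrative starting environment (made-up values).
base = {"PATH": "C:\\Windows", "HADOOP_COMMON_HOME": "C:\\hadoop"}
admin_env = "PATH=%PATH%;%HADOOP_COMMON_HOME%\\bin"  # default from the issue
map_env = "PATH=%PATH%;tmp"                          # value set by the test

# Value the test would see with only its own setting applied.
before = apply_env(base, [map_env])["PATH"]
# Value the child sees once the admin default is applied first.
after = apply_env(base, [admin_env, map_env])["PATH"]

print(before)
print(after)
```

The two printed values differ because the second one also carries the `bin` directory appended by the admin default, which is why a test that expects the first value stops matching.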
[jira] [Commented] (MAPREDUCE-5642) TestMiniMRChildTask fails on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974825#comment-13974825 ]

Hudson commented on MAPREDUCE-5642:

SUCCESS: Integrated in Hadoop-Yarn-trunk #545 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/545/])
MAPREDUCE-5642. TestMiniMRChildTask fails on Windows. Contributed by Chuan Liu. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1588605)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMiniMRChildTask.java
[jira] [Commented] (MAPREDUCE-4937) MR AM handles an oversized split metainfo file poorly
[ https://issues.apache.org/jira/browse/MAPREDUCE-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974827#comment-13974827 ]

Hudson commented on MAPREDUCE-4937:

SUCCESS: Integrated in Hadoop-Yarn-trunk #545 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/545/])
MAPREDUCE-4937. MR AM handles an oversized split metainfo file poorly. Contributed by Eric Payne (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1588559)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/event/JobEventType.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java
[jira] [Commented] (MAPREDUCE-5465) Container killed before hprof dumps profile.out
[ https://issues.apache.org/jira/browse/MAPREDUCE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974786#comment-13974786 ]

Hadoop QA commented on MAPREDUCE-5465:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12640939/MAPREDUCE-5465-4.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files.

{color:red}-1 javac{color}. The applied patch generated 1288 javac compiler warnings (more than the trunk's current 1287 warnings).

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4539//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4539//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4539//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4539//console

This message is automatically generated.

> Container killed before hprof dumps profile.out
>
> Key: MAPREDUCE-5465
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5465
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am, mrv2
> Affects Versions: trunk, 2.0.3-alpha
> Reporter: Radim Kolar
> Assignee: Ming Ma
> Attachments: MAPREDUCE-5465-2.patch, MAPREDUCE-5465-3.patch, MAPREDUCE-5465-4.patch, MAPREDUCE-5465.patch
>
> If profiling is enabled for a mapper or reducer, hprof dumps profile.out at process exit. It is dumped after the task has signaled to the AM that its work is finished.
> The AM kills a container whose work is finished without waiting for hprof to finish its dump. If hprof is dumping larger outputs (such as with depth=4, while depth=3 works), it cannot finish the dump before being killed, which makes the entire dump unusable because the CPU and heap stats are missing.
> There needs to be a better delay before the container is killed if profiling is enabled.