[jira] [Updated] (MAPREDUCE-6611) Localization timeout diagnostic for taskattempts
[ https://issues.apache.org/jira/browse/MAPREDUCE-6611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6611:
    Attachment: MAPREDUCE-6611.patch

> Localization timeout diagnostic for taskattempts
>
> Key: MAPREDUCE-6611
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6611
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6611.patch
>
> When a container takes too long to localize, it manifests as a timeout, and there's no indication that localization was the issue. We need diagnostics for timeouts to indicate the container was still localizing when the timeout occurred. Dependent upon YARN-4589.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6611) Localization timeout diagnostic for taskattempts
Chang Li created MAPREDUCE-6611:

    Summary: Localization timeout diagnostic for taskattempts
    Key: MAPREDUCE-6611
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-6611
    Project: Hadoop Map/Reduce
    Issue Type: Improvement
    Reporter: Chang Li
    Assignee: Chang Li

When a container takes too long to localize, it manifests as a timeout, and there's no indication that localization was the issue. We need diagnostics for timeouts to indicate the container was still localizing when the timeout occurred. Dependent upon YARN-4589.
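The diagnostic requested in MAPREDUCE-6611 might look roughly like the sketch below. This is only an illustration of the idea, not the attached patch: the `Phase` enum, the `timeoutDiagnostic` helper, the message wording, and the sample attempt ID are all hypothetical, and the real container state would have to come from YARN (the issue depends on YARN-4589 for that).

```java
public class TimeoutDiagnosticDemo {
    // Hypothetical container phases; the real states are defined by YARN
    // and are not spelled out in the issue.
    enum Phase { LOCALIZING, RUNNING }

    // Builds a timeout message and, if the container had not finished
    // localizing, says so explicitly instead of reporting a bare timeout.
    static String timeoutDiagnostic(String attemptId, int timeoutSecs, Phase phase) {
        String msg = "AttemptID:" + attemptId + " timed out after " + timeoutSecs + " secs";
        if (phase == Phase.LOCALIZING) {
            msg += " (container was still localizing when the timeout occurred)";
        }
        return msg;
    }

    public static void main(String[] args) {
        // The attempt ID below is made up for the example.
        System.out.println(timeoutDiagnostic("attempt_0001_m_000000_0", 600, Phase.LOCALIZING));
        System.out.println(timeoutDiagnostic("attempt_0001_m_000000_0", 600, Phase.RUNNING));
    }
}
```

With something along these lines, a user reading the task attempt diagnostics could tell a slow localization apart from a task that hung after it started running.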
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.6.3.patch

Thanks for the very patient and careful review, [~jlowe]! Updated the .6.3 patch accordingly.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, MAPREDUCE-6533.6.2.patch, MAPREDUCE-6533.6.3.patch, MAPREDUCE-6533.6.patch, MAPREDUCE-6533.patch
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.6.2.patch

The .6.2 patch fixes tearDown as well.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, MAPREDUCE-6533.6.2.patch, MAPREDUCE-6533.6.patch, MAPREDUCE-6533.patch
[jira] [Commented] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000502#comment-15000502 ]

Chang Li commented on MAPREDUCE-6533:

Sorry, I overlooked the comment about the tearDown part. Will fix that.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, MAPREDUCE-6533.6.patch, MAPREDUCE-6533.patch
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.6.patch

Thanks [~jlowe] for the further review! Updated the .6 patch accordingly.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, MAPREDUCE-6533.6.patch, MAPREDUCE-6533.patch
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.5.patch

Thanks [~djp] for the review! Updated the .5 patch according to your suggestion.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.5.patch, MAPREDUCE-6533.patch
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.4.patch

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.patch
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.4.patch

Thanks [~jlowe] for the further review! Updated the .4 patch to address your concern.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, MAPREDUCE-6533.4.patch, MAPREDUCE-6533.patch
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.3.patch

The .3 patch fixes the whitespace issue.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.3.patch, MAPREDUCE-6533.patch
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.2.patch

Thanks [~jlowe] for the review! I have updated the .2 patch according to your suggestion to create them as Path objects.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.2.patch, MAPREDUCE-6533.patch
[jira] [Commented] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986258#comment-14986258 ]

Chang Li commented on MAPREDUCE-6533:

[~jlowe], thanks for the review and for raising these concerns! Right, this broken test used to work; its current failure seems to be caused by a JDK change. When I run the test on my machine, {{TEST_ROOT_DIR}} is
{code}file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/{code}
but {{TEST_VISIBILITY_DIR}} is
{code}file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/TestCacheVisibility/{code}
which used to be
{code}file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/TestCacheVisibility{code}
When a file is created with {{File(String parent, String child)}}, its absolute pathname and its pathname differ as shown above. {{testDetermineCacheVisibilities}} is intended to run under {{file:/Users/changli/hadoop/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/test-dir/TestCacheVisibility}}, which is the TestCacheVisibility dir under {{TEST_ROOT_DIR}}, so I reconstruct the {{TEST_VISIBILITY_DIR}} string in this simple way rather than by the unnecessarily complicated route from File to URI and then to String.

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.patch
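The path doubling described in this comment can be reproduced outside Hadoop with a small sketch. It assumes a Unix-like system and uses a made-up root, `file:/tmp/test-dir`, in place of the real {{TEST_ROOT_DIR}}; the class and method names are hypothetical and this is not the test's actual code. The point is that `java.io.File` treats the URI-style string as an ordinary relative pathname, so resolving it to an absolute path prepends the working directory.

```java
import java.io.File;

public class VisibilityDirDemo {
    // Hypothetical stand-in for TEST_ROOT_DIR: a URI-style string, not a plain path.
    static final String ROOT = "file:/tmp/test-dir";

    // Passes the URI string as a parent *pathname*. On Unix-like systems
    // "file:/tmp/test-dir" does not start with '/', so the File is relative.
    static File viaFile() {
        return new File(ROOT, "TestCacheVisibility");
    }

    // Plain string concatenation keeps the single "file:" prefix intact.
    static String viaString() {
        return ROOT + "/TestCacheVisibility";
    }

    public static void main(String[] args) {
        File f = viaFile();
        System.out.println("getPath():         " + f.getPath());
        System.out.println("isAbsolute():      " + f.isAbsolute());
        // The next two resolve against the current working directory, which
        // is where the second "file:" segment in the broken value comes from.
        System.out.println("getAbsolutePath(): " + f.getAbsolutePath());
        System.out.println("toURI():           " + f.toURI());
        System.out.println("string concat:     " + viaString());
    }
}
```

Building the directory name by string concatenation, as the patch comment describes, sidesteps the File-to-URI round trip entirely.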
[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985440#comment-14985440 ]

Chang Li commented on MAPREDUCE-5003:

I have opened MAPREDUCE-6533 for the TestClientDistributedCacheManager failure.

> AM recovery should recreate records for attempts that were incomplete
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am
> Reporter: Jason Lowe
> Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.10.patch, MAPREDUCE-5003.2.patch, MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch, MAPREDUCE-5003.9.patch
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task attempt entries for *all* task attempts launched by the prior app attempt even if those task attempts did not complete. The attempts would have to be marked as killed or something similar to indicate they are no longer running. Having records for the task attempts enables the user to see what nodes were associated with the attempts and potentially access their logs.
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6512:
    Attachment: MAPREDUCE-6512.2.patch

> FileOutputCommitter tasks unconditionally create parent directories
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.2.patch, MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
> If the output directory is deleted then subsequent tasks should fail. Instead they blindly create the missing parent directories, leading the job to be "successful" despite potentially missing almost all of the output. Task attempts should fail if the parent app attempt directory is missing when they go to create their task attempt directory.
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Status: Patch Available (was: Open)

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.patch
[jira] [Updated] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6533:
    Attachment: MAPREDUCE-6533.patch

> testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
>
> Key: MAPREDUCE-6533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6533.patch
[jira] [Created] (MAPREDUCE-6533) testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
Chang Li created MAPREDUCE-6533:

    Summary: testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken
    Key: MAPREDUCE-6533
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-6533
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Reporter: Chang Li
    Assignee: Chang Li
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-5003:
    Attachment: MAPREDUCE-5003.10.patch

The .10 patch fixes some checkstyle issues. The broken TestJobHistoryEventHandler test is not related to my change; it may be transient, since it passes on my local machine with my patch applied. The broken TestRecovery test also appears to be transient, because it passes on my local machine with my patch applied. I updated testMultipleCrashes of TestRecovery to improve its stability. testDetermineCacheVisibilities of TestClientDistributedCacheManager is broken even without my patch. I will file a JIRA for that broken test.

> AM recovery should recreate records for attempts that were incomplete
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am
> Reporter: Jason Lowe
> Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.10.patch, MAPREDUCE-5003.2.patch, MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch, MAPREDUCE-5003.9.patch
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task attempt entries for *all* task attempts launched by the prior app attempt even if those task attempts did not complete. The attempts would have to be marked as killed or something similar to indicate they are no longer running. Having records for the task attempts enables the user to see what nodes were associated with the attempts and potentially access their logs.
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6512:
    Attachment: MAPREDUCE-6512.2.patch

The .2 patch fixes the broken tests, which were caused by the missing parent dir.

> FileOutputCommitter tasks unconditionally create parent directories
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
> If the output directory is deleted then subsequent tasks should fail. Instead they blindly create the missing parent directories, leading the job to be "successful" despite potentially missing almost all of the output. Task attempts should fail if the parent app attempt directory is missing when they go to create their task attempt directory.
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-5003:
    Attachment: MAPREDUCE-5003.9.patch

> AM recovery should recreate records for attempts that were incomplete
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am
> Reporter: Jason Lowe
> Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch, MAPREDUCE-5003.9.patch
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task attempt entries for *all* task attempts launched by the prior app attempt even if those task attempts did not complete. The attempts would have to be marked as killed or something similar to indicate they are no longer running. Having records for the task attempts enables the user to see what nodes were associated with the attempts and potentially access their logs.
[jira] [Updated] (MAPREDUCE-6518) Set SO_KEEPALIVE on shuffle connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6518:
    Attachment: MAPREDUCE-6518.4.patch

Thanks [~jlowe] for the review! I have addressed your concerns in the .4 patch.

> Set SO_KEEPALIVE on shuffle connections
>
> Key: MAPREDUCE-6518
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6518
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, nodemanager
> Affects Versions: 2.7.1
> Reporter: Nathan Roberts
> Assignee: Chang Li
> Attachments: MAPREDUCE-6518.4.patch, YARN-4052.2.patch, YARN-4052.3.patch, YARN-4052.patch
>
> Shuffle handler does not set SO_KEEPALIVE so we've seen cases where FDs/sockets get stuck in ESTABLISHED state indefinitely because the server did not see the client leave (network cut or otherwise).
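For readers unfamiliar with the option, here is a minimal sketch of enabling SO_KEEPALIVE on the server side of a connection. It uses plain `java.net` sockets on the loopback interface purely for illustration; the shuffle handler has its own networking stack and the attached patch is not shown in this thread, so the class and method names below are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveDemo {
    // Accepts one loopback connection, enables SO_KEEPALIVE on the accepted
    // (server-side) socket, and returns the option value read back from it.
    static boolean acceptWithKeepAlive() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // With keepalive on, the OS periodically probes an idle peer and
            // eventually reports the connection as dead if nothing answers.
            accepted.setKeepAlive(true);
            return accepted.getKeepAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SO_KEEPALIVE enabled: " + acceptWithKeepAlive());
    }
}
```

How quickly a vanished client is noticed depends on the operating system's keepalive timers, which this sketch does not configure.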
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6512:
    Attachment: (was: YARN-4269.2.patch)

> FileOutputCommitter tasks unconditionally create parent directories
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
> If the output directory is deleted then subsequent tasks should fail. Instead they blindly create the missing parent directories, leading the job to be "successful" despite potentially missing almost all of the output. Task attempts should fail if the parent app attempt directory is missing when they go to create their task attempt directory.
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6512:
    Attachment: YARN-4269.2.patch

> FileOutputCommitter tasks unconditionally create parent directories
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
> If the output directory is deleted then subsequent tasks should fail. Instead they blindly create the missing parent directories, leading the job to be "successful" despite potentially missing almost all of the output. Task attempts should fail if the parent app attempt directory is missing when they go to create their task attempt directory.
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6512:
    Attachment: MAPREDUCE-6512.patch

> FileOutputCommitter tasks unconditionally create parent directories
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch, MAPREDUCE-6512.patch
>
> If the output directory is deleted then subsequent tasks should fail. Instead they blindly create the missing parent directories, leading the job to be "successful" despite potentially missing almost all of the output. Task attempts should fail if the parent app attempt directory is missing when they go to create their task attempt directory.
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6512:
    Status: Patch Available (was: Open)

> FileOutputCommitter tasks unconditionally create parent directories
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch
>
> If the output directory is deleted then subsequent tasks should fail. Instead they blindly create the missing parent directories, leading the job to be "successful" despite potentially missing almost all of the output. Task attempts should fail if the parent app attempt directory is missing when they go to create their task attempt directory.
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-6512:
    Attachment: MAPREDUCE-6512.patch

> FileOutputCommitter tasks unconditionally create parent directories
>
> Key: MAPREDUCE-6512
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Chang Li
> Assignee: Chang Li
> Attachments: MAPREDUCE-6512.patch
>
> If the output directory is deleted then subsequent tasks should fail. Instead they blindly create the missing parent directories, leading the job to be "successful" despite potentially missing almost all of the output. Task attempts should fail if the parent app attempt directory is missing when they go to create their task attempt directory.
[jira] [Created] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
Chang Li created MAPREDUCE-6512:

    Summary: FileOutputCommitter tasks unconditionally create parent directories
    Key: MAPREDUCE-6512
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Reporter: Chang Li
    Assignee: Chang Li

If the output directory is deleted then subsequent tasks should fail. Instead they blindly create the missing parent directories, leading the job to be "successful" despite potentially missing almost all of the output. Task attempts should fail if the parent app attempt directory is missing when they go to create their task attempt directory.
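The difference between the current and the desired behavior can be illustrated with the standard `java.nio.file` API. This is only an analogy under stated assumptions: the committer works through Hadoop's FileSystem abstraction, not `java.nio`, and the directory names and helper methods here are hypothetical. `Files.createDirectory` creates just the last path element and fails if the parent is missing, while `Files.createDirectories` quietly recreates the whole chain.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ParentDirDemo {
    // Creates an "app attempt" directory (hypothetical name) and deletes it
    // again, simulating an output directory removed while the job is running.
    // The enclosing temp directory is left behind for simplicity.
    static Path deletedParent() {
        try {
            Path output = Files.createTempDirectory("output");
            Path appAttempt = Files.createDirectory(output.resolve("app-attempt"));
            Files.delete(appAttempt);
            return appAttempt;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Strict: only the last element is created, so a missing parent is an error.
    static boolean strictCreate(Path parent) {
        try {
            Files.createDirectory(parent.resolve("task-attempt"));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Lenient: every missing parent is recreated, which hides the deletion.
    static boolean lenientCreate(Path parent) {
        try {
            Files.createDirectories(parent.resolve("task-attempt"));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("strict create succeeded:  " + strictCreate(deletedParent()));
        System.out.println("lenient create succeeded: " + lenientCreate(deletedParent()));
    }
}
```

The issue asks for the strict behavior: a task attempt whose parent app attempt directory has vanished should fail rather than recreate it.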
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-5003:
    Attachment: MAPREDUCE-5003.9.patch

Thanks for the review, [~jlowe]! I have uploaded the .9 patch to address your concerns. About the nit you mentioned: when a previously running task is recovered, its state is null, which is why I do the null check. It is null because the jobhistory server only records the state of a task in completion events, so recovery gets no state value for previously running tasks. The .9 patch deals with this by also recording the task state in the task start event.

I checked backward compatibility in both directions. First, I checked that old jobhistory files can be parsed by my modified jobhistory server. I started a single-node cluster without my patch and ran some jobs. Then I shut down the jobhistory server, applied my patch, compiled the new code, and started the jobhistory server with my changes. I checked that the new jobhistory server could load and parse the old jobhistory files, and verified that I could visit all the old job histories in the UI after the restart.

Second, I checked that the old history server is compatible with jobhistory files generated with my change. I followed the same steps, except that I first ran jobs with the patched jobhistory server, then shut down the history server, removed my patch, recompiled, and started the jobhistory server without my patch. I verified that the jobhistory files generated by the new jobhistory server could be parsed by the old jobhistory server.

> AM recovery should recreate records for attempts that were incomplete
>
> Key: MAPREDUCE-5003
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am
> Reporter: Jason Lowe
> Assignee: Chang Li
> Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task attempt entries for *all* task attempts launched by the prior app attempt even if those task attempts did not complete. The attempts would have to be marked as killed or something similar to indicate they are no longer running. Having records for the task attempts enables the user to see what nodes were associated with the attempts and potentially access their logs.
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-5982:
    Attachment: MAPREDUCE-5982.branch.2.7.patch

[~jlowe], I uploaded the 2.7 patch.

> Task attempts that fail from the ASSIGNED state can disappear
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 0.23.10, 2.2.1, 2.7.1
> Reporter: Jason Lowe
> Assignee: Chang Li
> Fix For: 2.8.0
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.6.patch, MAPREDUCE-5982.branch.2.7.patch, MAPREDUCE-5982.patch
>
> If a task attempt fails in the ASSIGNED state, e.g. when container launch fails, then it can disappear from the job history. The task overview page will show subsequent attempts but the attempt in question is simply missing. For example, attempt ID 1 appears but attempt ID 0 is missing. Similarly, in the job overview page the task attempt doesn't appear in any of the failed/killed/succeeded counts or pages. It's as if the task attempt never existed, but the AM logs show otherwise.
[jira] [Commented] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804585#comment-14804585 ]

Chang Li commented on MAPREDUCE-5982:

Thanks [~jlowe] for the review and commit! I will work on a patch for branch-2.7.

> Task attempts that fail from the ASSIGNED state can disappear
>
> Key: MAPREDUCE-5982
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 0.23.10, 2.2.1, 2.7.1
> Reporter: Jason Lowe
> Assignee: Chang Li
> Fix For: 2.8.0
> Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.6.patch, MAPREDUCE-5982.patch
>
> If a task attempt fails in the ASSIGNED state, e.g. when container launch fails, then it can disappear from the job history. The task overview page will show subsequent attempts but the attempt in question is simply missing. For example, attempt ID 1 appears but attempt ID 0 is missing. Similarly, in the job overview page the task attempt doesn't appear in any of the failed/killed/succeeded counts or pages. It's as if the task attempt never existed, but the AM logs show otherwise.
[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744644#comment-14744644 ] Chang Li commented on MAPREDUCE-5003: - The broken unit test is unrelated to my change. I ran this test on my local machine with my patch applied and it passed. [~jlowe] please help review the latest patch. Thanks! > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, > MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, > MAPREDUCE-5003.8.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.8.patch fix whitespace and some checkstyle issues > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, > MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, > MAPREDUCE-5003.8.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.7.patch Thanks [~jlowe] for the review! I have reproduced the broken-link problem for recovered incomplete task attempts; the .7 patch fixes it. I also addressed your other suggestions. Because I touched the TaskAttemptStarted event, I checked and made sure that there is no backward compatibility issue. > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, > MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741606#comment-14741606 ] Chang Li commented on MAPREDUCE-5982: - [~jlowe] please help review the latest patch. Thanks! > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1, 2.7.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, > MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.6.patch, > MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Attachment: MAPREDUCE-5982.6.patch fixed broken tests and checkstyles > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1, 2.7.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, > MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.6.patch, > MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Status: Patch Available (was: Open) Thanks [~jlowe] for review! I have updated my patch according to your suggestion. > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.7.1, 2.2.1, 0.23.10 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, > MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Attachment: MAPREDUCE-5982.5.patch > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1, 2.7.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, > MAPREDUCE-5982.4.patch, MAPREDUCE-5982.5.patch, MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Attachment: MAPREDUCE-5002.6.patch fix whitespace issue > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, > MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch, MAPREDUCE-5002.4.patch, > MAPREDUCE-5002.5.patch, MAPREDUCE-5002.6.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Attachment: MAPREDUCE-5002.5.patch .5 patch improve some naming and comment > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, > MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch, MAPREDUCE-5002.4.patch, > MAPREDUCE-5002.5.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739369#comment-14739369 ] Chang Li commented on MAPREDUCE-5002: - Thanks [~jlowe] for review! Have reworked my unit test. Please help review the updated patch. Thanks! > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, > MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch, MAPREDUCE-5002.4.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Attachment: MAPREDUCE-5002.4.patch There was a debugging print in the .3 patch that I forgot to delete. The .4 patch removes it. > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, > MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch, MAPREDUCE-5002.4.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Attachment: MAPREDUCE-5002.3.patch > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, > MAPREDUCE-5002.2.patch, MAPREDUCE-5002.3.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6442) Stack trace missing for client protocol provider creation error
[ https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729241#comment-14729241 ] Chang Li commented on MAPREDUCE-6442: - Hi [~ozawa], thanks for the review. Usually a logging-only change doesn't need a unit test. Also, in my change {code} catch (Exception e) { LOG.info("Failed to use " + provider.getClass().getName() + " due to error: ", e); } {code} the exception is caught and the error is merely logged; it is not rethrown. Do you have any recommendation for how to test that in a unit test? Moreover, I have manually tested this change: I verified that the stack trace is printed, and I posted it in the comment above. > Stack trace missing for client protocol provider creation error > --- > > Key: MAPREDUCE-6442 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6442.2.patch, MAPREDUCE-6442.patch > > > when provider creation fail dump the stack trace rather than just print out > the message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
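On the question of how a caught-and-logged exception might be checked in a unit test, one possible approach is to attach a capturing handler to the logger and assert that the emitted record carries the throwable. The sketch below is only an illustration under stated assumptions: it uses java.util.logging instead of the logging API in the patch so that it stays self-contained, and the class StackTraceLogSketch, the methods tryProvider and captureRecords, and the Runnable "provider" are all hypothetical stand-ins, not Hadoop code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class StackTraceLogSketch {

  // Hypothetical stand-in for the provider loop; this is not the Hadoop Cluster code.
  static void tryProvider(Logger log, Runnable provider) {
    try {
      provider.run();
    } catch (Exception e) {
      // Passing the throwable itself (not just its message) keeps the stack trace.
      log.log(Level.INFO, "Failed to use " + provider.getClass().getName()
          + " due to error: ", e);
    }
  }

  // Runs a failing provider and returns every record the logger emitted.
  static List<LogRecord> captureRecords() {
    final List<LogRecord> records = new ArrayList<>();
    Logger log = Logger.getLogger("StackTraceLogSketch");
    log.setUseParentHandlers(false); // keep the demo output off the console
    Handler handler = new Handler() {
      @Override public void publish(LogRecord record) { records.add(record); }
      @Override public void flush() { }
      @Override public void close() { }
    };
    log.addHandler(handler);
    try {
      tryProvider(log, () -> { throw new NullPointerException("null conf"); });
    } finally {
      log.removeHandler(handler);
    }
    return records;
  }

  public static void main(String[] args) {
    LogRecord record = captureRecords().get(0);
    // The captured record exposes the exception, so a test can assert on it.
    System.out.println(record.getMessage() + record.getThrown());
  }
}
```

A test built this way would assert that getThrown() is non-null (and of the expected type), which distinguishes logging the exception object from logging only its message.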
[jira] [Commented] (MAPREDUCE-6442) Stack trace missing for client protocol provider creation error
[ https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727836#comment-14727836 ] Chang Li commented on MAPREDUCE-6442: - [~jlowe] thanks for the review! I tested by passing a null conf to initialize the cluster, and the log showed the stack trace {code} Failed to use org.apache.hadoop.mapred.YarnClientProtocolProvider due to error: java.lang.NullPointerException at org.apache.hadoop.mapred.YarnClientProtocolProvider.create(YarnClientProtocolProvider.java:33) at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:95) at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82) at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75) {code} > Stack trace missing for client protocol provider creation error > --- > > Key: MAPREDUCE-6442 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6442.2.patch, MAPREDUCE-6442.patch > > > when provider creation fail dump the stack trace rather than just print out > the message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6442) Stack trace missing for client protocol provider creation error
[ https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6442: Status: Patch Available (was: Open) > Stack trace missing for client protocol provider creation error > --- > > Key: MAPREDUCE-6442 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6442.2.patch, MAPREDUCE-6442.patch > > > when provider creation fail dump the stack trace rather than just print out > the message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6442) Stack trace missing for client protocol provider creation error
[ https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6442: Attachment: MAPREDUCE-6442.2.patch [~jlowe] thanks for review. Updated my patch. > Stack trace missing for client protocol provider creation error > --- > > Key: MAPREDUCE-6442 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6442.2.patch, MAPREDUCE-6442.patch > > > when provider creation fail dump the stack trace rather than just print out > the message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6442) when provider creation fail dump the stack trace
[ https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649933#comment-14649933 ] Chang Li commented on MAPREDUCE-6442: - [~jlowe] please help review. Thanks! > when provider creation fail dump the stack trace > > > Key: MAPREDUCE-6442 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6442.patch > > > when provider creation fail dump the stack trace rather than just print out > the message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6442) when provider creation fail dump the stack trace
[ https://issues.apache.org/jira/browse/MAPREDUCE-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6442: Attachment: MAPREDUCE-6442.patch > when provider creation fail dump the stack trace > > > Key: MAPREDUCE-6442 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6442.patch > > > when provider creation fail dump the stack trace rather than just print out > the message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6442) when provider creation fail dump the stack trace
Chang Li created MAPREDUCE-6442: --- Summary: when provider creation fail dump the stack trace Key: MAPREDUCE-6442 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6442 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Chang Li Assignee: Chang Li when provider creation fail dump the stack trace rather than just print out the message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649778#comment-14649778 ] Chang Li commented on MAPREDUCE-5982: - [~jlowe] please help review. > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1, 2.7.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, > MAPREDUCE-5982.4.patch, MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Attachment: MAPREDUCE-5982.4.patch > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1, 2.7.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, > MAPREDUCE-5982.4.patch, MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Attachment: MAPREDUCE-5982.3.patch checkstyle fix > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1, 2.7.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.3.patch, > MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Attachment: MAPREDUCE-5982.2.patch whitespace fix > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1, 2.7.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.2.patch, MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Affects Version/s: 2.7.1 > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1, 2.7.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Attachment: MAPREDUCE-5982.patch > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5982: Status: Patch Available (was: Open) > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.2.1, 0.23.10 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MAPREDUCE-5982) Task attempts that fail from the ASSIGNED state can disappear
[ https://issues.apache.org/jira/browse/MAPREDUCE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li reassigned MAPREDUCE-5982: --- Assignee: Chang Li > Task attempts that fail from the ASSIGNED state can disappear > - > > Key: MAPREDUCE-5982 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5982 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.10, 2.2.1 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5982.patch > > > If a task attempt fails in the ASSIGNED state, e.g.: container launch fails, > then it can disappear from the job history. The task overview page will show > subsequent attempts but the attempt in question is simply missing. For > example attempt ID 1 appears but the attempt ID 0 is missing. Similarly in > the job overview page the task attempt doesn't appear in any of the > failed/killed/succeeded counts or pages. It's as if the task attempt never > existed, but the AM logs show otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6384: Attachment: MAPREDUCE-6384.10.patch Thanks [~jlowe] for further review! I have changed the typo and replaced all the tabs in testFetchFailureAttemptFinishTime() with spaces. > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.10.patch, MAPREDUCE-6384.2.patch, > MAPREDUCE-6384.3.patch, MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, > MAPREDUCE-6384.5.patch, MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, > MAPREDUCE-6384.8.patch, MAPREDUCE-6384.9.patch, MAPREDUCE-6384.9.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6420) Interrupted Exception in LocalContainerLauncher should be logged in warn/info level
[ https://issues.apache.org/jira/browse/MAPREDUCE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6420: Status: Patch Available (was: Open) [~knoguchi], [~jlowe] please help review. Thanks! > Interrupted Exception in LocalContainerLauncher should be logged in warn/info > level > --- > > Key: MAPREDUCE-6420 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6420 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6420.1.patch > > > Interrupted Exception in LocalContainerLauncher should be logged in warn/info > level instead of error because it won't fail the job. Otherwise it will cause > confusion during debugging -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6420) Interrupted Exception in LocalContainerLauncher should be logged in warn/info level
[ https://issues.apache.org/jira/browse/MAPREDUCE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6420: Attachment: MAPREDUCE-6420.1.patch > Interrupted Exception in LocalContainerLauncher should be logged in warn/info > level > --- > > Key: MAPREDUCE-6420 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6420 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6420.1.patch > > > Interrupted Exception in LocalContainerLauncher should be logged in warn/info > level instead of error because it won't fail the job. Otherwise it will cause > confusion during debugging -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6420) Interrupted Exception in LocalContainerLauncher should be logged in warn/info level
Chang Li created MAPREDUCE-6420: --- Summary: Interrupted Exception in LocalContainerLauncher should be logged in warn/info level Key: MAPREDUCE-6420 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6420 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Interrupted Exception in LocalContainerLauncher should be logged in warn/info level instead of error because it won't fail the job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
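The request in MAPREDUCE-6420 can be illustrated with a minimal sketch. This is not the attached patch and does not reproduce LocalContainerLauncher; it only shows the general idea of logging an InterruptedException at a lower level than other failures. It uses java.util.logging so that it is self-contained, whereas Hadoop's own logging setup differs. The class InterruptLoggingSketch and the helpers levelFor and handle are hypothetical names chosen for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class InterruptLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(InterruptLoggingSketch.class.getName());

    // Hypothetical helper: an interrupt is an expected shutdown signal that
    // does not fail the job, so it gets WARNING; anything else stays SEVERE.
    static Level levelFor(Throwable t) {
        return (t instanceof InterruptedException) ? Level.WARNING : Level.SEVERE;
    }

    // Hypothetical helper: log the failure at the chosen level and restore
    // the thread's interrupt flag so callers can still observe the interrupt.
    static void handle(Throwable t) {
        LOG.log(levelFor(t), "Launcher loop stopped: " + t);
        if (t instanceof InterruptedException) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        handle(new InterruptedException("shutdown requested"));
        // Clear the flag that handle() restored, so the demo exits cleanly.
        Thread.interrupted();
        handle(new IllegalStateException("unexpected failure"));
    }
}
```

Keeping the level decision in one small method makes it easy to check in a unit test without capturing log output.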
[jira] [Updated] (MAPREDUCE-6420) Interrupted Exception in LocalContainerLauncher should be logged in warn/info level
[ https://issues.apache.org/jira/browse/MAPREDUCE-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6420: Description: Interrupted Exception in LocalContainerLauncher should be logged in warn/info level instead of error because it won't fail the job. Otherwise it will cause confusion during debugging (was: Interrupted Exception in LocalContainerLauncher should be logged in warn/info level instead of error because it won't fail the job) > Interrupted Exception in LocalContainerLauncher should be logged in warn/info > level > --- > > Key: MAPREDUCE-6420 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6420 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > > Interrupted Exception in LocalContainerLauncher should be logged in warn/info > level instead of error because it won't fail the job. Otherwise it will cause > confusion during debugging -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6418: Attachment: MAPREDUCE-6418.4.patch The .4 patch fixes some whitespace issues I just noticed. > MRApp should not shutdown LogManager during shutdown > > > Key: MAPREDUCE-6418 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6418.1.patch, MAPREDUCE-6418.2.patch, > MAPREDUCE-6418.3.patch, MAPREDUCE-6418.4.patch > > > Tests in TestRecovery.java lost their logs after recovery due to the change > in MAPREDUCE-5694. MRApp should override that change so that logs written after AM > recovery are shown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6418: Attachment: MAPREDUCE-6418.3.patch Thanks [~jlowe] for the review and the valuable suggestions! I just updated my patch to improve the comments and move the comment inside the function body. > MRApp should not shutdown LogManager during shutdown > > > Key: MAPREDUCE-6418 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6418.1.patch, MAPREDUCE-6418.2.patch, > MAPREDUCE-6418.3.patch > > > Tests in TestRecovery.java lost their logs after recovery due to the change > in MAPREDUCE-5694. MRApp should override that change so that logs written after AM > recovery are shown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606139#comment-14606139 ] Chang Li commented on MAPREDUCE-5003: - [~jlowe] thanks for the review! I have updated my patch. Could you please review the latest patch? Thanks! > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, > MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6418: Attachment: MAPREDUCE-6418.2.patch [~jlowe] thanks for the review! I just updated my patch according to your suggestions. > MRApp should not shutdown LogManager during shutdown > > > Key: MAPREDUCE-6418 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6418.1.patch, MAPREDUCE-6418.2.patch > > > Tests in TestRecovery.java lost their logs after recovery due to the change > in MAPREDUCE-5694. MRApp should override that change so that logs written after AM > recovery are shown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.6.patch > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, > MAPREDUCE-5003.5.patch, MAPREDUCE-5003.6.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.5.patch > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, > MAPREDUCE-5003.5.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.5.patch > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Status: Patch Available (was: Open) > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.4.patch > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6418: Status: Patch Available (was: Open) [~jlowe] please help review. Thanks! > MRApp should not shutdown LogManager during shutdown > > > Key: MAPREDUCE-6418 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6418.1.patch > > > Tests in TestRecovery.java lost their logs after recovery due to the change > in MAPREDUCE-5694. MRApp should override that change so that logs written after AM > recovery are shown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6418: Attachment: MAPREDUCE-6418.1.patch > MRApp should not shutdown LogManager during shutdown > > > Key: MAPREDUCE-6418 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6418.1.patch > > > Tests in TestRecovery.java lost their logs after recovery due to the change > in MAPREDUCE-5694. MRApp should override that change so that logs written after AM > recovery are shown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown
Chang Li created MAPREDUCE-6418: --- Summary: MRApp should not shutdown LogManager during shutdown Key: MAPREDUCE-6418 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Tests in TestRecovery.java lost their logs after recovery due to the change in MAPREDUCE-5694. MRApp should override that change so that logs written after AM recovery are shown. Reporter: Chang Li Assignee: Chang Li -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6418: Description: Tests in TestRecovery.java lost their logs after recovery due to the change in MAPREDUCE-5694. MRApp should override that change so that logs written after AM recovery are shown. > MRApp should not shutdown LogManager during shutdown > > > Key: MAPREDUCE-6418 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > > Tests in TestRecovery.java lost their logs after recovery due to the change > in MAPREDUCE-5694. MRApp should override that change so that logs written after AM > recovery are shown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6418) MRApp should not shutdown LogManager during shutdown
[ https://issues.apache.org/jira/browse/MAPREDUCE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6418: Environment: (was: Tests in TestRecovery.java lost their logs after recovery due to the change in MAPREDUCE-5694. MRApp should override that change so that logs written after AM recovery are shown.) > MRApp should not shutdown LogManager during shutdown > > > Key: MAPREDUCE-6418 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6418 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603413#comment-14603413 ] Chang Li commented on MAPREDUCE-6384: - [~jlowe], please help review the latest patch. Thanks! > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, > MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, MAPREDUCE-6384.8.patch, > MAPREDUCE-6384.9.patch, MAPREDUCE-6384.9.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6384: Attachment: MAPREDUCE-6384.9.patch > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, > MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, MAPREDUCE-6384.8.patch, > MAPREDUCE-6384.9.patch, MAPREDUCE-6384.9.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6384: Attachment: MAPREDUCE-6384.9.patch whitespace fix > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, > MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, MAPREDUCE-6384.8.patch, > MAPREDUCE-6384.9.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6384: Attachment: MAPREDUCE-6384.8.patch [~jlowe] thanks for review! Have updated my patch according to your suggestion. > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, > MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch, MAPREDUCE-6384.8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6384: Attachment: MAPREDUCE-6384.7.patch whitespace fix > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, > MAPREDUCE-6384.6.patch, MAPREDUCE-6384.7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596589#comment-14596589 ] Chang Li commented on MAPREDUCE-5002: - [~jlowe] could you please help review? Thanks! > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, > MAPREDUCE-5002.2.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596586#comment-14596586 ] Chang Li commented on MAPREDUCE-6384: - Thanks [~zxu] for the review, and sorry about the delay in responding. I have fixed the indentation issue in my patch and those two whitespace issues. > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, > MAPREDUCE-6384.6.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6384: Attachment: MAPREDUCE-6384.6.patch > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch, > MAPREDUCE-6384.6.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Affects Version/s: 2.7.0 > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7, 2.7.0 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, > MAPREDUCE-5002.2.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Attachment: MAPREDUCE-5002.2.patch > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch, > MAPREDUCE-5002.2.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Attachment: MAPREDUCE-5002.2.patch > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch, MAPREDUCE-5002.2.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Status: Patch Available (was: Open) > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 0.23.7, 2.0.3-alpha >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5002: Attachment: MAPREDUCE-5002.1.patch > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7 >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5002.1.patch > > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MAPREDUCE-5002) AM could potentially allocate a reduce container to a map attempt
[ https://issues.apache.org/jira/browse/MAPREDUCE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li reassigned MAPREDUCE-5002: --- Assignee: Chang Li > AM could potentially allocate a reduce container to a map attempt > - > > Key: MAPREDUCE-5002 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5002 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.0.3-alpha, 0.23.7 >Reporter: Jason Lowe >Assignee: Chang Li > > As discussed in MAPREDUCE-4982, after MAPREDUCE-4893 it is theoretically > possible for the AM to accidentally assign a reducer container to a map > attempt if the AM doesn't find a reduce attempt actively looking for the > container (e.g.: the RM accidentally allocated too many reducer containers). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583815#comment-14583815 ] Chang Li commented on MAPREDUCE-5003: - [~jlowe] could you please help review, thanks! > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.3.patch checkstyle fix > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch, > MAPREDUCE-5003.3.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.2.patch > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.2.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Status: Patch Available (was: Open) > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-5003: Attachment: MAPREDUCE-5003.1.patch The AM hanging issue mentioned in MAPREDUCE-4992 no longer exists in the current recovery implementation. Moreover, the current recovery implementation recovers incomplete task attempts into the killed state, so we no longer need to remove those incomplete task attempts for recovery. > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > Attachments: MAPREDUCE-5003.1.patch > > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
[ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li reassigned MAPREDUCE-5003: --- Assignee: Chang Li > AM recovery should recreate records for attempts that were incomplete > - > > Key: MAPREDUCE-5003 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Reporter: Jason Lowe >Assignee: Chang Li > > As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task > attempt entries for *all* task attempts launched by the prior app attempt > even if those task attempts did not complete. The attempts would have to be > marked as killed or something similar to indicate it is no longer running. > Having records for the task attempts enables the user to see what nodes were > associated with the attempts and potentially access their logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6384: Attachment: MAPREDUCE-6384.5.patch Hi [~zxu], thanks for the review and the good catch! I just updated my patch according to your suggestion. > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch, MAPREDUCE-6384.5.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6354) shuffle handler should log connection info
[ https://issues.apache.org/jira/browse/MAPREDUCE-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6354: Attachment: MAPREDUCE-6354.8.patch > shuffle handler should log connection info > -- > > Key: MAPREDUCE-6354 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6354 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6354.2.patch, MAPREDUCE-6354.3.patch, > MAPREDUCE-6354.4.patch, MAPREDUCE-6354.5.patch, MAPREDUCE-6354.6.patch, > MAPREDUCE-6354.7.patch, MAPREDUCE-6354.8.patch, MAPREDUCE-6354.patch > > > currently, shuffle handler only log connection info in debug mode, we want to > log that info in a more concise way -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6384) add the last reporting reducer host info and attempt id on the map error message due to too many fetch failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated MAPREDUCE-6384: Attachment: MAPREDUCE-6384.4.2.patch > add the last reporting reducer host info and attempt id on the map error > message due to too many fetch failure > -- > > Key: MAPREDUCE-6384 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6384 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6384.2.patch, MAPREDUCE-6384.3.patch, > MAPREDUCE-6384.4.2.patch, MAPREDUCE-6384.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6354) shuffle handler should log connection info
[ https://issues.apache.org/jira/browse/MAPREDUCE-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573047#comment-14573047 ] Chang Li commented on MAPREDUCE-6354: - [~jlowe] thanks a lot for the thoughtful review and the hearty discussion of how to make this logging more efficient! I have made the changes to the debug-level logging. As for the trace-level logging, I think we could wait until this gets committed and file another JIRA to address that issue. Let me know what you think of the latest patch. Thanks! > shuffle handler should log connection info > -- > > Key: MAPREDUCE-6354 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6354 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6354.2.patch, MAPREDUCE-6354.3.patch, > MAPREDUCE-6354.4.patch, MAPREDUCE-6354.5.patch, MAPREDUCE-6354.6.patch, > MAPREDUCE-6354.7.patch, MAPREDUCE-6354.patch > > > currently, shuffle handler only log connection info in debug mode, we want to > log that info in a more concise way -- This message was sent by Atlassian JIRA (v6.3.4#6332)