[jira] [Resolved] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg resolved MAPREDUCE-7278.
----------------------------------------------
    Resolution: Fixed

Pulled this back to branch-3.1 for 3.1.5. Closing; thank you [~tarunparimi] for your contribution.

> Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7278
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 2.8.0, 3.4.0
>            Reporter: Tarun Parimi
>            Assignee: Tarun Parimi
>            Priority: Major
>             Fix For: 3.3.0, 3.2.2, 3.4.0, 3.1.5
>
>         Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, MAPREDUCE-7278.003.patch, MAPREDUCE-7278.004.patch, Screen Shot 2020-04-30 at 8.04.27 PM.png
>
> When a failed task attempt container is stuck in the FAIL_FINISHING_CONTAINER state for some time, we observe that two task attempts are launched simultaneously even though speculative execution is disabled.
> This results in the message below being shown on the killed attempts, indicating that speculation has occurred. This is a problem for jobs which require speculative execution to be strictly disabled.
> !Screen Shot 2020-04-30 at 8.04.27 PM.png!

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
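[Editor's reference note] The two properties named in the issue summary are the ones that control speculation for map and reduce tasks. A minimal mapred-site.xml fragment that strictly disables speculative execution might look like the sketch below (the property names are taken from the issue title; defaults and per-job overrides can vary across Hadoop releases):

```xml
<!-- Disable speculative launching of duplicate map and reduce task attempts. -->
<property>
  <name>mapreduce.map.speculative</name>
  <value>false</value>
</property>
<property>
  <name>mapreduce.reduce.speculative</name>
  <value>false</value>
</property>
```

The bug tracked here is precisely that, even with both values set to false, a second attempt could still be launched for a task whose failed container lingered in FAIL_FINISHING_CONTAINER.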
[jira] [Updated] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-7278:
---------------------------------------------
    Fix Version/s: 3.1.5
[jira] [Updated] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-7278:
---------------------------------------------
    Fix Version/s: 3.4.0
                   3.2.2
                   3.3.0
[jira] [Updated] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-7278:
---------------------------------------------
    Status: In Progress  (was: Patch Available)

Looking at the releases that have MAPREDUCE-6485, I am pulling the change back to the active branch-3.2 and branch-3.3. Do we also need to pull this back to 3.1.5?
[jira] [Commented] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118257#comment-17118257 ]

Wilfred Spiegelenburg commented on MAPREDUCE-7278:
--------------------------------------------------
Change is looking good, +1. Committing to trunk.
[jira] [Commented] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112750#comment-17112750 ]

Wilfred Spiegelenburg commented on MAPREDUCE-7278:
--------------------------------------------------
I agree the ASF license warning is not from this patch. Can you please fix the checkstyle issues? The code looks good; no comments on that.
[jira] [Commented] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105859#comment-17105859 ]

Wilfred Spiegelenburg commented on MAPREDUCE-7278:
--------------------------------------------------
[~tarunparimi] I have added you to the contributor list for the project and assigned the jira to you.

I think the test you describe can be converted into one that uses a {{MiniMRYarnCluster}}, like what is done in TestSpeculativeExecution. Did you look at that possibility?
[jira] [Assigned] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false
[ https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg reassigned MAPREDUCE-7278:
------------------------------------------------
    Assignee: Tarun Parimi
[jira] [Commented] (MAPREDUCE-7269) TestNetworkedJob fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-7269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076097#comment-17076097 ]

Wilfred Spiegelenburg commented on MAPREDUCE-7269:
--------------------------------------------------
Thank you for filing the issue [~aajisaka]. The license warning is not related to the change. The changes for the test are in line with the test changes that have been submitted on the YARN side: +1

> TestNetworkedJob fails
> ----------------------
>
>                 Key: MAPREDUCE-7269
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7269
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>            Priority: Major
>
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1460/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
> {noformat}
> [INFO] Running org.apache.hadoop.mapred.TestNetworkedJob
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 20.981 s <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob
> [ERROR] testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob)  Time elapsed: 4.588 s  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<[]default> but was:<[root.]default>
> 	at org.junit.Assert.assertEquals(Assert.java:115)
> 	at org.junit.Assert.assertEquals(Assert.java:144)
> 	at org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:250)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.lang.Thread.run(Thread.java:748)
> {noformat}
[jira] [Commented] (MAPREDUCE-7266) historyContext doesn't need to be a class attribute inside JobHistoryServer
[ https://issues.apache.org/jira/browse/MAPREDUCE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064450#comment-17064450 ]

Wilfred Spiegelenburg commented on MAPREDUCE-7266:
--------------------------------------------------
Moved it to the correct project (the JHS is MR, not YARN). I just noticed that I have not been added to the committer lists yet, so I cannot add you as a contributor and assign the jira to you. This move should now trigger the build.

> historyContext doesn't need to be a class attribute inside JobHistoryServer
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7266
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7266
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>            Reporter: Siddharth Ahuja
>            Priority: Minor
>         Attachments: YARN-10075.001.patch
>
> The "historyContext" class attribute at
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L67
> is assigned a cast of another class attribute, "jobHistoryService", at
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L131.
> However, it does not need to be stored separately, because it is only ever used once in the class, and then only as an argument while instantiating the HistoryClientService class at
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L155.
> Therefore, we could just delete the lines at
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L67
> and
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L131
> completely and instantiate the HistoryClientService as follows:
> {code}
> @VisibleForTesting
> protected HistoryClientService createHistoryClientService() {
>   return new HistoryClientService((HistoryContext) jobHistoryService,
>       this.jhsDTSecretManager);
> }
> {code}
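[Editor's reference note] The refactor proposed in the description can be sketched with plain stand-in classes. The types below are simplified, hypothetical stand-ins for the real Hadoop classes, kept only to show the pattern: drop the redundant field and perform the cast at the single use site.

```java
// Hypothetical stand-in for Hadoop's HistoryContext interface.
interface HistoryContext {
    String contextName();
}

// Stand-in for the service object; it implements the context interface,
// which is why a plain cast is sufficient.
class JobHistoryService implements HistoryContext {
    public String contextName() {
        return "jobHistoryService";
    }
}

class JobHistoryServer {
    private final JobHistoryService jobHistoryService = new JobHistoryService();

    // Before the change, a separate 'historyContext' field stored
    // (HistoryContext) jobHistoryService. After the change, the cast happens
    // only here, at the one place the context is actually consumed.
    protected String createHistoryClientService() {
        return ((HistoryContext) jobHistoryService).contextName();
    }
}
```

The behavior is unchanged; the refactor only removes a field that duplicated state already held by `jobHistoryService`.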
[jira] [Assigned] (MAPREDUCE-7266) historyContext doesn't need to be a class attribute inside JobHistoryServer
[ https://issues.apache.org/jira/browse/MAPREDUCE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg reassigned MAPREDUCE-7266:
------------------------------------------------
    Component/s:     (was: yarn)
                 jobhistoryserver
            Key: MAPREDUCE-7266  (was: YARN-10075)
       Assignee:     (was: Siddharth Ahuja)
        Project: Hadoop Map/Reduce  (was: Hadoop YARN)
[jira] [Commented] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984279#comment-16984279 ]

Wilfred Spiegelenburg commented on MAPREDUCE-7249:
--------------------------------------------------
Thank you [~prabhujoseph] for the commit and [~pbacsko] for the review.

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7249
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster, mrv2
>    Affects Versions: 3.1.0
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Critical
>              Labels: Reviewed
>             Fix For: 3.3.0, 3.1.4, 3.2.2
>
>         Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch, MAPREDUCE-7249-branch-3.2.001.patch
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> 	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a heavily patched 2.6 release.
[jira] [Commented] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984109#comment-16984109 ]

Wilfred Spiegelenburg commented on MAPREDUCE-7249:
--------------------------------------------------
The ASF license warning is correct: there are files in the branch that should not be there. This is caused by the check-in for YARN-9011; filed YARN-9993 to remove the files.
[jira] [Commented] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984042#comment-16984042 ]

Wilfred Spiegelenburg commented on MAPREDUCE-7249:
--------------------------------------------------
Also attaching a patch for branch-3.2 and earlier. [^MAPREDUCE-7249-branch-3.2.001.patch] applies to both the 3.2 and 3.1 branches.
[jira] [Updated] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-7249:
---------------------------------------------
    Attachment: MAPREDUCE-7249-branch-3.2.001.patch
[jira] [Commented] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984034#comment-16984034 ] Wilfred Spiegelenburg commented on MAPREDUCE-7249: -- fixed the checkstyle issues in both files, the \{{TaskAttemptImpl.java}} change has become slightly larger as I fixed the whole block. > Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes > job failure > > > Key: MAPREDUCE-7249 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Critical > Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch > > > Same issue as in MAPREDUCE-7240 but this one has a different state in which > the Exception {{TA_TOO_MANY_FETCH_FAILURE}} event is received: > {code} > 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle > this event at current state for attempt_1568654141590_630203_m_003108_1 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388) > at > 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > {code} > The stack trace is from a CDH release which is a highly patched 2.6 release.
[jira] [Updated] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7249: - Attachment: MAPREDUCE-7249-002.patch > Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes > job failure > > > Key: MAPREDUCE-7249 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Critical > Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch > > > Same issue as in MAPREDUCE-7240 but this one has a different state in which > the Exception {{TA_TOO_MANY_FETCH_FAILURE}} event is received: > {code} > 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle > this event at current state for attempt_1568654141590_630203_m_003108_1 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182) > at > 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > {code} > The stack trace is from a CDH release which is a highly patched 2.6 release.
[jira] [Updated] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7249: - Attachment: MAPREDUCE-7249-001.patch Status: Patch Available (was: Open) patch for the issue > Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes > job failure > > > Key: MAPREDUCE-7249 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Critical > Attachments: MAPREDUCE-7249-001.patch > > > Same issue as in MAPREDUCE-7240 but this one has a different state in which > the Exception {{TA_TOO_MANY_FETCH_FAILURE}} event is received: > {code} > 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle > this event at current state for attempt_1568654141590_630203_m_003108_1 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380) > at > 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > {code} > The stack trace is from a CDH release which is a highly patched 2.6 release.
[jira] [Updated] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure
[ https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7249: - Summary: Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure (was: Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP cause job ) > Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes > job failure > > > Key: MAPREDUCE-7249 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, mrv2 >Affects Versions: 3.1.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Critical > > Same issue as in MAPREDUCE-7240 but this one has a different state in which > the Exception {{TA_TOO_MANY_FETCH_FAILURE}} event is received: > {code} > 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle > this event at current state for attempt_1568654141590_630203_m_003108_1 > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380) > at > 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > {code} > The stack trace is from a CDH release which is a highly patched 2.6 release.
[jira] [Created] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP cause job
Wilfred Spiegelenburg created MAPREDUCE-7249: Summary: Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP cause job Key: MAPREDUCE-7249 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2 Affects Versions: 3.1.0 Reporter: Wilfred Spiegelenburg Assignee: Wilfred Spiegelenburg Same issue as in MAPREDUCE-7240 but this one has a different state in which the {{TA_TOO_MANY_FETCH_FAILURE}} event is received: {code} 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this event at current state for attempt_1568654141590_630203_m_003108_1 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) {code} The stack trace is from a CDH release which is a highly patched 2.6 release. 
[jira] [Commented] (MAPREDUCE-7240) Exception ' Invalid event: TA_TOO_MANY_FETCH_FAILURE at SUCCESS_FINISHING_CONTAINER' cause job error
[ https://issues.apache.org/jira/browse/MAPREDUCE-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982981#comment-16982981 ] Wilfred Spiegelenburg commented on MAPREDUCE-7240: -- I checked the PRs that are linked to this jira. Jason gave a +1 on the trunk version in [PR #1674|https://github.com/apache/hadoop/pull/1674]. If your patch follows that change we should be good to go. +1 (non-binding). For the concern raised in this [comment|https://issues.apache.org/jira/browse/MAPREDUCE-7240?focusedCommentId=16982254&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16982254]: if the container ignores the newly raised event, the AM needs to handle that as per normal. The main issue in the current code is that, because it does not handle the fetch failure event, an {{InvalidStateTransitionException}} is raised, which causes the job to fail. After the change the event is handled and the job should continue and finish processing. The job can still fail as per normal, but a single too-many-fetch-failures event does not cause the job to fail immediately. > Exception ' Invalid event: TA_TOO_MANY_FETCH_FAILURE at > SUCCESS_FINISHING_CONTAINER' cause job error > > > Key: MAPREDUCE-7240 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7240 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.2 >Reporter: luhuachao >Assignee: luhuachao >Priority: Critical > Labels: kerberos > Attachments: MAPREDUCE-7240-001.patch, > application_1566552310686_260041.log > > > *log in appmaster* > {noformat} > 2019-09-03 17:18:43,090 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures > for output of task attempt: attempt_1566552310686_260041_m_52_0 ... 
> raising fetch failure to map > 2019-09-03 17:18:43,091 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures > for output of task attempt: attempt_1566552310686_260041_m_49_0 ... > raising fetch failure to map > 2019-09-03 17:18:43,091 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures > for output of task attempt: attempt_1566552310686_260041_m_51_0 ... > raising fetch failure to map > 2019-09-03 17:18:43,091 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures > for output of task attempt: attempt_1566552310686_260041_m_50_0 ... > raising fetch failure to map > 2019-09-03 17:18:43,091 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures > for output of task attempt: attempt_1566552310686_260041_m_53_0 ... > raising fetch failure to map > 2019-09-03 17:18:43,092 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1566552310686_260041_m_52_0 transitioned from state SUCCEEDED to > FAILED, event type is TA_TOO_MANY_FETCH_FAILURE and nodeId=yarn095:45454 > 2019-09-03 17:18:43,092 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle > this event at current state for attempt_1566552310686_260041_m_49_0 > org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: > TA_TOO_MANY_FETCH_FAILURE at SUCCESS_FINISHING_CONTAINER > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1206) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:146) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1458) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1450) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > 2019-09-03 17:18:43,093 ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle > this event at current state for attempt_1566552310686_260041_m
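The failure mode in the reports above — an event arriving in a state with no registered transition — can be illustrated with a minimal table-driven state machine. This is a sketch only: the class and method names below are hypothetical and do not reproduce Hadoop's actual {{StateMachineFactory}} API. It just shows why an unregistered event surfaces as an exception and why the fix amounts to registering a transition for the event in that state.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal table-driven state machine, loosely modelled on the idea behind
// Hadoop's StateMachineFactory. All names here are illustrative.
public class MiniStateMachine {
    private final Map<String, Map<String, String>> transitions = new HashMap<>();
    private String state;

    public MiniStateMachine(String initial) {
        this.state = initial;
    }

    // Register: while in 'from', event 'event' moves the machine to 'to'.
    public void addTransition(String from, String event, String to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(event, to);
    }

    // An event with no registered transition throws — this is what surfaces
    // as InvalidStateTransitonException in the AM logs quoted above.
    public String handle(String event) {
        String next = transitions.getOrDefault(state, Map.of()).get(event);
        if (next == null) {
            throw new IllegalStateException(
                "Invalid event: " + event + " at " + state);
        }
        state = next;
        return state;
    }

    public static void main(String[] args) {
        MiniStateMachine attempt = new MiniStateMachine("SUCCESS_CONTAINER_CLEANUP");
        // The patches discussed here effectively add a transition for
        // TA_TOO_MANY_FETCH_FAILURE instead of letting the dispatch throw.
        attempt.addTransition("SUCCESS_CONTAINER_CLEANUP",
            "TA_TOO_MANY_FETCH_FAILURE", "FAILED");
        System.out.println(attempt.handle("TA_TOO_MANY_FETCH_FAILURE")); // prints FAILED
    }
}
```

Without the registered transition the exception escapes the dispatcher and, as described above, takes the whole job down rather than just the one attempt.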
[jira] [Commented] (MAPREDUCE-7225) Fix broken current folder expansion during MR job start
[ https://issues.apache.org/jira/browse/MAPREDUCE-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897735#comment-16897735 ] Wilfred Spiegelenburg commented on MAPREDUCE-7225: -- Thank you [~pbacsko] for the patch. I looked over the change from patch v3 and the impact seems minimal. Options 2 and 3 are not really viable, as we have too many calls throughout the repository, and probably in projects outside Hadoop, that could break due to a change. +1 non-binding from me. > Fix broken current folder expansion during MR job start > --- > > Key: MAPREDUCE-7225 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7225 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.9.0, 3.0.3 >Reporter: Adam Antal >Assignee: Peter Bacsko >Priority: Major > Attachments: MAPREDUCE-7225-001.patch, MAPREDUCE-7225-002.patch, > MAPREDUCE-7225-002.patch, MAPREDUCE-7225-003.patch > > > Starting a sleep job giving "." as files that should be localized is working > fine up until 2.9.0, but after that the user is given an > IllegalArgumentException. This change is a side-effect of HADOOP-12747 where > {{GenericOptionsParser#validateFiles}} function got modified. > Can be reproduced by starting a sleep job with "-files ." given as extra > parameter. Log: > {noformat} > sudo -u hdfs hadoop jar hadoop-mapreduce-client-jobclient-3.0.0.jar sleep > -files . -m 1 -r 1 -rt 2000 -mt 2000 > WARNING: Use "yarn jar" to launch YARN applications. 
> 19/07/17 08:13:26 INFO client.ConfiguredRMFailoverProxyProvider: Failing over > to rm21 > 19/07/17 08:13:26 INFO mapreduce.JobResourceUploader: Disabling Erasure > Coding for path: /user/hdfs/.staging/job_1563349475208_0017 > 19/07/17 08:13:26 INFO mapreduce.JobSubmitter: Cleaning up the staging area > /user/hdfs/.staging/job_1563349475208_0017 > java.lang.IllegalArgumentException: Can not create a Path from an empty string > at org.apache.hadoop.fs.Path.checkPathArg(Path.java:168) > at org.apache.hadoop.fs.Path.(Path.java:180) > at org.apache.hadoop.fs.Path.(Path.java:125) > at > org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:686) > at > org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:262) > at > org.apache.hadoop.mapreduce.JobResourceUploader.uploadResourcesInternal(JobResourceUploader.java:203) > at > org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:131) > at > org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:99) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:194) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1726) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588) > at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) > at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) > at > org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:139) > at > org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:147) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:313) > at org.apache.hadoop.util.RunJar.main(RunJar.java:227) > {noformat}
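The stack trace above ends in a {{Path}} built from an empty string. As a hedged illustration — the expansion helper below is hypothetical and is not the actual Hadoop code; only the error message and the empty-string check mirror the trace — this shows how naively reducing {{"."}} to a name component can leave nothing for the {{Path}} constructor:

```java
// Illustrative sketch only (not the Hadoop fix): how expanding "-files ."
// can end in "Can not create a Path from an empty string".
public class EmptyPathDemo {
    // Stand-in for the empty-string check in org.apache.hadoop.fs.Path.
    static String checkPathArg(String path) {
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException(
                "Can not create a Path from an empty string");
        }
        return path;
    }

    // Hypothetical buggy expansion: take the last path component and treat
    // "." as "nothing follows the directory", leaving an empty string.
    static String lastComponent(String p) {
        String name = p.substring(p.lastIndexOf('/') + 1);
        return name.equals(".") ? "" : name;
    }

    public static void main(String[] args) {
        System.out.println(checkPathArg(lastComponent("dir/file.txt"))); // file.txt
        try {
            checkPathArg(lastComponent("."));     // "." collapses to ""
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());   // same message as in the log
        }
    }
}
```

A safer expansion resolves {{"."}} to an absolute directory before taking any name component, which is the general direction a validation fix has to take.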
[jira] [Commented] (MAPREDUCE-7196) FairScheduler queue ACLs not implemented for application actions
[ https://issues.apache.org/jira/browse/MAPREDUCE-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16802737#comment-16802737 ] Wilfred Spiegelenburg commented on MAPREDUCE-7196: -- Moved from the YARN project. The mapred-default.xml needs to be updated: the queue administrators that existed in MR1 times (JT & TT) do not exist anymore, and they get confused with the YARN queue admins, which are not related at all. > FairScheduler queue ACLs not implemented for application actions > > > Key: MAPREDUCE-7196 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7196 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: documentation >Reporter: Tristan Stevens >Priority: Major > > The mapred-site.xml options mapreduce.job.acl-modify-job and > mapreduce.job.acl-view-job both specify that queue ACLs should apply for read > and modify operations on a job, however according to > org.apache.hadoop.yarn.server.security.ApplicationACLsManager.java this > feature has not been implemented. > This is very important otherwise it is difficult to manage a cluster with a > complicated queue hierarchy without either putting everyone in the admin ACL, > getting many support tickets or asking people to remember to set > mapreduce.job.acl-modify-job and mapreduce.job.acl-view-job. > Extract from mapred-default.xml: > bq. Irrespective of this ACL configuration, (a) job-owner, (b) the user who > started the cluster, (c) members of an admin configured supergroup configured > via mapreduce.cluster.permissions.supergroup and *(d) queue administrators of > the queue to which this job was submitted* to configured via > acl-administer-jobs for the specific queue in mapred-queues.xml can do all > the view operations on a job.
[jira] [Moved] (MAPREDUCE-7196) FairScheduler queue ACLs not implemented for application actions
[ https://issues.apache.org/jira/browse/MAPREDUCE-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg moved YARN-8026 to MAPREDUCE-7196: Component/s: (was: fairscheduler) documentation Issue Type: Improvement (was: Bug) Key: MAPREDUCE-7196 (was: YARN-8026) Project: Hadoop Map/Reduce (was: Hadoop YARN) > FairScheduler queue ACLs not implemented for application actions > > > Key: MAPREDUCE-7196 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7196 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: documentation >Reporter: Tristan Stevens >Priority: Major > > The mapred-site.xml options mapreduce.job.acl-modify-job and > mapreduce.job.acl-view-job both specify that queue ACLs should apply for read > and modify operations on a job, however according to > org.apache.hadoop.yarn.server.security.ApplicationACLsManager.java this > feature has not been implemented. > This is very important otherwise it is difficult to manage a cluster with a > complicated queue hierarchy without either putting everyone in the admin ACL, > getting many support tickets or asking people to remember to set > mapreduce.job.acl-modify-job and mapreduce.job.acl-view-job. > Extract from mapred-default.xml: > bq. Irrespective of this ACL configuration, (a) job-owner, (b) the user who > started the cluster, (c) members of an admin configured supergroup configured > via mapreduce.cluster.permissions.supergroup and *(d) queue administrators of > the queue to which this job was submitted* to configured via > acl-administer-jobs for the specific queue in mapred-queues.xml can do all > the view operations on a job.
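For reference, the two job-level ACL properties discussed in this issue take a "users, then groups" value (comma-separated users, a space, then comma-separated groups; a leading space means groups only). The property names below are the real mapred-site.xml keys; the user and group names are made-up examples:

```xml
<!-- Illustrative values only: restrict who can view or modify a job.
     Set in mapred-site.xml or per job at submission time. -->
<property>
  <name>mapreduce.job.acl-view-job</name>
  <value>alice,bob analysts</value>   <!-- users alice,bob plus group analysts -->
</property>
<property>
  <name>mapreduce.job.acl-modify-job</name>
  <value> ops</value>                 <!-- no users, group "ops" only -->
</property>
```

As the issue points out, whether queue administrators are additionally honoured for these actions depends on {{ApplicationACLsManager}}, which this report says does not implement that part.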
[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784121#comment-16784121 ] Wilfred Spiegelenburg commented on MAPREDUCE-7180: -- What I meant is that if an application fails, we re-run the application. Any finished tasks are OK: they are recovered; running tasks are killed and restarted. If they had failed once or more during the first attempt and we relaunched them with larger heaps, we start the process of increasing the containers again from scratch, wasting more resources. I think what Daniel proposed is the simplest, most elegant solution. If we have a task that fails due to exceeding the container, we should fail the application and let the end user and/or admin sort it out. Even for an Oozie workflow, or in the Hive case running jobs through beeline, you can set the size of the container etc. via the command line. I think finding the cause is not that difficult, but as part of the change to fail the application we could make it really clear in the diagnostics of the application what failed and which action to take. The message for the container exceeding the settings has already been extended via YARN-7580 and should be clearer in 3.1 and later. > Relaunching Failed Containers > - > > Key: MAPREDUCE-7180 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mrv1, mrv2 >Reporter: David Mollitor >Priority: Major > > In my experience, it is very common that a MR job completely fails because a > single Mapper/Reducer container is using more memory than has been reserved > in YARN. The following message is logging the the MapReduce > ApplicationMaster: > {code} > Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] > is running beyond physical memory limits. > Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual > memory used. Killing container. 
> {code} > In this case, the container is re-launched on another node, and of course, it > is killed again for the same reason. This process happens three (maybe > four?) times before the entire MapReduce job fails. It's often said that the > definition of insanity is doing the same thing over and over and expecting > different results. > For all intents and purposes, the amount of resources requested by Mappers > and Reducers is a fixed amount; based on the default configuration values. > Users can set the memory on a per-job basis, but it's a pain, not exact, and > requires intimate knowledge of the MapReduce framework and its memory usage > patterns. > I propose that if the MR ApplicationMaster detects that a container is killed > because of this specific memory resource constraint, that it requests a > larger container for the subsequent task attempt. > For example, increase the requested memory size by 50% each time the > container fails and the task is retried. This will prevent many Job failures > and allow for additional memory tuning, per-Job, after the fact, to get > better performance (v.s. fail/succeed).
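The 50% escalation proposed in this issue compounds per attempt. As a quick sketch of what such a policy would request (a hypothetical helper, not MR code — the proposal was never implemented, and any real version would be capped by the scheduler's maximum allocation):

```java
// Sketch of the proposed (not implemented) escalation policy: grow the
// container request by 50% on each failed attempt, capped at a maximum.
public class EscalationSketch {
    static long nextRequestMb(long baseMb, int failedAttempts, long maxMb) {
        double request = baseMb * Math.pow(1.5, failedAttempts);
        return Math.min((long) request, maxMb);
    }

    public static void main(String[] args) {
        // A 1024 MB mapper retried 3 times under an 8192 MB cap:
        for (int attempt = 0; attempt <= 3; attempt++) {
            System.out.println(nextRequestMb(1024, attempt, 8192));
        }
        // 1024, 1536, 2304, 3456
    }
}
```

The numbers make the objection in the comments concrete: by the usual fourth-and-final attempt the request has more than tripled, and if many tasks fail the same way, the cluster pays that escalation for each of them.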
[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782937#comment-16782937 ] Wilfred Spiegelenburg commented on MAPREDUCE-7180: -- The 80/20 case, as Daniel said, will not work for all cases but it handles almost all use cases. The headroom ratio is configurable, which means that if you know you have a high overhead due to the type of code you run, you can set it cluster wide. I would be in favour of not wasting resources and failing the application when the JVM goes OOM for one or more tasks. The re-run with adjusted settings has more drawbacks than advantages, I think. The main reason I am not in favour of the auto retries is the hiding of possible issues and not providing a guarantee that it will work. There is a good chance that when one mapper or reducer fails due to memory issues, there are more mappers or reducers that will fail in the same way. Multiple tasks failing increases the overhead on the cluster, like Jim mentioned in his example. With data growing, or small code changes in the app, MR framework or JVM over time, you could be putting a lot of extra strain on a cluster. What if the application still fails due to task failures: how do we handle an application re-run? Won't that start from scratch again and thus waste more resources? > Relaunching Failed Containers > - > > Key: MAPREDUCE-7180 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mrv1, mrv2 >Reporter: BELUGA BEHR >Priority: Major > > In my experience, it is very common that a MR job completely fails because a > single Mapper/Reducer container is using more memory than has been reserved > in YARN. The following message is logging the the MapReduce > ApplicationMaster: > {code} > Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] > is running beyond physical memory limits. 
> Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual > memory used. Killing container. > {code} > In this case, the container is re-launched on another node, and of course, it > is killed again for the same reason. This process happens three (maybe > four?) times before the entire MapReduce job fails. It's often said that the > definition of insanity is doing the same thing over and over and expecting > different results. > For all intents and purposes, the amount of resources requested by Mappers > and Reducers is a fixed amount; based on the default configuration values. > Users can set the memory on a per-job basis, but it's a pain, not exact, and > requires intimate knowledge of the MapReduce framework and its memory usage > patterns. > I propose that if the MR ApplicationMaster detects that a container is killed > because of this specific memory resource constraint, that it requests a > larger container for the subsequent task attempt. > For example, increase the requested memory size by 50% each time the > container fails and the task is retried. This will prevent many Job failures > and allow for additional memory tuning, per-Job, after the fact, to get > better performance (v.s. fail/succeed).
[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761403#comment-16761403 ] Wilfred Spiegelenburg commented on MAPREDUCE-7180: -- When you use MAPREDUCE-5785 you should not see the type of failures that you are trying to prevent. The heap and its overhead should always fit in the container unless you have some special off-heap case. You should thus only expect to see them for 3rd-party library and/or off-heap issues. What you are trying to implement is really only relevant for edge cases like the misconfiguration, which you state is not really the goal, as the job will still fail. That is why I think adding all this to hide a misconfiguration is the wrong thing to do. > Relaunching Failed Containers > - > > Key: MAPREDUCE-7180 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mrv1, mrv2 >Reporter: BELUGA BEHR >Priority: Major > > In my experience, it is very common that a MR job completely fails because a > single Mapper/Reducer container is using more memory than has been reserved > in YARN. The following message is logging the the MapReduce > ApplicationMaster: > {code} > Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] > is running beyond physical memory limits. > Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual > memory used. Killing container. > {code} > In this case, the container is re-launched on another node, and of course, it > is killed again for the same reason. This process happens three (maybe > four?) times before the entire MapReduce job fails. It's often said that the > definition of insanity is doing the same thing over and over and expecting > different results. > For all intents and purposes, the amount of resources requested by Mappers > and Reducers is a fixed amount; based on the default configuration values. 
> Users can set the memory on a per-job basis, but it's a pain, not exact, and > requires intimate knowledge of the MapReduce framework and its memory usage > patterns. > I propose that if the MR ApplicationMaster detects that a container is killed > because of this specific memory resource constraint, that it requests a > larger container for the subsequent task attempt. > For example, increase the requested memory size by 50% each time the > container fails and the task is retried. This will prevent many Job failures > and allow for additional memory tuning, per-Job, after the fact, to get > better performance (v.s. fail/succeed). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
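The escalation the reporter proposes (grow the request by 50% on each memory-related failure) is a simple geometric policy. A minimal sketch of that idea follows; the class and method names are hypothetical, and this is not the MR ApplicationMaster API:

```java
// Sketch of the proposed retry escalation: grow the container request by a
// fixed factor on every attempt that was killed for exceeding its memory
// limit. All names here are invented for illustration.
public class ContainerRetrySizer {

    // Memory to request for the given attempt number (0 = first attempt),
    // growing by `factor` after each memory-related failure.
    static long requestMb(long baseMb, int attempt, double factor) {
        double mb = baseMb;
        for (int i = 0; i < attempt; i++) {
            mb *= factor;
        }
        return (long) Math.ceil(mb);
    }

    public static void main(String[] args) {
        // A 1 GB request grown by 50% per failed attempt: 1024, 1536, 2304 MB.
        for (int attempt = 0; attempt < 3; attempt++) {
            System.out.println("attempt " + attempt + ": "
                    + requestMb(1024, attempt, 1.5) + " MB");
        }
    }
}
```

Note that with the default of three or four attempts this caps the growth at roughly 2.3x–3.4x of the original request, which is the crux of the objection in the follow-up comment: an oversized heap can outrun any such multiplier.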
[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760412#comment-16760412 ] Wilfred Spiegelenburg commented on MAPREDUCE-7180: -- I also have some reservations about simply growing the container on a failure. Letting the application fail is the best way to get the job reviewed and configured correctly. For a properly configured job we should see the GC kick in well before we run over the size of the container. If your default settings do not take care of that, you are not managing the cluster correctly. In MAPREDUCE-5785 we introduced the automatic calculation of the heap size based on the container size and vice versa. If you use that control, you should never get into this situation. What happens when the application relies on that calculation for the heap and/or container size and still fails? How are you going to handle that case if the container fails with the same message? Are you going to also change the configured heap-to-container ratio? That case could be caused by the mapper or reducer using more off-heap memory (a third-party library, for example). How is that going to work with this automatic re-run? Another point to consider is that I can always run over the container by setting an overly large heap. As an example: I know my job can run in a 1GB heap as I have tried it. I now set 10GB as the heap as a test. GCs will not kick in as the heap is not really full and will just keep growing well above 1GB. If I configured the job to run in a 2GB container, the overly large heap would cause it to fail. It might even fail when I make the container 4GB or 8GB. Just doubling and re-running is going to be problematic. Using the available configuration and the smarts that are built in is a far better solution. 
> Relaunching Failed Containers > - > > Key: MAPREDUCE-7180 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mrv1, mrv2 >Reporter: BELUGA BEHR >Priority: Major > > In my experience, it is very common that a MR job completely fails because a > single Mapper/Reducer container is using more memory than has been reserved > in YARN. The following message is logging the the MapReduce > ApplicationMaster: > {code} > Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] > is running beyond physical memory limits. > Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual > memory used. Killing container. > {code} > In this case, the container is re-launched on another node, and of course, it > is killed again for the same reason. This process happens three (maybe > four?) times before the entire MapReduce job fails. It's often said that the > definition of insanity is doing the same thing over and over and expecting > different results. > For all intents and purposes, the amount of resources requested by Mappers > and Reducers is a fixed amount; based on the default configuration values. > Users can set the memory on a per-job basis, but it's a pain, not exact, and > requires intimate knowledge of the MapReduce framework and its memory usage > patterns. > I propose that if the MR ApplicationMaster detects that a container is killed > because of this specific memory resource constraint, that it requests a > larger container for the subsequent task attempt. > For example, increase the requested memory size by 50% each time the > container fails and the task is retried. This will prevent many Job failures > and allow for additional memory tuning, per-Job, after the fact, to get > better performance (v.s. fail/succeed). 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
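The automatic calculation referenced above (MAPREDUCE-5785) derives the task heap from the container size, or the container size from the heap, via a configured ratio. The sketch below shows only that relationship; the 0.8 default ratio, method names, and rounding are illustrative assumptions, not the actual Hadoop source:

```java
// Illustrative sketch of deriving a task's JVM heap (-Xmx) from its container
// size, and vice versa, via a configured heap-to-container ratio, in the
// spirit of MAPREDUCE-5785. The 0.8 value and rounding are assumptions.
public class HeapSizer {
    static final double DEFAULT_RATIO = 0.8;

    // Heap to use when only the container size is configured.
    static long heapFromContainer(long containerMb, double ratio) {
        return (long) (containerMb * ratio);
    }

    // Container size to request when only the heap is configured; the
    // remainder is headroom for JVM overhead and off-heap usage.
    static long containerFromHeap(long heapMb, double ratio) {
        return (long) Math.ceil(heapMb / ratio);
    }

    public static void main(String[] args) {
        System.out.println(heapFromContainer(2048, DEFAULT_RATIO)); // a 2 GB container leaves ~410 MB headroom
        System.out.println(containerFromHeap(1638, DEFAULT_RATIO)); // round-trips back to 2048 MB
    }
}
```

The point of the comment above is that this headroom normally absorbs JVM overhead, so when it is used and the container is still exceeded, the cause is off-heap usage that no automatic heap/container multiplier can reason about.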
[jira] [Commented] (MAPREDUCE-7159) FrameworkUploader: ensure proper permissions of generated framework tar.gz if restrictive umask is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710874#comment-16710874 ] Wilfred Spiegelenburg commented on MAPREDUCE-7159: -- Patch looks good to me +1 (non binding) Thank you for the fix. > FrameworkUploader: ensure proper permissions of generated framework tar.gz if > restrictive umask is used > --- > > Key: MAPREDUCE-7159 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7159 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 3.1.1 >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: MAPREDUCE-7159-001.patch, MAPREDUCE-7159-002.patch, > MAPREDUCE-7159-003.patch, MAPREDUCE-7159-004.patch, MAPREDUCE-7159-005.patch, > MAPREDUCE-7159-006.patch > > > Using certain umask values (like 027) makes files unreadable to "others". > This causes problems if the FrameworkUploader > (https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-uploader/src/main/java/org/apache/hadoop/mapred/uploader/FrameworkUploader.java) > is used - it's necessary that the compressed MR framework is readable by all > users, otherwise they won't be able to run MR jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7159) FrameworkUploader: ensure proper permissions of generated framework tar.gz if restrictive umask is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709538#comment-16709538 ] Wilfred Spiegelenburg commented on MAPREDUCE-7159: -- Thank you for the updated patch [~pbacsko]. I agree it is probably not a good idea to fix something that was created externally; we might not even be allowed to change the permissions. Instead of fixing the broken permissions we should abort the upload, or log a WARN/ERROR message that the permissions are broken. The uploaded framework is not usable, and from an end-user perspective that is not correct. If we know it is broken we should report it back to the user. > FrameworkUploader: ensure proper permissions of generated framework tar.gz if > restrictive umask is used > --- > > Key: MAPREDUCE-7159 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7159 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 3.1.1 >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: MAPREDUCE-7159-001.patch, MAPREDUCE-7159-002.patch, > MAPREDUCE-7159-003.patch, MAPREDUCE-7159-004.patch, MAPREDUCE-7159-005.patch > > > Using certain umask values (like 027) makes files unreadable to "others". > This causes problems if the FrameworkUploader > (https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-uploader/src/main/java/org/apache/hadoop/mapred/uploader/FrameworkUploader.java) > is used - it's necessary that the compressed MR framework is readable by all > users, otherwise they won't be able to run MR jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7159) FrameworkUploader: ensure proper permissions of generated framework tar.gz if restrictive umask is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707882#comment-16707882 ] Wilfred Spiegelenburg commented on MAPREDUCE-7159: -- Thank you [~pbacsko]. I have two questions: # You currently have fixed just the distributed filesystem. Does this same issue not happen if we do not have a distributed file system and directly create the stream? # In this case there are restrictive settings on the file itself; what if there are restrictive settings on the path? That case does not seem to be handled at all, as the only thing we check in {{validateTargetPath}} is the start of the URI. We need at least traversal rights on the whole path. > FrameworkUploader: ensure proper permissions of generated framework tar.gz if > restrictive umask is used > --- > > Key: MAPREDUCE-7159 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7159 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 3.1.1 >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: MAPREDUCE-7159-001.patch, MAPREDUCE-7159-002.patch, > MAPREDUCE-7159-003.patch > > > Using certain umask values (like 027) makes files unreadable to "others". > This causes problems if the FrameworkUploader > (https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-uploader/src/main/java/org/apache/hadoop/mapred/uploader/FrameworkUploader.java) > is used - it's necessary that the compressed MR framework is readable by all > users, otherwise they won't be able to run MR jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
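The umask problem the issue describes is plain bit arithmetic: a 027 umask clears group-write and all "other" bits from the default creation mode, so the generated tar.gz ends up unreadable by other users. A small sketch of that arithmetic (the method name is invented for the example):

```java
// Why a 027 umask breaks the uploaded framework archive: the umask bits are
// cleared from the default creation mode, which removes all "other" access.
public class UmaskDemo {

    // Effective mode of a newly created file: default mode minus umask bits.
    static int effectiveMode(int defaultMode, int umask) {
        return defaultMode & ~umask;
    }

    public static void main(String[] args) {
        int mode = effectiveMode(0666, 027);      // files are typically created with 0666
        System.out.printf("%o%n", mode);          // 640: group read-only, no "other" bits
        boolean othersCanRead = (mode & 04) != 0; // the framework tarball must be world-readable
        System.out.println(othersCanRead);        // false: other users' MR jobs cannot read it
    }
}
```

The same reasoning applies to the second question in the comment: every directory on the path also needs its "other" execute (traversal) bit, which a restrictive umask removes as well.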
[jira] [Commented] (MAPREDUCE-7094) LocalDistributedCacheManager leaves classloaders open, which leaks FDs
[ https://issues.apache.org/jira/browse/MAPREDUCE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471401#comment-16471401 ] Wilfred Spiegelenburg commented on MAPREDUCE-7094: -- Thank you [~szita] patch is looking good +1 (non binding) > LocalDistributedCacheManager leaves classloaders open, which leaks FDs > -- > > Key: MAPREDUCE-7094 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7094 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: MAPREDUCE-7094.0.patch, MAPREDUCE-7094.1.patch, > MAPREDUCE-7094.2.patch > > > When a user starts a local mapred task from Hive's beeline, it will leave > open file descriptors on the HS2 process (which runs the mapred task). > I debugged this and saw that it is caused by LocalDistributedCacheManager > class, which creates a new URLClassLoader, with a classpath for the two jars > seen below. Somewhere down the line Loaders will be created in this > URLClassLoader for these files effectively creating the FD's on the OS level. 
> This is never cleaned up after execution, although > LocalDistributedCacheManager removes the files, it will not close the > ClassLoader, so FDs are left open although they point to deleted files at > that time: > {code:java} > [root@host-1 ~]# lsof -p 14439 | grep hadoop-hive > java 14439 hive DEL REG 8,1 3348748 > /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar > java 14439 hive DEL REG 8,1 3348750 > /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar > java 14439 hive 649r REG 8,1 8112438 3348750 > /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar > (deleted) > java 14439 hive 650r REG 8,1 8112438 3348748 > /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar (deleted) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7094) LocalDistributedCacheManager leaves classloaders open, which leaks FDs
[ https://issues.apache.org/jira/browse/MAPREDUCE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469771#comment-16469771 ] Wilfred Spiegelenburg commented on MAPREDUCE-7094: -- Based on the code path there is always only one class loader created and active. So why do we need to keep track of multiple and can we not just close the one open class loader correctly in the close method? That would also fix your findbugs issue. There is only one call to the {{makeClassLoader}} method which is not inside a loop in {{LocalJobRunner}}. There is also no loop in {{LocalDistributedCacheManager}} to create multiple loaders at the same time. > LocalDistributedCacheManager leaves classloaders open, which leaks FDs > -- > > Key: MAPREDUCE-7094 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7094 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Adam Szita >Priority: Major > Attachments: MAPREDUCE-7094.0.patch, MAPREDUCE-7094.1.patch > > > When a user starts a local mapred task from Hive's beeline, it will leave > open file descriptors on the HS2 process (which runs the mapred task). > I debugged this and saw that it is caused by LocalDistributedCacheManager > class, which creates a new URLClassLoader, with a classpath for the two jars > seen below. Somewhere down the line Loaders will be created in this > URLClassLoader for these files effectively creating the FD's on the OS level. 
> This is never cleaned up after execution, although > LocalDistributedCacheManager removes the files, it will not close the > ClassLoader, so FDs are left open although they point to deleted files at > that time: > {code:java} > [root@host-1 ~]# lsof -p 14439 | grep hadoop-hive > java 14439 hive DEL REG 8,1 3348748 > /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar > java 14439 hive DEL REG 8,1 3348750 > /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar > java 14439 hive 649r REG 8,1 8112438 3348750 > /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar > (deleted) > java 14439 hive 650r REG 8,1 8112438 3348748 > /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar (deleted) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
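Since Java 7, `URLClassLoader` implements `Closeable`, so the fix the reporter needs amounts to closing the one loader when the local job finishes; until then the loader pins open file descriptors for the jars on its classpath, even after the files are deleted. A minimal sketch of the leak and the fix (this is illustrative, not the LocalDistributedCacheManager code):

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Sketch: a URLClassLoader keeps jar file handles open until close() is
// called. Scoping the loader with try-with-resources releases the FDs even
// if the task fails, which is the behavior the patch needs to ensure.
public class LoaderCleanup {

    // Runs "the task" with a classloader that is guaranteed to be closed.
    // Returns true if the loader saw the expected classpath.
    static boolean runWithLoader(URL[] classpath) {
        try (URLClassLoader loader = new URLClassLoader(classpath)) {
            // loader.loadClass(...) would happen here for the local task;
            // any jar handles the loader opens live only inside this block.
            return loader.getURLs().length == classpath.length;
        } catch (IOException e) {
            return false; // close() failed; FDs may still be held
        }
    }

    public static void main(String[] args) {
        // Empty classpath for the demo; the real case would pass the
        // localized jar URLs from the distributed cache.
        System.out.println(runWithLoader(new URL[0]));
    }
}
```

This also matches the review comment above: since only one loader is ever created, a single close in the manager's close method suffices, with no need to track a collection of loaders.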
[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455808#comment-16455808 ] Wilfred Spiegelenburg commented on MAPREDUCE-7072: -- No idea what I did yesterday but the diff was not correct and did not correspond to what I had locally. Fixed it up: compile pass, tests pass > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072-branch-2.02.patch, > MAPREDUCE-7072-branch-2.03.patch, MAPREDUCE-7072.01.patch, > MAPREDUCE-7072.02.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7072: - Attachment: MAPREDUCE-7072-branch-2.03.patch > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072-branch-2.02.patch, > MAPREDUCE-7072-branch-2.03.patch, MAPREDUCE-7072.01.patch, > MAPREDUCE-7072.02.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453412#comment-16453412 ] Wilfred Spiegelenburg commented on MAPREDUCE-7072: -- Branch 2 patch attached, compiles with java 7 after removing two fields in the job info creation > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072-branch-2.02.patch, > MAPREDUCE-7072.01.patch, MAPREDUCE-7072.02.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7072: - Attachment: MAPREDUCE-7072-branch-2.02.patch > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072-branch-2.02.patch, > MAPREDUCE-7072.01.patch, MAPREDUCE-7072.02.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449915#comment-16449915 ] Wilfred Spiegelenburg commented on MAPREDUCE-7072: -- updated the patch with the review comments, I completely overlooked this far more elegant solution. > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072.01.patch, MAPREDUCE-7072.02.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7072: - Attachment: MAPREDUCE-7072.02.patch > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072.01.patch, MAPREDUCE-7072.02.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7072: - Status: Patch Available (was: Open) patch including tests: the json and human printer have both been changed to ignore the deprecated counters and keep code in sync > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072.01.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-7072: - Attachment: MAPREDUCE-7072.01.patch > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: MAPREDUCE-7072.01.patch > > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. > mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
[ https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425564#comment-16425564 ] Wilfred Spiegelenburg commented on MAPREDUCE-7072: -- The root cause of the issue is located in the {{AbstractCounters}} method {{getGroupNames()}}. When you trace through the code in the debugger, the number of counter groups returned is higher than expected. This is because we add the deprecated counter names to the list of counter group names before we return. The display names of the counters tracked in the deprecated list, stored in the legacyMap, are the same as the display names of the non-deprecated counters. The deprecated counters added are thus already in the non-deprecated list, which causes the duplication. It works in the JSON format because it internally uses a HashMap. The HashMap uses the name of the counter group as the key; the keys clash and we thus overwrite the existing value with the deprecated entry's value. To track where this issue is coming from: MAPREDUCE-4053 changed the iteration to work for Oozie and seems related to OOZIE-777 and the HadoopELFunctions, which still seem to use the deprecated counter name. Changing what the method returns is thus not possible without breaking Oozie. We can use the iterator returned by the abstract counters, as it does not include the deprecated names. > mapred job -history prints duplicate counter in human output > > > Key: MAPREDUCE-7072 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 3.0.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg >Priority: Major > > 'mapred job -history' command prints duplicate entries for counters only for > the human output format. It does not do this for the JSON format. 
> mapred job -history /user/history/somefile.jhist -format human > {code} > > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > ... > |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
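The duplication mechanism described in the root-cause comment can be modeled without any Hadoop code: appending the legacy aliases' display names to a List duplicates entries, while keying a map by group name silently collapses them, which is why only the human printer showed the problem. This is a toy model with invented group names, not the AbstractCounters implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the getGroupNames() duplication: the legacy (deprecated)
// groups carry the same display names as the real groups, so a list gains
// duplicates while a HashMap keyed by name overwrites them.
public class GroupNamesDemo {

    // What the human printer effectively iterated: real groups plus legacy aliases.
    static List<String> humanView(List<String> real, Collection<String> legacyDisplayNames) {
        List<String> all = new ArrayList<>(real);
        all.addAll(legacyDisplayNames); // duplicates slip in here
        return all;
    }

    // What the JSON printer effectively built: a map keyed by group name.
    static Map<String, String> jsonView(List<String> real, Collection<String> legacyDisplayNames) {
        Map<String, String> byName = new HashMap<>();
        for (String g : real) byName.put(g, g);
        for (String g : legacyDisplayNames) byName.put(g, g); // same key: overwritten, not duplicated
        return byName;
    }

    public static void main(String[] args) {
        List<String> real = Arrays.asList("Job Counters", "File System Counters");
        List<String> legacy = Arrays.asList("Job Counters"); // deprecated alias, same display name

        System.out.println(humanView(real, legacy));       // "Job Counters" appears twice
        System.out.println(jsonView(real, legacy).size()); // the map hides the duplicate
    }
}
```

The committed fix follows the same logic from the other direction: iterate the counters directly (which excludes the deprecated names) instead of deduplicating afterwards, so both printers stay in sync.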
[jira] [Created] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output
Wilfred Spiegelenburg created MAPREDUCE-7072: Summary: mapred job -history prints duplicate counter in human output Key: MAPREDUCE-7072 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 3.0.0 Reporter: Wilfred Spiegelenburg Assignee: Wilfred Spiegelenburg 'mapred job -history' command prints duplicate entries for counters only for the human output format. It does not do this for the JSON format. mapred job -history /user/history/somefile.jhist -format human {code} |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 ... |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7028) Concurrent task progress updates causing NPE in Application Master
[ https://issues.apache.org/jira/browse/MAPREDUCE-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305929#comment-16305929 ] Wilfred Spiegelenburg commented on MAPREDUCE-7028: -- I logged YARN-7689 to fix the TestRMContainerAllocator failures because I ran into it while looking at something else. > Concurrent task progress updates causing NPE in Application Master > -- > > Key: MAPREDUCE-7028 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7028 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6 >Reporter: Gergo Repas >Assignee: Gergo Repas > Attachments: MAPREDUCE-7028.000.patch, MAPREDUCE-7028.001.patch > > > Concurrent task progress updates can cause a NullPointerException in the > Application Master (stack trace is with code at current trunk): > {quote} > 2017-12-20 06:49:42,369 INFO [IPC Server handler 9 on 39501] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1513780867907_0001_m_02_0 is : 0.02677883 > 2017-12-20 06:49:42,369 INFO [IPC Server handler 13 on 39501] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1513780867907_0001_m_02_0 is : 0.02677883 > 2017-12-20 06:49:42,383 FATAL [AsyncDispatcher event handler] > org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$StatusUpdater.transition(TaskAttemptImpl.java:2450) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$StatusUpdater.transition(TaskAttemptImpl.java:2433) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > 
org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1362) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:154) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1543) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1535) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126) > at java.lang.Thread.run(Thread.java:748) > 2017-12-20 06:49:42,385 INFO [IPC Server handler 13 on 39501] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1513780867907_0001_m_02_0 is : 0.02677883 > 2017-12-20 06:49:42,386 INFO [AsyncDispatcher ShutDown handler] > org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.. > {quote} > This happened naturally in several big wordcount runs, and I could reproduce > this reliably by artificially making task updates more frequent. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6447) reduce shuffle throws "java.lang.OutOfMemoryError: Java heap space"
[ https://issues.apache.org/jira/browse/MAPREDUCE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148409#comment-16148409 ] Wilfred Spiegelenburg commented on MAPREDUCE-6447: -- [~prateekrungta] and [~shuzhangyao]: I have seen this same issue a number of times and people keep referring to this open MR issue. I have dug into this and found that there is nothing wrong with the calculation and there is no need to change the way we handle this in the code. No method devised for the internal calculation can guarantee that you do not get an OOM error. In all cases I have run into, I have been able to fix it with a configuration change in MR and the JVM. Let me explain what I have found and why the issue will not be solved by changing the internal calculations. When the JVM threw an OOM for a reducer, I collected heap dumps and looked at what was allocated at the point in time the OOM was thrown. In most cases the OOM was not thrown because the total heap was used up. As an example: the JVM heap for the reducer was set to 9216MB (9GB), yet the heap dump showed only 5896MB of heap usage. Looking at the usage of the heap, it showed that the shuffle input buffer usage was well within its limits. We then tried to lower {{mapreduce.reduce.shuffle.input.buffer.percent}} from the default 0.9 to 0.6 and found that it did not solve the issue. There was still an OOM around the same point with approximately the same heap usage. Lowering it further to 0.4 allowed the job to finish, but we saw that the JVM never peaked above about 60% of the assigned heap. This causes a lot of waste on the cluster and is thus not a solution we could accept. Further checks of the GC logging showed that, for each of the OOM cases, all heap usage was in the old generation. That explains both the OOM and the heap dump: the reducer had run out of space in the old generation, not the total heap. 
Within the heap the old generation can take about 2/3 of the total heap. This is based on the default settings for the generation sizing in the heap [1]. The question then became: what caused the JVM to run out of old generation space while barely using its young generation? The logging from the reducer explains this. The reducer logs showed that it was trying to allocate a large shuffle response, in my case about 1.9GB. Even though this is a large shuffle response, it was within all the limits. The JVM often allocates large objects directly in the old generation instead of in the young generation. This behaviour can cause an OOM error in the reducer while not using the full heap: it simply runs out of old generation. Back to the calculations. We load all the shuffle data into the buffer, but we cap a single shuffle response at 25% of the total buffer. This is the in-memory merge limit. If the shuffle response is larger than 25% of the buffer size, we do not store it in the buffer but merge it directly to disk. A shuffle response is only accepted and downloaded if we can fit it in memory or if it goes straight to disk. The check and increase of the buffer usage happens before we start the download. Locking makes sure only one thread does this at a time; the number of parallel copies is thus not important. Limiting could lead to a deadlock, as explained in the comments in the code. Since we need to prevent deadlocks, we allow one shuffle (one thread) to go over that limit. If we did not allow this, we could deadlock the reducer. The reducer would be in a state in which it cannot download new data to reduce. There would also never be a trigger that would cause the data in memory to be merged/spilled, so the buffer would stay as full as it is. 
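The 2/3 figure above follows from HotSpot's default generation sizing. A small sketch of that arithmetic (the NewRatio semantics assumed here are the standard HotSpot old/young ratio; the heap sizes are the ones from this thread's example):

```python
# Illustrative split of a JVM heap into young and old generations
# based on HotSpot's -XX:NewRatio (the old/young ratio). With the
# common default NewRatio=2, the old generation gets 2/3 of the heap.

def old_gen_bytes(heap_bytes, new_ratio=2):
    # young gen = heap / (new_ratio + 1); old gen = the remainder
    return heap_bytes - heap_bytes // (new_ratio + 1)

GB = 1024 ** 3
print(old_gen_bytes(9 * GB) / GB)     # 6.0 -> a 9GB heap leaves 6GB of old gen
print(old_gen_bytes(9 * GB, 4) / GB)  # just under 7.2 with NewRatio raised to 4
```

So with the default sizing, a 9GB reducer heap can absorb at most about 6GB of old-generation allocations before an OOM, regardless of how much young generation is free.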
Based on all that, the maximum size of all the data we could store in the shuffle buffer would be: {code} mapreduce.reduce.shuffle.input.buffer.percent = buffer% = 70% mapreduce.reduce.shuffle.memory.limit.percent = limit% = 25% heap size = 9GB maximum used memory = ((buffer% * (1 + limit%)) * heap size) - 1 byte {code} If that buffer does not fit in the old generation, we could throw an OOM error without really running out of memory. This is especially true when the individual shuffle sizes are large but do not hit the in-memory limit. Everything is still properly calculated and limited. We also do not unknowingly use more than the configured buffer size; if we go over, we know exactly by how much. We worked around the problem without increasing the size of the heap by changing the generations. The old generation inside the heap was enlarged by increasing the "NewRatio" setting from 2 (default) to 4. We also changed the "input.buffer.percent" setting to 65%. That worked in our case with 9GB as the maximum heap for the reducer. Different heap sizes combined with a di
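The bound in the {code} block above can be evaluated directly. A quick sketch (the 70%/25%/9GB values come from the example in the comment; the 6GB old-generation figure assumes the default NewRatio of 2):

```python
# Evaluate the maximum shuffle buffer usage formula from the comment:
# one in-flight fetch is allowed past the limit to avoid the deadlock
# described above, so the bound is buffer% * (1 + limit%) * heap - 1 byte.

def max_shuffle_buffer_bytes(heap_bytes, buffer_pct, limit_pct):
    return int(buffer_pct * (1 + limit_pct) * heap_bytes) - 1

GB = 1024 ** 3
heap = 9 * GB                                # reducer heap from the example
bound = max_shuffle_buffer_bytes(heap, 0.70, 0.25)
print(round(bound / GB, 3))                  # 7.875 -> exceeds a 6GB old gen
```

With these settings the shuffle buffer may legitimately hold almost 7.9GB, more than the roughly 6GB old generation of a 9GB heap at the default NewRatio, which is exactly the OOM-without-full-heap scenario described above.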
[jira] [Commented] (MAPREDUCE-5496) Document mapreduce.cluster.administrators in mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996141#comment-15996141 ] Wilfred Spiegelenburg commented on MAPREDUCE-5496: -- This is a really old one but I just found that this is not fixed yet. Mind if I pick this up? > Document mapreduce.cluster.administrators in mapred-default.xml > --- > > Key: MAPREDUCE-5496 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5496 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: documentation >Affects Versions: 2.1.0-beta >Reporter: Srimanth Gunturi >Assignee: Wilfred Spiegelenburg >Priority: Minor > > {{mapreduce.cluster.administrators}} is not documented anywhere. We should > document it in mapred-default.xml. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Assigned] (MAPREDUCE-5496) Document mapreduce.cluster.administrators in mapred-default.xml
[ https://issues.apache.org/jira/browse/MAPREDUCE-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg reassigned MAPREDUCE-5496: Assignee: Wilfred Spiegelenburg > Document mapreduce.cluster.administrators in mapred-default.xml > --- > > Key: MAPREDUCE-5496 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5496 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: documentation >Affects Versions: 2.1.0-beta >Reporter: Srimanth Gunturi >Assignee: Wilfred Spiegelenburg >Priority: Minor > > {{mapreduce.cluster.administrators}} is not documented anywhere. We should > document it in mapred-default.xml. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-6739) allow specifying range on the port that MR AM web server binds to
[ https://issues.apache.org/jira/browse/MAPREDUCE-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg resolved MAPREDUCE-6739. -- Resolution: Duplicate Closing this as a duplicate of MAPREDUCE-6404. There has been progress on that jira and none on this one. > allow specifying range on the port that MR AM web server binds to > - > > Key: MAPREDUCE-6739 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6739 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: mr-am >Affects Versions: 2.7.2 >Reporter: Haibo Chen >Assignee: Haibo Chen > Labels: supportability > > MR AM web server binds itself to an arbitrary port. This means if the RM web > proxy lives outside of a cluster, the whole port range needs to be wide open. > It'd be nice to reuse yarn.app.mapreduce.am.job.client.port-range to place a > port range restriction on MR AM web server, so that connection from outside > the cluster can be restricted within a range of ports. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-21) NegativeArraySizeException in reducer with new api
[ https://issues.apache.org/jira/browse/MAPREDUCE-21?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349935#comment-15349935 ] Wilfred Spiegelenburg commented on MAPREDUCE-21: This is fixed in trunk as HADOOP-11901 and should be closed as a dupe of that one. I am no longer on the contributor list for the MAPREDUCE project so I can't do it. > NegativeArraySizeException in reducer with new api > -- > > Key: MAPREDUCE-21 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-21 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: task >Reporter: Amareshwari Sriramadasu > > I observed one of the reducers failing with NegativeArraySizeException with > new api. > The exception trace: > java.lang.NegativeArraySizeException > at > org.apache.hadoop.io.BytesWritable.setCapacity(BytesWritable.java:119) > at org.apache.hadoop.io.BytesWritable.setSize(BytesWritable.java:98) > at org.apache.hadoop.io.BytesWritable.readFields(BytesWritable.java:153) > at > org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67) > at > org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40) > at > org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:142) > at > org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:121) > at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:189) > at > org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:542) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:409) > at org.apache.hadoop.mapred.Child.main(Child.java:159) > The corresponding line in ReduceContext is > {code} > line#142key = keyDeserializer.deserialize(key); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6718) add progress log to JHS during startup
[ https://issues.apache.org/jira/browse/MAPREDUCE-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345577#comment-15345577 ] Wilfred Spiegelenburg commented on MAPREDUCE-6718: -- Patch looks good, +1 from me. Indeed, only logging was added, so no new test would be needed for that. > add progress log to JHS during startup > -- > > Key: MAPREDUCE-6718 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6718 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobhistoryserver >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Minor > Labels: supportability > Attachments: mapreduce6718.001.patch > > > When the JHS starts up, it initializes the internal caches and storage via > the HistoryFileManager. If we have a large number of existing finished jobs > then we could spent minutes in this startup phase without logging progress: > 2016-03-14 10:56:01,444 INFO > org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file > system [hdfs://hadoopcdh.itnas01.ieee.org:8020] > 2016-03-14 10:56:11,455 INFO > org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Initializing Existing > Jobs... > 2016-03-14 12:01:36,926 INFO > org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage: CachedHistoryStorage > Init > This makes it really difficult to assess if things are working correctly (it > looks hung). We can add logs to notify users of progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6718) add progress log to JHS during startup
[ https://issues.apache.org/jira/browse/MAPREDUCE-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339036#comment-15339036 ] Wilfred Spiegelenburg commented on MAPREDUCE-6718: -- We should still have a progress report; anything more than a couple of seconds of silence could already cause a customer to say the server has not started. What would happen if the history server cache is set up to keep 150K jobs or more? Limiting the cache is OK, and we already do that, but customers increase the cache size because anything not in the cache cannot be accessed. If they run 20K jobs a day and want 7 days to be accessible, then the cache must be 150K. History purging is set to 7 days by default, which could easily lead to this. Not being able to find a history that is not in the cache is another issue, which is far more difficult to fix. > add progress log to JHS during startup > -- > > Key: MAPREDUCE-6718 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6718 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobhistoryserver >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Minor > Labels: supportability > > When the JHS starts up, it initializes the internal caches and storage via > the HistoryFileManager. If we have a large number of existing finished jobs > then we could spent minutes in this startup phase without logging progress: > 2016-03-14 10:56:01,444 INFO > org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file > system [hdfs://hadoopcdh.itnas01.ieee.org:8020] > 2016-03-14 10:56:11,455 INFO > org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Initializing Existing > Jobs... > 2016-03-14 12:01:36,926 INFO > org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage: CachedHistoryStorage > Init > This makes it really difficult to assess if things are working correctly (it > looks hung). We can add logs to notify users of progress. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
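The cache sizing argument in the comment above is simple arithmetic. A sketch (the 20K jobs/day and 7-day retention figures come from the comment; the 150K figure there is this product rounded up):

```python
# JHS cache sizing back-of-the-envelope from the comment: to keep every
# job in the retention window findable, the cache must hold at least
# jobs_per_day * retention_days entries.

def required_cache_entries(jobs_per_day, retention_days):
    return jobs_per_day * retention_days

print(required_cache_entries(20_000, 7))  # 140000, rounded up to 150K in the comment
```

At cache sizes of that order, initializing the cache on startup takes long enough that progress logging becomes essential.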
[jira] [Commented] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283419#comment-15283419 ] Wilfred Spiegelenburg commented on MAPREDUCE-6558: -- I somehow could not leave a comment yesterday. I made the .3 patch to fix some comments in the test code and decrease the size of the test file even further. Thank you for the review and the commit [~jlowe] > multibyte delimiters with compressed input files generate duplicate records > --- > > Key: MAPREDUCE-6558 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.7.2 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Fix For: 2.8.0, 2.7.3, 2.6.5 > > Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch, > MAPREDUCE-6558.3.patch > > > This is the follow up for MAPREDUCE-6549. Compressed files cause record > duplications as shown in different junit tests. The number of duplicated > records changes with the splitsize: > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 45062 > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 41052 > Test passes with splitsize = 147445 which is the compressed file length.The > file is a bzip2 file with 100k blocks and a total of 11 blocks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-6558: - Attachment: MAPREDUCE-6558.3.patch > multibyte delimiters with compressed input files generate duplicate records > --- > > Key: MAPREDUCE-6558 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.7.2 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch, > MAPREDUCE-6558.3.patch > > > This is the follow up for MAPREDUCE-6549. Compressed files cause record > duplications as shown in different junit tests. The number of duplicated > records changes with the splitsize: > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 45062 > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 41052 > Test passes with splitsize = 147445 which is the compressed file length.The > file is a bzip2 file with 100k blocks and a total of 11 blocks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-6558: - Attachment: MAPREDUCE-6558.2.patch > multibyte delimiters with compressed input files generate duplicate records > --- > > Key: MAPREDUCE-6558 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.7.2 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch > > > This is the follow up for MAPREDUCE-6549. Compressed files cause record > duplications as shown in different junit tests. The number of duplicated > records changes with the splitsize: > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 45062 > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 41052 > Test passes with splitsize = 147445 which is the compressed file length.The > file is a bzip2 file with 100k blocks and a total of 11 blocks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-6558: - Attachment: MAPREDUCE-6558.1.patch A patch with test input that fails before the fix was made and passes after the fix was made. I have run all the tests that were there and they all still pass with this change. The tests have been run a large number of times with different input splits and none of them have failed. The use cases that passed before the fix was applied have been documented in the code. > multibyte delimiters with compressed input files generate duplicate records > --- > > Key: MAPREDUCE-6558 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.7.2 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: MAPREDUCE-6558.1.patch > > > This is the follow up for MAPREDUCE-6549. Compressed files cause record > duplications as shown in different junit tests. The number of duplicated > records changes with the splitsize: > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 45062 > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 41052 > Test passes with splitsize = 147445 which is the compressed file length.The > file is a bzip2 file with 100k blocks and a total of 11 blocks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-6558: - Status: Patch Available (was: Open) > multibyte delimiters with compressed input files generate duplicate records > --- > > Key: MAPREDUCE-6558 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.7.2 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: MAPREDUCE-6558.1.patch > > > This is the follow up for MAPREDUCE-6549. Compressed files cause record > duplications as shown in different junit tests. The number of duplicated > records changes with the splitsize: > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 45062 > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 41052 > Test passes with splitsize = 147445 which is the compressed file length.The > file is a bzip2 file with 100k blocks and a total of 11 blocks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-2398) MRBench: setting the baseDir parameter has no effect
[ https://issues.apache.org/jira/browse/MAPREDUCE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252966#comment-15252966 ] Wilfred Spiegelenburg commented on MAPREDUCE-2398: -- [~qwertymaniac] thank you for the quick review and commit > MRBench: setting the baseDir parameter has no effect > > > Key: MAPREDUCE-2398 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2398 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: benchmarks >Affects Versions: 2.3.0 >Reporter: Michael Noll >Assignee: Wilfred Spiegelenburg >Priority: Minor > Fix For: 2.9.0 > > Attachments: MAPREDUCE-2398-trunk.patch, MAPREDUCE-2398.2.patch, > MAPREDUCE-2398_0.20.2.patch, MAPREDUCE-2398_v2-0.20.203.0.patch, > MAPREDUCE-2398_v2-trunk.patch > > > The optional {{-baseDir}} parameter lets user specify the base DFS path for > output/input of MRBench. > However, the two private variables {{INPUT_DIR}} and {{OUTPUT_DIR}} > (MRBench.java) are not updated in the case that the default value of > {{-baseDir}} is actually overwritten by the user. Hence any input and output > is always written to the default locations ({{/benchmarks/MRBench/...}}), > even though the user-supplied location for {{-baseDir}} is created (and > eventually deleted again) on HDFS. > The bug affects at least Hadoop 0.20.2 and the current trunk (r1082703) as of > March 21, 2011. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-2398) MRBench: setting the baseDir parameter has no effect
[ https://issues.apache.org/jira/browse/MAPREDUCE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251679#comment-15251679 ] Wilfred Spiegelenburg commented on MAPREDUCE-2398: -- test failures are not related to this patch: - TestMRCJCFileOutputCommitter has been failing in multiple builds - TestUberAM passes for me locally > MRBench: setting the baseDir parameter has no effect > > > Key: MAPREDUCE-2398 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2398 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: benchmarks >Affects Versions: 2.3.0 >Reporter: Michael Noll >Assignee: Wilfred Spiegelenburg >Priority: Minor > Attachments: MAPREDUCE-2398-trunk.patch, MAPREDUCE-2398.2.patch, > MAPREDUCE-2398_0.20.2.patch, MAPREDUCE-2398_v2-0.20.203.0.patch, > MAPREDUCE-2398_v2-trunk.patch > > > The optional {{-baseDir}} parameter lets user specify the base DFS path for > output/input of MRBench. > However, the two private variables {{INPUT_DIR}} and {{OUTPUT_DIR}} > (MRBench.java) are not updated in the case that the default value of > {{-baseDir}} is actually overwritten by the user. Hence any input and output > is always written to the default locations ({{/benchmarks/MRBench/...}}), > even though the user-supplied location for {{-baseDir}} is created (and > eventually deleted again) on HDFS. > The bug affects at least Hadoop 0.20.2 and the current trunk (r1082703) as of > March 21, 2011. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-2398) MRBench: setting the baseDir parameter has no effect
[ https://issues.apache.org/jira/browse/MAPREDUCE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251168#comment-15251168 ] Wilfred Spiegelenburg commented on MAPREDUCE-2398: -- [~yanghaogn] do you mind if I assign this one to me? > MRBench: setting the baseDir parameter has no effect > > > Key: MAPREDUCE-2398 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2398 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: benchmarks >Affects Versions: 2.3.0 >Reporter: Michael Noll >Assignee: Yang Hao >Priority: Minor > Attachments: MAPREDUCE-2398-trunk.patch, MAPREDUCE-2398.2.patch, > MAPREDUCE-2398_0.20.2.patch, MAPREDUCE-2398_v2-0.20.203.0.patch, > MAPREDUCE-2398_v2-trunk.patch > > > The optional {{-baseDir}} parameter lets user specify the base DFS path for > output/input of MRBench. > However, the two private variables {{INPUT_DIR}} and {{OUTPUT_DIR}} > (MRBench.java) are not updated in the case that the default value of > {{-baseDir}} is actually overwritten by the user. Hence any input and output > is always written to the default locations ({{/benchmarks/MRBench/...}}), > even though the user-supplied location for {{-baseDir}} is created (and > eventually deleted again) on HDFS. > The bug affects at least Hadoop 0.20.2 and the current trunk (r1082703) as of > March 21, 2011. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-2398) MRBench: setting the baseDir parameter has no effect
[ https://issues.apache.org/jira/browse/MAPREDUCE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-2398: - Attachment: MAPREDUCE-2398.2.patch This seems to have gone stale for a long time and I just ran into it. I have updated the patch and made sure it works when passing in the value from the command line. As part of the change I also cleaned up the directory creation and cleanup. The INPUT_DIR automatically gets created by the {{generateTextFile()}} call. The cleanup should not delete the BASE_DIR because it could exist and contain other data. Only the OUTPUT_DIR and INPUT_DIR that were created for the run should be removed during cleanup. > MRBench: setting the baseDir parameter has no effect > > > Key: MAPREDUCE-2398 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2398 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: benchmarks >Affects Versions: 2.3.0 >Reporter: Michael Noll >Assignee: Yang Hao >Priority: Minor > Attachments: MAPREDUCE-2398-trunk.patch, MAPREDUCE-2398.2.patch, > MAPREDUCE-2398_0.20.2.patch, MAPREDUCE-2398_v2-0.20.203.0.patch, > MAPREDUCE-2398_v2-trunk.patch > > > The optional {{-baseDir}} parameter lets user specify the base DFS path for > output/input of MRBench. > However, the two private variables {{INPUT_DIR}} and {{OUTPUT_DIR}} > (MRBench.java) are not updated in the case that the default value of > {{-baseDir}} is actually overwritten by the user. Hence any input and output > is always written to the default locations ({{/benchmarks/MRBench/...}}), > even though the user-supplied location for {{-baseDir}} is created (and > eventually deleted again) on HDFS. > The bug affects at least Hadoop 0.20.2 and the current trunk (r1082703) as of > March 21, 2011. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031214#comment-15031214 ] Wilfred Spiegelenburg commented on MAPREDUCE-6549: -- Should this be pulled back into 2.7.3 and 2.6.3 based on the fact that [~jlowe] pulled MAPREDUCE-6481 into those releases? > multibyte delimiters with LineRecordReader cause duplicate records > -- > > Key: MAPREDUCE-6549 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.7.2 >Reporter: Dustin Cote >Assignee: Wilfred Spiegelenburg > Fix For: 2.8.0 > > Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch, > MAPREDUCE-6549.3.patch > > > LineRecorderReader currently produces duplicate records under certain > scenarios such as: > 1) input string: "abc+++def++ghi++" > delimiter string: "+++" > test passes with all sizes of the split > 2) input string: "abc++def+++ghi++" > delimiter string: "+++" > test fails with a split size of 4 > 2) input string: "abc+++def++ghi++" > delimiter string: "++" > test fails with a split size of 5 > 3) input string "abc+++defg++hij++" > delimiter string: "++" > test fails with a split size of 4 > 4) input string "abc++def+++ghi++" > delimiter string: "++" > test fails with a split size of 9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031213#comment-15031213 ] Wilfred Spiegelenburg commented on MAPREDUCE-6558: -- I do not see a problem with that. I am still working on the fix for this. > multibyte delimiters with compressed input files generate duplicate records > --- > > Key: MAPREDUCE-6558 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.7.2 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > > This is the follow up for MAPREDUCE-6549. Compressed files cause record > duplications as shown in different junit tests. The number of duplicated > records changes with the splitsize: > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 45062 > Unexpected number of records in split (splitsize = 10) > Expected: 41051 > Actual: 41052 > Test passes with splitsize = 147445 which is the compressed file length.The > file is a bzip2 file with 100k blocks and a total of 11 blocks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15026312#comment-15026312 ] Wilfred Spiegelenburg commented on MAPREDUCE-6549: -- Test failures are not related and tracked in different jiras: testIpcWithReaderQueuing is tracked by HADOOP-10406 testGangliaMetrics2 is tracked in HADOOP-12588 testDeprecatedUmask is tracked in HDFS-9451 > multibyte delimiters with LineRecordReader cause duplicate records > -- > > Key: MAPREDUCE-6549 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.7.2 >Reporter: Dustin Cote >Assignee: Wilfred Spiegelenburg > Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch, > MAPREDUCE-6549.3.patch > > > LineRecorderReader currently produces duplicate records under certain > scenarios such as: > 1) input string: "abc+++def++ghi++" > delimiter string: "+++" > test passes with all sizes of the split > 2) input string: "abc++def+++ghi++" > delimiter string: "+++" > test fails with a split size of 4 > 2) input string: "abc+++def++ghi++" > delimiter string: "++" > test fails with a split size of 5 > 3) input string "abc+++defg++hij++" > delimiter string: "++" > test fails with a split size of 4 > 4) input string "abc++def+++ghi++" > delimiter string: "++" > test fails with a split size of 9 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
---------------------------------------------
    Status: Patch Available  (was: Open)

Updated the patch to fix the NPE in testUncompressedInputCustomDelimiterPosValue. Checked the license, findbugs and other junit test failures; they are not related to the changes from this patch.
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
---------------------------------------------
    Attachment: MAPREDUCE-6549.3.patch
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
---------------------------------------------
    Status: Open  (was: Patch Available)
[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024263#comment-15024263 ]

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--------------------------------------------------

[~zxu] & [~jlowe] can you also please have a look at the patch for the uncompressed version? I have not seen a build being triggered from the patch that was added. It might need to be triggered somehow for this patch (is the naming convention for the patch wrong?).
[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024233#comment-15024233 ]

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--------------------------------------------------

The compressed version is not as easily fixable, and I am opening a new jira for that one. Unlike the uncompressed version, the compressed version does not take the split size into account: as far as I can tell, the split depends on the compression codec and the file encoding/compression blocks. I ran a set of similar junit tests over the compressed data and the changed code is not even triggered.
[jira] [Created] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records
Wilfred Spiegelenburg created MAPREDUCE-6558:
---------------------------------------------

             Summary: multibyte delimiters with compressed input files generate duplicate records
                 Key: MAPREDUCE-6558
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv1, mrv2
    Affects Versions: 2.7.2
            Reporter: Wilfred Spiegelenburg
            Assignee: Wilfred Spiegelenburg

This is the follow up for MAPREDUCE-6549. Compressed files cause record duplication as shown in different junit tests. The number of duplicated records changes with the splitsize:

Unexpected number of records in split (splitsize = 10)
Expected: 41051
Actual: 45062

Unexpected number of records in split (splitsize = 10)
Expected: 41051
Actual: 41052

The test passes with splitsize = 147445, which is the compressed file length. The file is a bzip2 file with 100k blocks and a total of 11 blocks.
[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013993#comment-15013993 ]

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--------------------------------------------------

I have been able to generate a compressed file which shows the same record duplication as was shown in the uncompressed processing. The code however behaves completely differently in the two cases, since we do not have the same kind of buffer filling process. I am still trying to fix the compressed code without breaking the uncompressed code. I should have a fix for both cases in a day or two.
[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008133#comment-15008133 ]

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--------------------------------------------------

[~zxu] The change in MAPREDUCE-6481 is not to blame for the duplicate records as far as I can tell. It fixed things, and now that we get to see what is there we noticed the duplicates. I had not looked at the compressed input, and I do think you are correct: compressed input uses the same steps, and we should clear the setting in the same way as we did for the uncompressed stream. I will try to generate a compressed stream that is splittable to get a test case, and will upload a new patch once I have that test case.

[~cotedm] An EOF will automatically terminate the record; there is no need for a record delimiter at the end of the file. All the tests, and the comments in the code, show this. The assumption is that the last record before EOF does not need a record terminator. It is not a new assumption; assuming that an EOF would not delimit a record would be counter-intuitive. Many text files, for instance, do not have a newline on the last line.
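The same convention holds for plain newline-delimited text: a reader returns the final record even when no delimiter follows it. A minimal illustration using the standard java.io classes (not the Hadoop readers discussed in this jira):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class EofDelimitsRecord {
    public static void main(String[] args) throws IOException {
        // Note: no trailing newline after the last record.
        String data = "record1\nrecord2\nrecord3";
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(data))) {
            String line;
            while ((line = reader.readLine()) != null) {
                records.add(line);
            }
        }
        // EOF terminates the last record: three records, not two.
        System.out.println(records.size());  // prints 3
    }
}
```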
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
---------------------------------------------
    Component/s: mrv2
                 mrv1
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
---------------------------------------------
    Attachment: MAPREDUCE-6549-2.patch

The issue is related to [MAPREDUCE-6481]. That jira changed the position calculation and made sure that the full records are returned by the reader as expected. It did not anticipate the record duplication, and the junit tests also did not cover the use cases needed to discover the issue.

The problem is limited to multibyte delimiters as far as I can trace. The junit tests for the multibyte delimiter only take the best case scenario into account: the input data contained the exact delimiter and no ambiguous characters. As soon as the test is changed, either the delimiter or the input data, a failure is triggered. The failure does not clearly show when and how it occurs; analysis of the test failures shows that a specific combination of input data, split size and buffer size is needed to trigger it.

Based on testing, the duplication of the record occurs only if:
- the first character(s) of the delimiter are part of the record data, for example:
  1) the delimiter is {{\+=}} and the data contains a {{\+}} not followed by {{=}}
  2) the delimiter is {{\+=\+=}} and the data contains {{\+=\+}} not followed by {{=}}
- the delimiter character is found at the split boundary: the last character before the split ends
- a fill of the buffer is triggered to finish processing the record

The underlying problem is that we set a flag called {{needAdditionalRecord}} in the {{UncompressedSplitLineReader}} when we fill the buffer and have encountered part of a delimiter in combination with a split. We keep track of this in the ambiguous character count. However, as it turns out, if the character(s) found after that point do not belong to a delimiter we do not unset {{needAdditionalRecord}}. This causes the next record to be read twice, and thus we see a duplication of records.

The solution is to unset the flag when we detect that we are not processing a delimiter. We currently only add the ambiguous characters to the record read and set the count back to 0; at that same point we need to unset the flag. The patch was developed based on junit tests that exercise the split and buffer settings in combination with multiple delimiter types using different inputs. All cases now provide a consistent count of records and the correct position inside the data.
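A much simplified sketch may help illustrate the flag handling. The class and field names below only loosely mirror the real {{UncompressedSplitLineReader}}; this is an illustration of where the flag must be cleared, not the actual Hadoop implementation (full-delimiter handling, which would end the record, is deliberately omitted):

```java
// Simplified illustration of partial-delimiter matching across a buffer fill.
// ambiguousByteCount tracks delimiter prefix bytes matched so far;
// needAdditionalRecord is the flag that, if left set, causes the duplicate record.
public class DelimiterMatchSketch {
    private int ambiguousByteCount = 0;
    private boolean needAdditionalRecord = false;

    // Feed one byte of record data against a multibyte delimiter.
    void processByte(byte b, byte[] delimiter, StringBuilder record) {
        if (ambiguousByteCount < delimiter.length && b == delimiter[ambiguousByteCount]) {
            ambiguousByteCount++;            // could still be a delimiter prefix
        } else if (ambiguousByteCount > 0) {
            // The partially matched bytes turned out to be ordinary record data:
            // copy them into the record and reset the match state.
            for (int i = 0; i < ambiguousByteCount; i++) {
                record.append((char) delimiter[i]);
            }
            ambiguousByteCount = 0;
            // The fix: the pending bytes were NOT a delimiter, so no additional
            // record is needed -- unset the flag here as well.
            needAdditionalRecord = false;
            processByte(b, delimiter, record);  // re-examine the current byte
        } else {
            record.append((char) b);
        }
    }

    boolean needAdditionalRecord() { return needAdditionalRecord; }
    void markNeedAdditionalRecord() { needAdditionalRecord = true; }
}
```

With the delimiter {{\+\+\+}} and input data {{ab\+c}}, the single {{\+}} is first treated as a possible delimiter prefix; when {{c}} arrives, the {{\+}} is flushed back into the record and the flag is cleared instead of being left set.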
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
---------------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006080#comment-15006080 ]

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--------------------------------------------------

[~cotedm] I have picked up the jira and have a fully tested and working patch for the issue.
[jira] [Assigned] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg reassigned MAPREDUCE-6549:
------------------------------------------------
    Assignee: Wilfred Spiegelenburg  (was: Dustin Cote)
[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records
[ https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
---------------------------------------------
    Status: Open  (was: Patch Available)

I tried the change that you made in the patch and it fails the current tests. The patch changes one test (TestLineRecordReader.java) but we have two versions: the mapred version is unchanged and now fails; the mapreduce version works, but as soon as I change the delimiter back it also fails. That means the change does not fix the issue. It also brings the two tests out of sync, which is not correct.
[jira] [Commented] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"
[ https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563932#comment-14563932 ] Wilfred Spiegelenburg commented on MAPREDUCE-5965: -- Can someone please review the latest patch and let me know if it is OK? > Hadoop streaming throws error if list of input files is high. Error is: > "error=7, Argument list too long at if number of input file is high" > > > Key: MAPREDUCE-5965 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Arup Malakar >Assignee: Wilfred Spiegelenburg > Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.2.patch, > MAPREDUCE-5965.3.patch, MAPREDUCE-5965.patch > > > Hadoop streaming exposes all the key values in job conf as environment > variables when it forks a process for streaming code to run. Unfortunately > the variable mapreduce_input_fileinputformat_inputdir contains the list of > input files, and Linux has a limit on size of environment variables + > arguments. > Based on how long the list of files and their full path is this could be > pretty huge. And given all of these variables are not even used it stops user > from running hadoop job with large number of files, even though it could be > run. > Linux throws E2BIG if the size is greater than certain size which is error > code 7. And java translates that to "error=7, Argument list too long". More: > http://man7.org/linux/man-pages/man2/execve.2.html I suggest skipping > variables if it is greater than certain length. That way if user code > requires the environment variable it would fail. It should also introduce a > config variable to skip long variables, and set it to false by default. That > way user has to specifically set it to true to invoke this feature. 
> Here is the exception: > {code} > Error: java.lang.RuntimeException: Error in configuring object at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:415) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: > java.lang.reflect.InvocationTargetException at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 
14 > more Caused by: java.lang.reflect.InvocationTargetException at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 17 more Caused by: java.lang.RuntimeException: configuration exception at > org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222) at > org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66) ... 22 > more Caused by: java.io.IOException: Cannot run program > "/data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/oo-analytics/appcache/application_1403599726264_13177/container_1403599726264_13177_01_06/./rbenv_runner.sh": > error=7, Argument list too long at > java.lang.ProcessBuilder.start(ProcessBuilder.java:1041) at > org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209) ... 23 > more Caused by: java.io.IOException: error=7, Argument list too long at > java.lang.UNIXProcess.forkAndExec(Na
[jira] [Updated] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"
[ https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated MAPREDUCE-5965:
---------------------------------------------
    Attachment: MAPREDUCE-5965.3.patch

Updated the patch using the new name and made it an integer as [~djp] proposed. The documentation and the usage printed by StreamJob have been updated to show the new option and its values. To answer the second question: it would be long enough to leave all but the problem value alone. Three values are documented:
-1: do not truncate (default)
0: only copy the key and not the value (side effect of using substring)
2: a safe value which should prevent the "error=7" issue
That way if user code > requires the environment variable it would fail. It should also introduce a > config variable to skip long variables, and set it to false by default. That > way user has to specifically set it to true to invoke this feature. > Here is the exception: > {code} > Error: java.lang.RuntimeException: Error in configuring object at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:415) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: > java.lang.reflect.InvocationTargetException at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 
14 > more Caused by: java.lang.reflect.InvocationTargetException at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 17 more Caused by: java.lang.RuntimeException: configuration exception at > org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222) at > org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66) ... 22 > more Caused by: java.io.IOException: Cannot run program > "/data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/o
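The three-valued integer setting described in the comment above can be sketched roughly as follows. This is an illustration only, not the committed patch: the class name and helper method are invented here, and the real change lives inside the streaming code (PipeMapRed), but the value semantics match the documented -1 / 0 / positive-n behaviour.

```java
// Sketch of the documented truncation rule (names are illustrative,
// not the actual patch): -1 keeps the value untouched, 0 drops the
// value entirely (key only), and any positive n cuts the value at n
// characters via substring.
public final class EnvValueTruncator {

    static String truncate(String value, int limit) {
        if (limit < 0 || value == null) {
            return value;                      // -1: do not truncate (default)
        }
        if (limit == 0) {
            return "";                         // 0: keep only the key, drop the value
        }
        return value.length() > limit
            ? value.substring(0, limit)        // n > 0: cut at n characters
            : value;
    }

    public static void main(String[] args) {
        String inputDirs = "hdfs:///data/part-00000,hdfs:///data/part-00001";
        System.out.println(truncate(inputDirs, -1));  // unchanged
        System.out.println(truncate(inputDirs, 0));   // empty value
        System.out.println(truncate(inputDirs, 8));   // first 8 characters
    }
}
```

A very long `mapreduce_input_fileinputformat_inputdir` value would thus be cut before it is exported into the child environment, which is what avoids the E2BIG failure at fork time.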
[jira] [Commented] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"
[ https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554197#comment-14554197 ] Wilfred Spiegelenburg commented on MAPREDUCE-5965: --

[~amalakar] thank you for the assignment. The comment should be added back; I'll do that with an updated patch. Keeping the change in the same method was meant to keep it as simple as possible.

[~rchiang] The streaming configuration does not really have a *-default.xml file. There is documentation (markdown) that shows some of the settings and options: adding it to the FAQ would probably be the correct place. There is also help text printed by the main StreamJob code that shows most of the options. I will update the two files and explain the setting that is available. I can upload a new patch with that added, but before I do, let's get the other points finalised.

A whitelist or blacklist is possible, but what would we exclude or include? Any value in the job configuration could be too long; a user could set anything they want. It would be really difficult to filter that consistently and be sure that we have a fix with limited impact.

Making the lenLimit configurable is possible. However, I do not see what we would win by making the length configurable: the data is not used anywhere, and lowering or raising the size at which we cut it off gives us nothing extra. If you really want to make it configurable, the easiest way would be to roll the two settings into one. We could make stream.truncate.long.jobconf.values an integer: -1 means do not truncate, otherwise truncate at the length given.
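The single-integer design discussed above would plug into the loop that copies job conf entries into the child process environment. A minimal sketch, under stated assumptions: the method names and the key-sanitising rule here are illustrative (streaming replaces non-alphanumeric characters in keys with underscores, e.g. `mapreduce.job.name` becomes `mapreduce_job_name`), and the real patch wires the limit through the streaming configuration rather than a plain parameter.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: build the child environment from job conf
// entries, applying one integer limit to every value. The setting
// name from the discussion is stream.truncate.long.jobconf.values;
// the class and method names here are invented for the example.
public final class JobConfEnv {

    // Mirror of the streaming convention: any character that is not
    // legal in an environment variable name becomes an underscore.
    static String envSafeKey(String key) {
        return key.replaceAll("[^A-Za-z0-9_]", "_");
    }

    static Map<String, String> toEnv(Map<String, String> conf, int limit) {
        Map<String, String> env = new HashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            String value = e.getValue();
            if (limit == 0) {
                value = "";                            // key only
            } else if (limit > 0 && value.length() > limit) {
                value = value.substring(0, limit);     // truncate long values
            }                                          // limit < 0: leave untouched
            env.put(envSafeKey(e.getKey()), value);
        }
        return env;
    }
}
```

Because the limit is applied uniformly, there is no need for a whitelist or blacklist: a sufficiently large limit leaves every normal value alone and only cuts pathological ones such as a huge inputdir list.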
[jira] [Commented] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"
[ https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549661#comment-14549661 ] Wilfred Spiegelenburg commented on MAPREDUCE-5965: -- Arup: Do you mind if I assign the jira to me? I would like to get this fixed in an upcoming release.
[jira] [Updated] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"
[ https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-5965: - Status: Patch Available (was: Open)
[jira] [Updated] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"
[ https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated MAPREDUCE-5965: - Attachment: MAPREDUCE-5965.2.patch

Ran into the same issue. Re-based and cleaned up the patch; it does the same as the Hive patch (truncate the environment value).