[jira] [Resolved] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false

2020-05-27 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg resolved MAPREDUCE-7278.
--
Resolution: Fixed

Pulled this back to branch-3.1 for 3.1.5.

Closing; thank you [~tarunparimi] for your contribution.

> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -
>
> Key: MAPREDUCE-7278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.8.0, 3.4.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.4.0, 3.1.5
>
> Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, 
> MAPREDUCE-7278.003.patch, MAPREDUCE-7278.004.patch, Screen Shot 2020-04-30 at 
> 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in FAIL_FINISHING_CONTAINER 
> state for some time, we observe two task attempts are launched simultaneously 
> even when speculative execution is disabled.
> This results in the below message shown in the killed attempts, indicating 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  
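For context: speculation is considered disabled only when both properties below are false. A minimal sketch of such a job setup (the class and job name are illustrative, not from the patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.MRJobConfig;

public class NoSpeculationJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // MRJobConfig.MAP_SPECULATIVE and MRJobConfig.REDUCE_SPECULATIVE resolve
    // to mapreduce.map.speculative and mapreduce.reduce.speculative.
    conf.setBoolean(MRJobConfig.MAP_SPECULATIVE, false);
    conf.setBoolean(MRJobConfig.REDUCE_SPECULATIVE, false);
    Job job = Job.getInstance(conf, "no-speculation-job");
    // ... configure mapper, reducer, input and output as usual, then submit
    // with job.waitForCompletion(true).
  }
}
{code}

This is the configuration under which the bug still launched a second attempt once a container was stuck in FAIL_FINISHING_CONTAINER.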






[jira] [Updated] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false

2020-05-27 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7278:
-
Fix Version/s: 3.1.5

> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -
>
> Key: MAPREDUCE-7278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.8.0, 3.4.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.4.0, 3.1.5
>
> Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, 
> MAPREDUCE-7278.003.patch, MAPREDUCE-7278.004.patch, Screen Shot 2020-04-30 at 
> 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in FAIL_FINISHING_CONTAINER 
> state for some time, we observe two task attempts are launched simultaneously 
> even when speculative execution is disabled.
> This results in the below message shown in the killed attempts, indicating 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  






[jira] [Updated] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false

2020-05-27 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7278:
-
Fix Version/s: 3.4.0
   3.2.2
   3.3.0

> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -
>
> Key: MAPREDUCE-7278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.8.0, 3.4.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.4.0
>
> Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, 
> MAPREDUCE-7278.003.patch, MAPREDUCE-7278.004.patch, Screen Shot 2020-04-30 at 
> 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in FAIL_FINISHING_CONTAINER 
> state for some time, we observe two task attempts are launched simultaneously 
> even when speculative execution is disabled.
> This results in the below message shown in the killed attempts, indicating 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  






[jira] [Updated] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false

2020-05-27 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7278:
-
Status: In Progress  (was: Patch Available)

Looking at the releases that have MAPREDUCE-6485, I am pulling the change back to the 
active branch-3.2 and branch-3.3.
Do we also need to pull this back to 3.1.5?

> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -
>
> Key: MAPREDUCE-7278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.8.0, 3.4.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, 
> MAPREDUCE-7278.003.patch, MAPREDUCE-7278.004.patch, Screen Shot 2020-04-30 at 
> 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in FAIL_FINISHING_CONTAINER 
> state for some time, we observe two task attempts are launched simultaneously 
> even when speculative execution is disabled.
> This results in the below message shown in the killed attempts, indicating 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  






[jira] [Commented] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false

2020-05-27 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118257#comment-17118257
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7278:
--

The change is looking good, +1. Committing to trunk.

> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -
>
> Key: MAPREDUCE-7278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.8.0, 3.4.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, 
> MAPREDUCE-7278.003.patch, MAPREDUCE-7278.004.patch, Screen Shot 2020-04-30 at 
> 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in FAIL_FINISHING_CONTAINER 
> state for some time, we observe two task attempts are launched simultaneously 
> even when speculative execution is disabled.
> This results in the below message shown in the killed attempts, indicating 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  






[jira] [Commented] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false

2020-05-20 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112750#comment-17112750
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7278:
--

I agree the ASF license warning is not from this patch. Can you please fix the 
checkstyle issues?

The code looks good; no comments on that.

> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -
>
> Key: MAPREDUCE-7278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.8.0, 3.4.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, 
> MAPREDUCE-7278.003.patch, Screen Shot 2020-04-30 at 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in FAIL_FINISHING_CONTAINER 
> state for some time, we observe two task attempts are launched simultaneously 
> even when speculative execution is disabled.
> This results in the below message shown in the killed attempts, indicating 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  






[jira] [Commented] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false

2020-05-12 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105859#comment-17105859
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7278:
--

[~tarunparimi] I have added you to the contributor list for the project and 
assigned the jira to you.

I think the test you describe can be converted into a test using a 
{{MiniMRYarnCluster}}, like what is done in TestSpeculativeExecution. Did 
you look at that possibility?
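A rough sketch of what such a test could look like, following the {{MiniMRYarnCluster}} pattern in TestSpeculativeExecution (class name, cluster size, and the outlined assertion are illustrative, not the actual patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.MRJobConfig;
import org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster;

public class TestNoDuplicateAttempts {
  private MiniMRYarnCluster mrCluster;

  public void setup() {
    // Bring up a small in-process MR/YARN cluster, as TestSpeculativeExecution does.
    mrCluster = new MiniMRYarnCluster(TestNoDuplicateAttempts.class.getName(), 2);
    Configuration conf = new Configuration();
    conf.setBoolean(MRJobConfig.MAP_SPECULATIVE, false);
    conf.setBoolean(MRJobConfig.REDUCE_SPECULATIVE, false);
    mrCluster.init(conf);
    mrCluster.start();
  }

  // The test itself would submit a job against mrCluster.getConfig(), force a
  // task attempt to linger in FAIL_FINISHING_CONTAINER, and assert that no
  // second attempt of the same task runs concurrently.

  public void teardown() {
    if (mrCluster != null) {
      mrCluster.stop();
    }
  }
}
{code}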


> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -
>
> Key: MAPREDUCE-7278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.8.0, 3.4.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, 
> Screen Shot 2020-04-30 at 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in FAIL_FINISHING_CONTAINER 
> state for some time, we observe two task attempts are launched simultaneously 
> even when speculative execution is disabled.
> This results in the below message shown in the killed attempts, indicating 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  






[jira] [Assigned] (MAPREDUCE-7278) Speculative execution behavior is observed even when mapreduce.map.speculative and mapreduce.reduce.speculative are false

2020-05-12 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg reassigned MAPREDUCE-7278:


Assignee: Tarun Parimi

> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -
>
> Key: MAPREDUCE-7278
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 2.8.0, 3.4.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: MAPREDUCE-7278.001.patch, MAPREDUCE-7278.002.patch, 
> Screen Shot 2020-04-30 at 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in FAIL_FINISHING_CONTAINER 
> state for some time, we observe two task attempts are launched simultaneously 
> even when speculative execution is disabled.
> This results in the below message shown in the killed attempts, indicating 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  






[jira] [Commented] (MAPREDUCE-7269) TestNetworkedJob fails

2020-04-06 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076097#comment-17076097
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7269:
--

Thank you for filing the issue [~aajisaka]

The license warning is not related to the change. The changes for the test are 
in line with the test changes that have been submitted on the YARN side: +1

> TestNetworkedJob fails
> --
>
> Key: MAPREDUCE-7269
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7269
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1460/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
> {noformat}
> [INFO] Running org.apache.hadoop.mapred.TestNetworkedJob
> [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 20.981 s <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob
> [ERROR] testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob)  Time 
> elapsed: 4.588 s  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<[]default> but was:<[root.]default>
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:250)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Commented] (MAPREDUCE-7266) historyContext doesn't need to be a class attribute inside JobHistoryServer

2020-03-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064450#comment-17064450
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7266:
--

Moved it to the correct project (the JHS is MR, not YARN).
I just noticed that I have not been added to the committer list yet, so I 
cannot add you as a contributor and assign the jira to you.

This move should now trigger the build.



> historyContext doesn't need to be a class attribute inside JobHistoryServer
> ---
>
> Key: MAPREDUCE-7266
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7266
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Reporter: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10075.001.patch
>
>
> "historyContext" class attribute at 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L67
>  is assigned a cast of another class attribute - "jobHistoryService" - 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L131,
>  however it does not need to be stored separately because it is only ever 
> used once in the class, and then only as an argument when instantiating the 
> HistoryClientService class at 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L155.
> Therefore, we could just delete the lines at 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L67
>  and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L131
>  completely and instantiate the HistoryClientService as follows:
> {code}
>   @VisibleForTesting
>   protected HistoryClientService createHistoryClientService() {
>     return new HistoryClientService((HistoryContext) jobHistoryService,
>         this.jhsDTSecretManager);
>   }
> {code}






[jira] [Assigned] (MAPREDUCE-7266) historyContext doesn't need to be a class attribute inside JobHistoryServer

2020-03-22 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg reassigned MAPREDUCE-7266:


Component/s: (was: yarn)
 jobhistoryserver
Key: MAPREDUCE-7266  (was: YARN-10075)
   Assignee: (was: Siddharth Ahuja)
Project: Hadoop Map/Reduce  (was: Hadoop YARN)

> historyContext doesn't need to be a class attribute inside JobHistoryServer
> ---
>
> Key: MAPREDUCE-7266
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7266
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Reporter: Siddharth Ahuja
>Priority: Minor
> Attachments: YARN-10075.001.patch
>
>
> "historyContext" class attribute at 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L67
>  is assigned a cast of another class attribute - "jobHistoryService" - 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L131,
>  however it does not need to be stored separately because it is only ever 
> used once in the class, and then only as an argument when instantiating the 
> HistoryClientService class at 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L155.
> Therefore, we could just delete the lines at 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L67
>  and 
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JobHistoryServer.java#L131
>  completely and instantiate the HistoryClientService as follows:
> {code}
>   @VisibleForTesting
>   protected HistoryClientService createHistoryClientService() {
>     return new HistoryClientService((HistoryContext) jobHistoryService,
>         this.jhsDTSecretManager);
>   }
> {code}






[jira] [Commented] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure

2019-11-28 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984279#comment-16984279
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7249:
--

thank you [~prabhujoseph] for the commit and [~pbacsko] for the review

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes 
> job failure 
> 
>
> Key: MAPREDUCE-7249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
>  Labels: Reviewed
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch, 
> MAPREDUCE-7249-branch-3.2.001.patch
>
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which 
> the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Commented] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure

2019-11-27 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984109#comment-16984109
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7249:
--

The ASF license warning is correct: there are files in the branch that should 
not be there.

This is caused by the check-in for YARN-9011; I filed YARN-9993 to remove the 
files.

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes 
> job failure 
> 
>
> Key: MAPREDUCE-7249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch, 
> MAPREDUCE-7249-branch-3.2.001.patch
>
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which 
> the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Commented] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure

2019-11-27 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984042#comment-16984042
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7249:
--

Also attaching a patch for branch-3.2 and earlier: 
[^MAPREDUCE-7249-branch-3.2.001.patch] applies to both the 3.2 and 3.1 branches.

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes 
> job failure 
> 
>
> Key: MAPREDUCE-7249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch, 
> MAPREDUCE-7249-branch-3.2.001.patch
>
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which 
> the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Updated] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure

2019-11-27 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7249:
-
Attachment: MAPREDUCE-7249-branch-3.2.001.patch

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes 
> job failure 
> 
>
> Key: MAPREDUCE-7249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch, 
> MAPREDUCE-7249-branch-3.2.001.patch
>
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which 
> the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Commented] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure

2019-11-27 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984034#comment-16984034
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7249:
--

Fixed the checkstyle issues in both files; the {{TaskAttemptImpl.java}} change 
has become slightly larger as I fixed the whole block.

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes 
> job failure 
> 
>
> Key: MAPREDUCE-7249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch
>
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which 
> the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Updated] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure

2019-11-27 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7249:
-
Attachment: MAPREDUCE-7249-002.patch

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes 
> job failure 
> 
>
> Key: MAPREDUCE-7249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: MAPREDUCE-7249-001.patch, MAPREDUCE-7249-002.patch
>
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which 
> the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Updated] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure

2019-11-27 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7249:
-
Attachment: MAPREDUCE-7249-001.patch
Status: Patch Available  (was: Open)

Patch for the issue.

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes 
> job failure 
> 
>
> Key: MAPREDUCE-7249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: MAPREDUCE-7249-001.patch
>
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which 
> the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Updated] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes job failure

2019-11-27 Thread Wilfred Spiegelenburg (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7249:
-
Summary: Invalid event TA_TOO_MANY_FETCH_FAILURE at 
SUCCESS_CONTAINER_CLEANUP causes job failure   (was: Invalid event 
TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP cause job )

> Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP causes 
> job failure 
> 
>
> Key: MAPREDUCE-7249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, mrv2
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
>
> Same issue as in MAPREDUCE-7240, but this one has a different state in which 
> the {{TA_TOO_MANY_FETCH_FAILURE}} event is received:
> {code}
> 2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1568654141590_630203_m_003108_1
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> {code}
> The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Created] (MAPREDUCE-7249) Invalid event TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP cause job

2019-11-27 Thread Wilfred Spiegelenburg (Jira)
Wilfred Spiegelenburg created MAPREDUCE-7249:


 Summary: Invalid event TA_TOO_MANY_FETCH_FAILURE at 
SUCCESS_CONTAINER_CLEANUP cause job 
 Key: MAPREDUCE-7249
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7249
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster, mrv2
Affects Versions: 3.1.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg


Same issue as in MAPREDUCE-7240, but this one has a different state in which the 
{{TA_TOO_MANY_FETCH_FAILURE}} event is received:
{code}
2019-11-18 23:03:40,270 ERROR [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this 
event at current state for attempt_1568654141590_630203_m_003108_1
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
TA_TOO_MANY_FETCH_FAILURE at SUCCESS_CONTAINER_CLEANUP
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1183)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:148)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1388)
at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1380)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
{code}

The stack trace is from a CDH release, which is a highly patched 2.6 release.






[jira] [Commented] (MAPREDUCE-7240) Exception ' Invalid event: TA_TOO_MANY_FETCH_FAILURE at SUCCESS_FINISHING_CONTAINER' cause job error

2019-11-26 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982981#comment-16982981
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7240:
--

I checked the PRs that are linked to this jira. Jason gave a +1 on the trunk 
version in [PR #1674|https://github.com/apache/hadoop/pull/1674]. If your patch 
follows that change, we should be good to go.

+1 (non-binding)

For the concern raised in this 
[comment|https://issues.apache.org/jira/browse/MAPREDUCE-7240?focusedCommentId=16982254&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16982254]: 
if the container ignores the newly raised event, then the AM needs to handle 
that as normal.
The main issue in the current code is that, because it does not handle the 
fetch-failure event, an {{InvalidStateTransitionException}} is raised, which 
causes the job to fail. After the change the event is handled and the job 
should continue and finish processing. The job can still fail as normal, but a 
single too-many-fetch-failures event does not cause the job to fail immediately.
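For background, a toy sketch of the mechanism under discussion (the enums are made up; only {{StateMachineFactory}} and {{InvalidStateTransitionException}} come from hadoop-yarn-common, assuming a release where the correctly spelled exception class exists): an event with no registered transition makes {{doTransition}} throw, which is what turned a single fetch-failure report into a job failure.

{code}
import org.apache.hadoop.yarn.state.InvalidStateTransitionException;
import org.apache.hadoop.yarn.state.StateMachine;
import org.apache.hadoop.yarn.state.StateMachineFactory;

public class UnhandledEventDemo {
  enum State { SUCCESS_FINISHING_CONTAINER, SUCCEEDED, FAILED }
  enum Event { TA_CONTAINER_CLEANED, TA_TOO_MANY_FETCH_FAILURE }

  public static void main(String[] args) {
    // Only TA_CONTAINER_CLEANED is registered; TA_TOO_MANY_FETCH_FAILURE is
    // not, mirroring the pre-patch TaskAttemptImpl state table.
    StateMachineFactory<Object, State, Event, Event> factory =
        new StateMachineFactory<Object, State, Event, Event>(
            State.SUCCESS_FINISHING_CONTAINER)
          .addTransition(State.SUCCESS_FINISHING_CONTAINER, State.SUCCEEDED,
              Event.TA_CONTAINER_CLEANED)
          .installTopology();
    StateMachine<State, Event, Event> sm = factory.make(new Object());
    try {
      sm.doTransition(Event.TA_TOO_MANY_FETCH_FAILURE,
          Event.TA_TOO_MANY_FETCH_FAILURE);
    } catch (InvalidStateTransitionException e) {
      // The fix registers a transition for this event so the dispatcher
      // handles it instead of failing the job.
      System.out.println("Unhandled: " + e.getMessage());
    }
  }
}
{code}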

> Exception ' Invalid event: TA_TOO_MANY_FETCH_FAILURE at 
> SUCCESS_FINISHING_CONTAINER' cause job error
> 
>
> Key: MAPREDUCE-7240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7240
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.8.2
>Reporter: luhuachao
>Assignee: luhuachao
>Priority: Critical
>  Labels: kerberos
> Attachments: MAPREDUCE-7240-001.patch, 
> application_1566552310686_260041.log
>
>
> *log in appmaster*
> {noformat}
> 2019-09-03 17:18:43,090 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures 
> for output of task attempt: attempt_1566552310686_260041_m_52_0 ... 
> raising fetch failure to map
> 2019-09-03 17:18:43,091 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures 
> for output of task attempt: attempt_1566552310686_260041_m_49_0 ... 
> raising fetch failure to map
> 2019-09-03 17:18:43,091 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures 
> for output of task attempt: attempt_1566552310686_260041_m_51_0 ... 
> raising fetch failure to map
> 2019-09-03 17:18:43,091 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures 
> for output of task attempt: attempt_1566552310686_260041_m_50_0 ... 
> raising fetch failure to map
> 2019-09-03 17:18:43,091 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Too many fetch-failures 
> for output of task attempt: attempt_1566552310686_260041_m_53_0 ... 
> raising fetch failure to map
> 2019-09-03 17:18:43,092 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: 
> attempt_1566552310686_260041_m_52_0 transitioned from state SUCCEEDED to 
> FAILED, event type is TA_TOO_MANY_FETCH_FAILURE and nodeId=yarn095:45454
> 2019-09-03 17:18:43,092 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1566552310686_260041_m_49_0
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> TA_TOO_MANY_FETCH_FAILURE at SUCCESS_FINISHING_CONTAINER
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1206)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:146)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1458)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1450)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> 2019-09-03 17:18:43,093 ERROR [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle 
> this event at current state for attempt_1566552310686_260041_m

[jira] [Commented] (MAPREDUCE-7225) Fix broken current folder expansion during MR job start

2019-07-31 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897735#comment-16897735
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7225:
--

Thank you [~pbacsko] for the patch.

I looked over the change from patch v3 and the impact seems minimal.
Change options 2 and 3 are not really viable, as we have too many calls 
throughout the repository, and probably in projects outside Hadoop, that could 
break due to a change.

+1 (non-binding) from me.

> Fix broken current folder expansion during MR job start
> ---
>
> Key: MAPREDUCE-7225
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7225
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.9.0, 3.0.3
>Reporter: Adam Antal
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: MAPREDUCE-7225-001.patch, MAPREDUCE-7225-002.patch, 
> MAPREDUCE-7225-002.patch, MAPREDUCE-7225-003.patch
>
>
> Starting a sleep job giving "." as files that should be localized is working 
> fine up until 2.9.0, but after that the user is given an 
> IllegalArgumentException. This change is a side-effect of HADOOP-12747 where 
> {{GenericOptionsParser#validateFiles}} function got modified.
> Can be reproduced by starting a sleep job with "-files ." given as extra 
> parameter. Log:
> {noformat}
> sudo -u hdfs hadoop jar hadoop-mapreduce-client-jobclient-3.0.0.jar sleep 
> -files . -m 1 -r 1 -rt 2000 -mt 2000
> WARNING: Use "yarn jar" to launch YARN applications.
> 19/07/17 08:13:26 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm21
> 19/07/17 08:13:26 INFO mapreduce.JobResourceUploader: Disabling Erasure 
> Coding for path: /user/hdfs/.staging/job_1563349475208_0017
> 19/07/17 08:13:26 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
> /user/hdfs/.staging/job_1563349475208_0017
> java.lang.IllegalArgumentException: Can not create a Path from an empty string
>   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:168)
>   at org.apache.hadoop.fs.Path.(Path.java:180)
>   at org.apache.hadoop.fs.Path.(Path.java:125)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.copyRemoteFiles(JobResourceUploader.java:686)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:262)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResourcesInternal(JobResourceUploader.java:203)
>   at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadResources(JobResourceUploader.java:131)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:99)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:194)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1726)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at 
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:139)
>   at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:147)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:313)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:227)
> {noformat}
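The first frames of the trace show where it breaks: after the HADOOP-12747 change to {{GenericOptionsParser#validateFiles}}, the "." entry ends up reaching {{Path}} as an empty string, which {{Path}} rejects unconditionally. A minimal illustration of that failure mode (toy code, not the fix):

{code}
import org.apache.hadoop.fs.Path;

public class EmptyPathDemo {
  public static void main(String[] args) {
    // Throws IllegalArgumentException: "Can not create a Path from an empty
    // string", the same error copyRemoteFiles hits when "." expands to "".
    new Path("");
  }
}
{code}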




[jira] [Commented] (MAPREDUCE-7196) FairScheduler queue ACLs not implemented for application actions

2019-03-27 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16802737#comment-16802737
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7196:
--

Moved from the YARN project. The mapred-default.xml needs to be updated, as the 
queue administrators that existed in MR1 times (JT & TT) do not exist 
anymore and they get confused with the YARN queue admins, which are not related 
at all.

> FairScheduler queue ACLs not implemented for application actions
> 
>
> Key: MAPREDUCE-7196
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7196
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Tristan Stevens
>Priority: Major
>
> The mapred-site.xml options mapreduce.job.acl-modify-job and 
> mapreduce.job.acl-view-job both specify that queue ACLs should apply for read 
> and modify operations on a job, however according to 
> org.apache.hadoop.yarn.server.security.ApplicationACLsManager.java this 
> feature has not been implemented.
> This is very important otherwise it is difficult to manage a cluster with a 
> complicated queue hierarchy without either putting everyone in the admin ACL, 
> getting many support tickets or asking people to remember to set 
> mapreduce.job.acl-modify-job and mapreduce.job.acl-view-job.
> Extract from mapred-default.xml:
> bq.  Irrespective of this ACL configuration, (a) job-owner, (b) the user who 
> started the cluster, (c) members of an admin configured supergroup configured 
> via mapreduce.cluster.permissions.supergroup and *(d) queue administrators of 
> the queue to which this job was submitted* to configured via 
> acl-administer-jobs for the specific queue in mapred-queues.xml can do all 
> the view operations on a job. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Moved] (MAPREDUCE-7196) FairScheduler queue ACLs not implemented for application actions

2019-03-27 Thread Wilfred Spiegelenburg (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg moved YARN-8026 to MAPREDUCE-7196:


Component/s: (was: fairscheduler)
 documentation
 Issue Type: Improvement  (was: Bug)
Key: MAPREDUCE-7196  (was: YARN-8026)
Project: Hadoop Map/Reduce  (was: Hadoop YARN)

> FairScheduler queue ACLs not implemented for application actions
> 
>
> Key: MAPREDUCE-7196
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7196
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Tristan Stevens
>Priority: Major
>
> The mapred-site.xml options mapreduce.job.acl-modify-job and 
> mapreduce.job.acl-view-job both specify that queue ACLs should apply for read 
> and modify operations on a job; however, according to 
> org.apache.hadoop.yarn.server.security.ApplicationACLsManager.java this 
> feature has not been implemented.
> This is very important: otherwise it is difficult to manage a cluster with a 
> complicated queue hierarchy without either putting everyone in the admin ACL, 
> fielding many support tickets, or asking people to remember to set 
> mapreduce.job.acl-modify-job and mapreduce.job.acl-view-job.
> Extract from mapred-default.xml:
> bq.  Irrespective of this ACL configuration, (a) job-owner, (b) the user who 
> started the cluster, (c) members of an admin configured supergroup configured 
> via mapreduce.cluster.permissions.supergroup and *(d) queue administrators of 
> the queue to which this job was submitted* to configured via 
> acl-administer-jobs for the specific queue in mapred-queues.xml can do all 
> the view operations on a job. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers

2019-03-04 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784121#comment-16784121
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7180:
--

What I meant is that if an application fails we re-run the application. Any 
finished tasks are fine since they are recovered; running tasks are killed and 
restarted. If they had failed once or more times during the first attempt and 
we relaunched them with larger heaps, we start the process of increasing the 
containers again from scratch, wasting more resources.

I think what Daniel proposed is the simplest and most elegant solution. If we 
have a task that fails due to exceeding the container, we should fail the 
application and let the end user and/or admin sort it out. Even for an Oozie 
workflow, or in the Hive case running jobs through beeline, you can set the 
size of the container etc. via the command line.
I think finding the cause is not that difficult, but as part of the change to 
fail the application we could make it really clear in the diagnostics of the 
application what failed and which action to take. The message for the container 
exceeding the settings has already been extended via YARN-7580 and should be 
clearer in 3.1 and later.

> Relaunching Failed Containers
> -
>
> Key: MAPREDUCE-7180
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Reporter: David Mollitor
>Priority: Major
>
> In my experience, it is very common that an MR job completely fails because a 
> single Mapper/Reducer container is using more memory than has been reserved 
> in YARN.  The following message is logged in the MapReduce 
> ApplicationMaster:
> {code}
> Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] 
> is running beyond physical memory limits. 
> Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual 
> memory used. Killing container.
> {code}
> In this case, the container is re-launched on another node, and of course, it 
> is killed again for the same reason.  This process happens three (maybe 
> four?) times before the entire MapReduce job fails.  It's often said that the 
> definition of insanity is doing the same thing over and over and expecting 
> different results.
> For all intents and purposes, the amount of resources requested by Mappers 
> and Reducers is a fixed amount; based on the default configuration values.  
> Users can set the memory on a per-job basis, but it's a pain, not exact, and 
> requires intimate knowledge of the MapReduce framework and its memory usage 
> patterns.
> I propose that if the MR ApplicationMaster detects that a container is killed 
> because of this specific memory resource constraint, that it requests a 
> larger container for the subsequent task attempt.
> For example, increase the requested memory size by 50% each time the 
> container fails and the task is retried.  This will prevent many Job failures 
> and allow for additional memory tuning, per-Job, after the fact, to get 
> better performance (vs. fail/succeed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers

2019-03-03 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782937#comment-16782937
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7180:
--

The 80/20 case, as Daniel said, will not work for all cases, but it handles 
almost all use cases. The headroom ratio is configurable, which means that if 
you know you have a high overhead due to the type of code you run, you can set 
it cluster wide. I would be in favour of not wasting resources and failing the 
application when the JVM goes OOM for one or more tasks. The re-run with 
adjusted settings has more drawbacks than advantages, I think.

The main reason I am not in favour of the auto retries is that it hides 
possible issues without providing a guarantee that it will work. There is a 
good chance that when one mapper or reducer fails due to memory issues, more 
mappers or reducers will fail in the same way. Multiple tasks failing increases 
the overhead on the cluster, like Jim mentioned in his example. With data 
growing, or small code changes over time in the app, the MR framework or the 
JVM, you could be putting a lot of extra strain on a cluster. 

What if the application still failed due to task failures: how do we handle an 
application re-run? Won't that start from scratch again and thus waste more 
resources?


> Relaunching Failed Containers
> -
>
> Key: MAPREDUCE-7180
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Reporter: BELUGA BEHR
>Priority: Major
>
> In my experience, it is very common that an MR job completely fails because a 
> single Mapper/Reducer container is using more memory than has been reserved 
> in YARN.  The following message is logged in the MapReduce 
> ApplicationMaster:
> {code}
> Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] 
> is running beyond physical memory limits. 
> Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual 
> memory used. Killing container.
> {code}
> In this case, the container is re-launched on another node, and of course, it 
> is killed again for the same reason.  This process happens three (maybe 
> four?) times before the entire MapReduce job fails.  It's often said that the 
> definition of insanity is doing the same thing over and over and expecting 
> different results.
> For all intents and purposes, the amount of resources requested by Mappers 
> and Reducers is a fixed amount; based on the default configuration values.  
> Users can set the memory on a per-job basis, but it's a pain, not exact, and 
> requires intimate knowledge of the MapReduce framework and its memory usage 
> patterns.
> I propose that if the MR ApplicationMaster detects that a container is killed 
> because of this specific memory resource constraint, that it requests a 
> larger container for the subsequent task attempt.
> For example, increase the requested memory size by 50% each time the 
> container fails and the task is retried.  This will prevent many Job failures 
> and allow for additional memory tuning, per-Job, after the fact, to get 
> better performance (vs. fail/succeed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers

2019-02-05 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761403#comment-16761403
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7180:
--

When you use MAPREDUCE-5785 you should not see the type of failures that you 
are trying to prevent. The heap and its overhead should always fit in the 
container unless you have some special off-heap case. You should thus only 
expect to see these failures for third-party library and/or off-heap issues. 
What you are trying to implement is really only relevant for edge cases like 
the misconfiguration, which you state is not really the goal, as the job will 
still fail.

That is why I think adding all this to hide a misconfiguration is the wrong 
thing to do. 

> Relaunching Failed Containers
> -
>
> Key: MAPREDUCE-7180
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Reporter: BELUGA BEHR
>Priority: Major
>
> In my experience, it is very common that an MR job completely fails because a 
> single Mapper/Reducer container is using more memory than has been reserved 
> in YARN.  The following message is logged in the MapReduce 
> ApplicationMaster:
> {code}
> Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] 
> is running beyond physical memory limits. 
> Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual 
> memory used. Killing container.
> {code}
> In this case, the container is re-launched on another node, and of course, it 
> is killed again for the same reason.  This process happens three (maybe 
> four?) times before the entire MapReduce job fails.  It's often said that the 
> definition of insanity is doing the same thing over and over and expecting 
> different results.
> For all intents and purposes, the amount of resources requested by Mappers 
> and Reducers is a fixed amount; based on the default configuration values.  
> Users can set the memory on a per-job basis, but it's a pain, not exact, and 
> requires intimate knowledge of the MapReduce framework and its memory usage 
> patterns.
> I propose that if the MR ApplicationMaster detects that a container is killed 
> because of this specific memory resource constraint, that it requests a 
> larger container for the subsequent task attempt.
> For example, increase the requested memory size by 50% each time the 
> container fails and the task is retried.  This will prevent many Job failures 
> and allow for additional memory tuning, per-Job, after the fact, to get 
> better performance (vs. fail/succeed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers

2019-02-04 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760412#comment-16760412
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7180:
--

I also have some reservations about just growing on a failure. Letting the 
application fail is the best way to get the job reviewed and configured 
correctly. For a properly configured job we should see the GC kick in well 
before we run over the size of the container. If your default settings do not 
take care of that, you are not managing the cluster correctly.

In MAPREDUCE-5785 we introduced the automatic calculation of the heap size 
based on the container size and vice versa. If you use that control, it should 
mean that you never get into this situation. What happens when the application 
relies on that calculation for the heap and/or container size and still fails?
How are you going to handle that case if the container fails with the same 
message? Are you going to also change the configured heap-to-container ratio? 
That case could be caused by the mapper or reducer using more off-heap memory 
(a third-party library). How is that going to work with this auto re-run?
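
For context, a hedged sketch (the values are illustrative, not recommendations) 
of how a job leans on the MAPREDUCE-5785 auto-calculation: set only the 
container size and the heap-to-container ratio, and omit an explicit -Xmx from 
mapreduce.map.java.opts so the heap is derived:

{code:xml}
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <!-- Derived heap is ratio * container size: 0.8 * 2048MB is roughly 1638MB. -->
  <name>mapreduce.job.heap.memory-mb.ratio</name>
  <value>0.8</value>
</property>
{code}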

Another point to consider is that I can always run over the container by 
setting an overly large heap. As an example: I know my job can run in a 1GB 
heap as I have tried it. I now set 10GB as the heap as a test. GCs will not 
kick in as the heap is not really full, and it will just keep growing well 
above 1GB. If I configure the job to run in a 2GB container, the overly large 
heap will cause it to fail. It might even fail when I make the container 4GB 
or 8GB. Just doubling and re-running is going to be problematic.

Using the available configuration and the smarts that are built in is a far 
better solution.

> Relaunching Failed Containers
> -
>
> Key: MAPREDUCE-7180
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Reporter: BELUGA BEHR
>Priority: Major
>
> In my experience, it is very common that an MR job completely fails because a 
> single Mapper/Reducer container is using more memory than has been reserved 
> in YARN.  The following message is logged in the MapReduce 
> ApplicationMaster:
> {code}
> Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] 
> is running beyond physical memory limits. 
> Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual 
> memory used. Killing container.
> {code}
> In this case, the container is re-launched on another node, and of course, it 
> is killed again for the same reason.  This process happens three (maybe 
> four?) times before the entire MapReduce job fails.  It's often said that the 
> definition of insanity is doing the same thing over and over and expecting 
> different results.
> For all intents and purposes, the amount of resources requested by Mappers 
> and Reducers is a fixed amount; based on the default configuration values.  
> Users can set the memory on a per-job basis, but it's a pain, not exact, and 
> requires intimate knowledge of the MapReduce framework and its memory usage 
> patterns.
> I propose that if the MR ApplicationMaster detects that a container is killed 
> because of this specific memory resource constraint, that it requests a 
> larger container for the subsequent task attempt.
> For example, increase the requested memory size by 50% each time the 
> container fails and the task is retried.  This will prevent many Job failures 
> and allow for additional memory tuning, per-Job, after the fact, to get 
> better performance (vs. fail/succeed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7159) FrameworkUploader: ensure proper permissions of generated framework tar.gz if restrictive umask is used

2018-12-05 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710874#comment-16710874
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7159:
--

Patch looks good to me +1 (non binding)
Thank you for the fix.

> FrameworkUploader: ensure proper permissions of generated framework tar.gz if 
> restrictive umask is used
> ---
>
> Key: MAPREDUCE-7159
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7159
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.1.1
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: MAPREDUCE-7159-001.patch, MAPREDUCE-7159-002.patch, 
> MAPREDUCE-7159-003.patch, MAPREDUCE-7159-004.patch, MAPREDUCE-7159-005.patch, 
> MAPREDUCE-7159-006.patch
>
>
> Using certain umask values (like 027) makes files unreadable to "others". 
> This causes problems if the FrameworkUploader 
> (https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-uploader/src/main/java/org/apache/hadoop/mapred/uploader/FrameworkUploader.java)
>  is used - it's necessary that the compressed MR framework is readable by all 
> users, otherwise they won't be able to run MR jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7159) FrameworkUploader: ensure proper permissions of generated framework tar.gz if restrictive umask is used

2018-12-04 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709538#comment-16709538
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7159:
--

Thank you for the updated patch [~pbacsko].

I agree it is probably not a good idea to fix something that was created 
externally; we might not even be allowed to change the permissions. Instead of 
fixing the broken permissions, we should abort the upload or log a WARN/ERROR 
message that the permissions are broken. The uploaded framework is not usable, 
and silently accepting it is not the correct thing from an end user 
perspective. If we know it is broken, we should report it back to the user.
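
A minimal sketch of the kind of check described above, assuming the uploader 
has a FileSystem handle and the Path of the final tarball (the class, method 
and message are illustrative, not the actual FrameworkUploader code):

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

final class UploadPermissionCheck {
  // Abort the upload loudly if "others" cannot read the framework tarball.
  static void verifyWorldReadable(FileSystem fs, Path tarball)
      throws IOException {
    FileStatus status = fs.getFileStatus(tarball);
    if (!status.getPermission().getOtherAction().implies(FsAction.READ)) {
      throw new IOException("Framework at " + tarball
          + " is not readable by all users; a restrictive umask was"
          + " probably in effect when it was created");
    }
  }
}
{code}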

> FrameworkUploader: ensure proper permissions of generated framework tar.gz if 
> restrictive umask is used
> ---
>
> Key: MAPREDUCE-7159
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7159
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.1.1
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: MAPREDUCE-7159-001.patch, MAPREDUCE-7159-002.patch, 
> MAPREDUCE-7159-003.patch, MAPREDUCE-7159-004.patch, MAPREDUCE-7159-005.patch
>
>
> Using certain umask values (like 027) makes files unreadable to "others". 
> This causes problems if the FrameworkUploader 
> (https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-uploader/src/main/java/org/apache/hadoop/mapred/uploader/FrameworkUploader.java)
>  is used - it's necessary that the compressed MR framework is readable by all 
> users, otherwise they won't be able to run MR jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7159) FrameworkUploader: ensure proper permissions of generated framework tar.gz if restrictive umask is used

2018-12-03 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707882#comment-16707882
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7159:
--

Thank you [~pbacsko]

I have two questions: 
# you currently have fixed just the distributed filesystem. Does this same 
issue not happen if we do not have a distributed file system and directly 
create the stream?
# In this case there are restrictive settings on the file itself, but what if 
there are restrictive settings in the path? That case does not seem to be 
handled at all, as the only thing we check in {{validateTargetPath}} is the 
start of the URI. We need to have at least traversal rights on the whole path.

> FrameworkUploader: ensure proper permissions of generated framework tar.gz if 
> restrictive umask is used
> ---
>
> Key: MAPREDUCE-7159
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7159
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.1.1
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: MAPREDUCE-7159-001.patch, MAPREDUCE-7159-002.patch, 
> MAPREDUCE-7159-003.patch
>
>
> Using certain umask values (like 027) makes files unreadable to "others". 
> This causes problems if the FrameworkUploader 
> (https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-uploader/src/main/java/org/apache/hadoop/mapred/uploader/FrameworkUploader.java)
>  is used - it's necessary that the compressed MR framework is readable by all 
> users, otherwise they won't be able to run MR jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7094) LocalDistributedCacheManager leaves classloaders open, which leaks FDs

2018-05-10 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471401#comment-16471401
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7094:
--

Thank you [~szita] patch is looking good +1 (non binding)

> LocalDistributedCacheManager leaves classloaders open, which leaks FDs
> --
>
> Key: MAPREDUCE-7094
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7094
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: MAPREDUCE-7094.0.patch, MAPREDUCE-7094.1.patch, 
> MAPREDUCE-7094.2.patch
>
>
> When a user starts a local mapred task from Hive's beeline, it will leave 
> open file descriptors on the HS2 process (which runs the mapred task).
> I debugged this and saw that it is caused by the LocalDistributedCacheManager 
> class, which creates a new URLClassLoader with a classpath for the two jars 
> seen below. Somewhere down the line, Loaders will be created in this 
> URLClassLoader for these files, effectively creating the FDs on the OS level.
> This is never cleaned up after execution: although 
> LocalDistributedCacheManager removes the files, it does not close the 
> ClassLoader, so FDs are left open even though they point to deleted files at 
> that time:
> {code:java}
> [root@host-1 ~]# lsof -p 14439 | grep hadoop-hive
> java    14439 hive  DEL       REG                8,1             3348748 
> /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar
> java    14439 hive  DEL       REG                8,1             3348750 
> /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar
> java    14439 hive  649r      REG                8,1   8112438   3348750 
> /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar
>  (deleted)
> java    14439 hive  650r      REG                8,1   8112438   3348748 
> /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar (deleted)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7094) LocalDistributedCacheManager leaves classloaders open, which leaks FDs

2018-05-09 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469771#comment-16469771
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7094:
--

Based on the code path there is always only one class loader created and 
active. So why do we need to keep track of multiple loaders, and can we not 
just close the one open class loader correctly in the close method? That would 
also fix your findbugs issue.

There is only one call to the {{makeClassLoader}} method in {{LocalJobRunner}}, 
and it is not inside a loop. There is also no loop in 
{{LocalDistributedCacheManager}} that creates multiple loaders at the same 
time.
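
A minimal sketch of that direction, assuming a single tracked loader (the class 
and field names are illustrative, not the actual patch):

{code:java}
import java.io.Closeable;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

final class SingleLoaderCache implements Closeable {
  private URLClassLoader classLoader; // only one is ever created

  ClassLoader makeClassLoader(URL[] localizedJars, ClassLoader parent) {
    classLoader = new URLClassLoader(localizedJars, parent);
    return classLoader;
  }

  @Override
  public void close() throws IOException {
    if (classLoader != null) {
      // Releases the file descriptors held on the (possibly deleted) jars.
      classLoader.close();
      classLoader = null;
    }
  }
}
{code}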

> LocalDistributedCacheManager leaves classloaders open, which leaks FDs
> --
>
> Key: MAPREDUCE-7094
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7094
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Adam Szita
>Priority: Major
> Attachments: MAPREDUCE-7094.0.patch, MAPREDUCE-7094.1.patch
>
>
> When a user starts a local mapred task from Hive's beeline, it will leave 
> open file descriptors on the HS2 process (which runs the mapred task).
> I debugged this and saw that it is caused by the LocalDistributedCacheManager 
> class, which creates a new URLClassLoader with a classpath for the two jars 
> seen below. Somewhere down the line, Loaders will be created in this 
> URLClassLoader for these files, effectively creating the FDs on the OS level.
> This is never cleaned up after execution: although 
> LocalDistributedCacheManager removes the files, it does not close the 
> ClassLoader, so FDs are left open even though they point to deleted files at 
> that time:
> {code:java}
> [root@host-1 ~]# lsof -p 14439 | grep hadoop-hive
> java    14439 hive  DEL       REG                8,1             3348748 
> /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar
> java    14439 hive  DEL       REG                8,1             3348750 
> /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar
> java    14439 hive  649r      REG                8,1   8112438   3348750 
> /tmp/hadoop-hive/mapred/local/1525789796609/hive-exec-1.1.0-cdh5.13.4-SNAPSHOT-core.jar
>  (deleted)
> java    14439 hive  650r      REG                8,1   8112438   3348748 
> /tmp/hadoop-hive/mapred/local/1525789796610/hive-exec-core.jar (deleted)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-26 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455808#comment-16455808
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7072:
--

No idea what I did yesterday, but the diff was not correct and did not 
correspond to what I had locally.
Fixed it up: compile passes, tests pass.

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: MAPREDUCE-7072-branch-2.02.patch, 
> MAPREDUCE-7072-branch-2.03.patch, MAPREDUCE-7072.01.patch, 
> MAPREDUCE-7072.02.patch
>
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-26 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7072:
-
Attachment: MAPREDUCE-7072-branch-2.03.patch

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: MAPREDUCE-7072-branch-2.02.patch, 
> MAPREDUCE-7072-branch-2.03.patch, MAPREDUCE-7072.01.patch, 
> MAPREDUCE-7072.02.patch
>
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-25 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453412#comment-16453412
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7072:
--

Branch-2 patch attached; it compiles with Java 7 after removing two fields in 
the job info creation.

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: MAPREDUCE-7072-branch-2.02.patch, 
> MAPREDUCE-7072.01.patch, MAPREDUCE-7072.02.patch
>
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-25 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7072:
-
Attachment: MAPREDUCE-7072-branch-2.02.patch

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: MAPREDUCE-7072-branch-2.02.patch, 
> MAPREDUCE-7072.01.patch, MAPREDUCE-7072.02.patch
>
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-24 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449915#comment-16449915
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7072:
--

Updated the patch with the review comments; I completely overlooked this far 
more elegant solution.

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: MAPREDUCE-7072.01.patch, MAPREDUCE-7072.02.patch
>
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-24 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7072:
-
Attachment: MAPREDUCE-7072.02.patch

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: MAPREDUCE-7072.01.patch, MAPREDUCE-7072.02.patch
>
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-23 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7072:
-
Status: Patch Available  (was: Open)

Patch including tests: the JSON and human printers have both been changed to 
ignore the deprecated counters and to keep the code in sync.

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: MAPREDUCE-7072.01.patch
>
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-23 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-7072:
-
Attachment: MAPREDUCE-7072.01.patch

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: MAPREDUCE-7072.01.patch
>
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-04 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425564#comment-16425564
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7072:
--

The root cause of the issue is located in the {{AbstractCounters}} code, in 
{{getGroupNames()}}.

When you track through the code in the debugger, the number of counter groups 
returned is higher than expected. This is because we add the deprecated 
counter names to the list of counter group names before we return. The display 
names of the counters tracked in the deprecated list, stored in the legacyMap, 
are the same as the display names of the non-deprecated counters. The 
deprecated counters added are already in the non-deprecated list, which causes 
the duplication.
It works in the JSON format because it internally uses a HashMap, keyed by the 
name of the counter group. The keys clash, and we thus overwrite the existing 
value with the value from the deprecated group.

To track where this issue comes from: MAPREDUCE-4053 changed the iteration to 
work for Oozie and seems related to OOZIE-777 and the HadoopELFunctions, which 
still seem to use the deprecated counter name.
Changing what the method returns is thus not possible without breaking Oozie. 
We can instead use the iterator returned by the abstract counters, as it does 
not include the deprecated names.
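
A hedged toy illustration of the effect (the names are illustrative and this 
is not the Hadoop code): iterating a group-name list that contains both the 
current name and its legacy alias prints the group twice, while a map keyed on 
the display name collapses the duplicate:

{code:java}
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateGroupDemo {
  public static void main(String[] args) {
    // getGroupNames()-style result: current name plus its deprecated alias,
    // both resolving to the same display name "Job Counters".
    List<String> groupNames = Arrays.asList(
        "org.apache.hadoop.mapreduce.JobCounter",          // current
        "org.apache.hadoop.mapred.JobInProgress$Counter"); // legacy alias

    // Human printer style: one block per returned name -> duplicate rows.
    for (String ignored : groupNames) {
      System.out.println("|Job Counters |Total megabyte-seconds ...|");
    }

    // JSON printer style: keyed on the display name -> the second put
    // overwrites the first and only one entry survives.
    Map<String, Object> json = new LinkedHashMap<>();
    for (String ignored : groupNames) {
      json.put("Job Counters", "...");
    }
    System.out.println(json.size()); // prints 1
  }
}
{code}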

> mapred job -history prints duplicate counter in human output
> 
>
> Key: MAPREDUCE-7072
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>
>  'mapred job -history' command prints duplicate entries for counters only for 
> the human output format. It does not do this for the JSON format.
> mapred job -history /user/history/somefile.jhist -format human
> {code}
> 
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> ...
> |Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-7072) mapred job -history prints duplicate counter in human output

2018-04-04 Thread Wilfred Spiegelenburg (JIRA)
Wilfred Spiegelenburg created MAPREDUCE-7072:


 Summary: mapred job -history prints duplicate counter in human 
output
 Key: MAPREDUCE-7072
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7072
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 3.0.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg


 'mapred job -history' command prints duplicate entries for counters only for 
the human output format. It does not do this for the JSON format.

mapred job -history /user/history/somefile.jhist -format human
{code}

|Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000
...
|Job Counters |Total megabyte-seconds taken by all map tasks|0 |0 |268,288,000

{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7028) Concurrent task progress updates causing NPE in Application Master

2017-12-28 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305929#comment-16305929
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-7028:
--

I logged YARN-7689 to fix the TestRMContainerAllocator failures because I ran 
into them while looking at something else.

> Concurrent task progress updates causing NPE in Application Master
> --
>
> Key: MAPREDUCE-7028
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7028
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am
>Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
>Reporter: Gergo Repas
>Assignee: Gergo Repas
> Attachments: MAPREDUCE-7028.000.patch, MAPREDUCE-7028.001.patch
>
>
> Concurrent task progress updates can cause a NullPointerException in the 
> Application Master (stack trace is with code at current trunk):
> {quote}
> 2017-12-20 06:49:42,369 INFO [IPC Server handler 9 on 39501] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1513780867907_0001_m_02_0 is : 0.02677883
> 2017-12-20 06:49:42,369 INFO [IPC Server handler 13 on 39501] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1513780867907_0001_m_02_0 is : 0.02677883
> 2017-12-20 06:49:42,383 FATAL [AsyncDispatcher event handler] 
> org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$StatusUpdater.transition(TaskAttemptImpl.java:2450)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$StatusUpdater.transition(TaskAttemptImpl.java:2433)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1362)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:154)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1543)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1535)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:748)
> 2017-12-20 06:49:42,385 INFO [IPC Server handler 13 on 39501] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1513780867907_0001_m_02_0 is : 0.02677883
> 2017-12-20 06:49:42,386 INFO [AsyncDispatcher ShutDown handler] 
> org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
> {quote}
> This happened naturally in several big wordcount runs, and I could reproduce 
> this reliably by artificially making task updates more frequent.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6447) reduce shuffle throws "java.lang.OutOfMemoryError: Java heap space"

2017-08-30 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148409#comment-16148409
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6447:
--

[~prateekrungta] and [~shuzhangyao]: I have seen this same issue a number of 
times, and people keep referring to this open MR issue.
I have dug into this and found that there is nothing wrong with the 
calculation and no need to change the way we handle this in the code. No 
method devised for the internal calculation can guarantee that you do not get 
an OOM error. In all cases I have run into, I have been able to fix it with a 
configuration change in MR and the JVM.

Let me explain what I have found and why the issue will not be solved by 
changing the internal calculations.

When the JVM throws an OOM for a reducer, I collected heap dumps and looked at 
what was allocated at the point in time the OOM was thrown. In most cases the 
OOM was not thrown due to the total heap being used. As an example: the JVM 
heap for the reducer was set to 9216MB (9GB), yet the heap dump showed only 
5896MB of heap usage. Looking at the usage of the heap, it showed that the 
shuffle input buffer usage was well within its limits.
We then tried to lower {{mapreduce.reduce.shuffle.input.buffer.percent}} 
from the default 0.9 to 0.6 and found that it did not solve the issue. There 
was still an OOM around the same point, with approximately the same usage of 
the heap. Lowering it further to 0.4 allowed the job to finish, but we saw that 
the JVM never peaked above about 60% of the assigned heap. This wastes a lot 
of resources on the cluster and is thus not a solution we could accept.
Further checks of the GC logging showed that all heap usage was in the old 
generation for each of the OOM cases. That explains the OOM and the heap dump: 
the reducer had run out of space in the old generation, not the total heap. 
Within the heap, the old generation can take about 2/3 of the total heap, 
based on the default settings for generation sizing in the heap [1].

The question then became: what caused the JVM to run out of old generation 
while not using its young generation? This could be explained by the logging 
from the reducer. The reducer logs showed that it was trying to allocate a 
large shuffle response, in my case about 1.9GB. Even though this is a large 
shuffle response, it was within all the limits. The JVM will often allocate 
large objects directly in the old generation instead of in the young 
generation. This behaviour can cause an OOM error in the reducer while it is 
not using the full heap, just running out of old generation.

Back to the calculations. In the buffer we load all the shuffle data, but we 
set a maximum of 25% of the total buffer for one shuffle response. This is the 
in-memory merge limit. If the shuffle response is larger than 25% of the 
buffer size, we do not store it in the buffer but merge it directly to disk. A 
shuffle response is only accepted and downloaded if we can fit it in memory or 
if it goes straight to disk. The check and increase of the buffer usage happen 
before we start the download. Locking makes sure only one thread does this at 
a time; the number of parallel copies is thus not important. Limiting could 
lead to a deadlock, as explained in the comments in the code. Since we need to 
prevent deadlocks, we allow one shuffle (one thread) to go over that limit. If 
we did not allow this we could deadlock the reducer: it would be in a state 
where it cannot download new data to reduce, and there would never be a 
trigger to merge/spill the data in memory, so the buffer stays as full as it 
is.

Based on all that, the maximum size of all the data we could store in the 
shuffle buffer would be:
{code}
mapreduce.reduce.shuffle.input.buffer.percent = buffer% = 70%
mapreduce.reduce.shuffle.memory.limit.percent = limit% = 25%
heap size = 9GB
maximum used memory = ((buffer% * (1 + limit%)) * heap size) - 1 byte
{code}
If that buffer does not fit in the old generation, we could throw an OOM error 
without really running out of memory. This is especially true when the 
individual shuffle sizes are large but do not hit the in-memory limit. 
Everything is still properly calculated and limited. We also do not unknowingly 
use more than the configured buffer size; if we go over, we know exactly by 
how much.
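
To make the arithmetic concrete, a small hedged sketch using the 9GB heap from 
this thread and assuming JVM defaults (NewRatio=2, i.e. the old generation is 
roughly 2/3 of the heap):

{code:java}
public class ShuffleBufferBound {
  public static void main(String[] args) {
    double heapGb = 9.0;
    double bufferPct = 0.70; // mapreduce.reduce.shuffle.input.buffer.percent
    double limitPct = 0.25;  // mapreduce.reduce.shuffle.memory.limit.percent

    // One in-flight shuffle is allowed past the limit to avoid deadlock,
    // so the worst case is buffer% * (1 + limit%) of the heap.
    double maxShuffleGb = bufferPct * (1 + limitPct) * heapGb; // 7.875 GB

    // With NewRatio=2 the old generation is about 2/3 of the heap.
    double oldGenGb = heapGb * 2.0 / 3.0; // 6.0 GB

    System.out.printf("worst-case shuffle data: %.3f GB%n", maxShuffleGb);
    System.out.printf("default old generation:  %.3f GB%n", oldGenGb);
    // 7.875 GB > 6.0 GB: large shuffle segments allocated straight into the
    // old generation can trigger an OOM while the total heap is not full.
  }
}
{code}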

We worked around the problem without increasing the size of the heap by 
changing the generations. The old generation inside the heap was grown by 
increasing the "NewRatio" setting from 2 (default) to 4. We also changed the 
"input.buffer.percent" setting to 65%. That worked in our case with the 9GB 
maximum heap for the reducer. Different heap sizes combined with a di

[jira] [Commented] (MAPREDUCE-5496) Document mapreduce.cluster.administrators in mapred-default.xml

2017-05-03 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996141#comment-15996141
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-5496:
--

This is a really old one but I just found that this is not fixed yet. Mind if I 
pick this up?

> Document mapreduce.cluster.administrators in mapred-default.xml
> ---
>
> Key: MAPREDUCE-5496
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5496
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.1.0-beta
>Reporter: Srimanth Gunturi
>Assignee: Wilfred Spiegelenburg
>Priority: Minor
>
> {{mapreduce.cluster.administrators}} is not documented anywhere. We should 
> document it in mapred-default.xml.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Assigned] (MAPREDUCE-5496) Document mapreduce.cluster.administrators in mapred-default.xml

2017-05-03 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg reassigned MAPREDUCE-5496:


Assignee: Wilfred Spiegelenburg

> Document mapreduce.cluster.administrators in mapred-default.xml
> ---
>
> Key: MAPREDUCE-5496
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5496
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.1.0-beta
>Reporter: Srimanth Gunturi
>Assignee: Wilfred Spiegelenburg
>Priority: Minor
>
> {{mapreduce.cluster.administrators}} is not documented anywhere. We should 
> document it in mapred-default.xml.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-6739) allow specifying range on the port that MR AM web server binds to

2017-01-24 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg resolved MAPREDUCE-6739.
--
Resolution: Duplicate

Closing this as a duplicate of MAPREDUCE-6404: there has been progress on that 
jira and none on this one.

> allow specifying range on the port that MR AM web server binds to
> -
>
> Key: MAPREDUCE-6739
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6739
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 2.7.2
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>  Labels: supportability
>
> MR AM web server binds itself to an arbitrary port.  This means if the RM web 
> proxy lives outside of a cluster, the whole port range needs to be wide open. 
> It'd be nice to reuse yarn.app.mapreduce.am.job.client.port-range to place a 
> port range restriction on the MR AM web server, so that connections from 
> outside the cluster can be restricted to a range of ports.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-21) NegativeArraySizeException in reducer with new api

2016-06-25 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-21?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15349935#comment-15349935
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-21:


This is fixed in trunk as HADOOP-11901 and should be closed as a duplicate of 
that one.
I am no longer on the contributor list for the MAPREDUCE project, so I can't 
do it myself.

> NegativeArraySizeException in reducer with new api
> --
>
> Key: MAPREDUCE-21
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-21
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Reporter: Amareshwari Sriramadasu
>
> I observed one of the reducers failing with NegativeArraySizeException with 
> new api.
> The exception trace:
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.io.BytesWritable.setCapacity(BytesWritable.java:119)
>   at org.apache.hadoop.io.BytesWritable.setSize(BytesWritable.java:98)
>   at org.apache.hadoop.io.BytesWritable.readFields(BytesWritable.java:153)
>   at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>   at 
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:142)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:121)
>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:189)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:542)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:409)
>   at org.apache.hadoop.mapred.Child.main(Child.java:159)
> The corresponding line in ReduceContext is 
> {code}
> line#142: key = keyDeserializer.deserialize(key);
> {code}
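
For anyone who lands here: the trace points at the buffer growth in {{BytesWritable.setSize}}, where the capacity computation can overflow int for a large value length and produce a negative array size. A standalone sketch of that arithmetic (illustrative only, not the Hadoop source; HADOOP-11901 has the real fix):

{code}
public class CapacityOverflowSketch {
  public static void main(String[] args) {
    // Pre-HADOOP-11901, growing the BytesWritable buffer used an int
    // expression equivalent to "size * 3 / 2". For a large size the
    // multiplication overflows int and the capacity goes negative.
    int size = 1_000_000_000;               // large value length read off the stream
    int newCapacity = size * 3 / 2;         // overflows int: -647483648
    byte[] buffer = new byte[newCapacity];  // throws NegativeArraySizeException
  }
}
{code}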



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6718) add progress log to JHS during startup

2016-06-22 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345577#comment-15345577
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6718:
--

Patch looks good, +1 from me.
Since the change only adds logging, no new test is needed for it.
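
For reference, the shape of the change is just a periodic progress line inside the history-scan loop. A minimal sketch, assuming a loop over job history directories (the class and method names here are illustrative, not the actual patch):

{code}
import java.util.List;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.fs.FileStatus;

// Illustrative sketch only: periodic progress logging while scanning
// existing job history directories during JHS startup.
class HistoryScanProgress {
  private static final Log LOG = LogFactory.getLog(HistoryScanProgress.class);

  void scan(List<FileStatus> jobDirs) {
    int scanned = 0;
    for (FileStatus dir : jobDirs) {
      // ... parse and cache the history files in 'dir' ...
      scanned++;
      if (scanned % 10000 == 0) { // a line every 10k entries shows the JHS is alive
        LOG.info("Initializing existing jobs: processed " + scanned
            + " of " + jobDirs.size() + " directories");
      }
    }
  }
}
{code}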

> add progress log to JHS during startup
> --
>
> Key: MAPREDUCE-6718
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6718
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>  Labels: supportability
> Attachments: mapreduce6718.001.patch
>
>
> When the JHS starts up, it initializes the internal caches and storage via 
> the HistoryFileManager. If we have a large number of existing finished jobs 
> then we could spend minutes in this startup phase without logging progress:
> 2016-03-14 10:56:01,444 INFO 
> org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file 
> system [hdfs://hadoopcdh.itnas01.ieee.org:8020]
> 2016-03-14 10:56:11,455 INFO 
> org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Initializing Existing 
> Jobs...
> 2016-03-14 12:01:36,926 INFO 
> org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage: CachedHistoryStorage 
> Init
> This makes it really difficult to assess if things are working correctly (it 
> looks hung). We can add logs to notify users of progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6718) add progress log to JHS during startup

2016-06-19 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339036#comment-15339036
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6718:
--

We should still have a progress report: anything more than a couple of seconds 
of silence could already cause a customer to say the server has not started.

What would happen if I have the history server cache set up to keep 150K jobs 
or more? Limiting the cache is OK, and we already do that, but customers 
increase the cache size because anything not in the cache cannot be accessed. 
If they run 20K jobs a day and want 7 days to be accessible, the cache must 
hold roughly 150K jobs (20,000 x 7 = 140,000, rounded up).
The history purge interval is set to 7 days by default, which could easily 
produce numbers like this.

Not being able to find a history that is not in the cache is another issue 
which is far more difficult to fix.

> add progress log to JHS during startup
> --
>
> Key: MAPREDUCE-6718
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6718
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Minor
>  Labels: supportability
>
> When the JHS starts up, it initializes the internal caches and storage via 
> the HistoryFileManager. If we have a large number of existing finished jobs 
> then we could spend minutes in this startup phase without logging progress:
> 2016-03-14 10:56:01,444 INFO 
> org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file 
> system [hdfs://hadoopcdh.itnas01.ieee.org:8020]
> 2016-03-14 10:56:11,455 INFO 
> org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Initializing Existing 
> Jobs...
> 2016-03-14 12:01:36,926 INFO 
> org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage: CachedHistoryStorage 
> Init
> This makes it really difficult to assess if things are working correctly (it 
> looks hung). We can add logs to notify users of progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2016-05-13 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283419#comment-15283419
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6558:
--

I somehow could not leave a comment yesterday. I made the .3 patch to fix some 
comments in the test code and decrease the size of the test file even further.
Thank you for the review and the commit [~jlowe]


> multibyte delimiters with compressed input files generate duplicate records
> ---
>
> Key: MAPREDUCE-6558
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Fix For: 2.8.0, 2.7.3, 2.6.5
>
> Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch, 
> MAPREDUCE-6558.3.patch
>
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record 
> duplications as shown in different junit tests. The number of duplicated 
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445, which is the compressed file length. The 
> file is a bzip2 file with 100k blocks and a total of 11 blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2016-05-12 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6558:
-
Attachment: MAPREDUCE-6558.3.patch

> multibyte delimiters with compressed input files generate duplicate records
> ---
>
> Key: MAPREDUCE-6558
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch, 
> MAPREDUCE-6558.3.patch
>
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record 
> duplications as shown in different junit tests. The number of duplicated 
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445, which is the compressed file length. The 
> file is a bzip2 file with 100k blocks and a total of 11 blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2016-05-12 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6558:
-
Attachment: MAPREDUCE-6558.2.patch

> multibyte delimiters with compressed input files generate duplicate records
> ---
>
> Key: MAPREDUCE-6558
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch
>
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record 
> duplications as shown in different junit tests. The number of duplicated 
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445, which is the compressed file length. The 
> file is a bzip2 file with 100k blocks and a total of 11 blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2016-05-10 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6558:
-
Attachment: MAPREDUCE-6558.1.patch

A patch with test input that fails before the fix is applied and passes after 
it.

I have run all the tests that were there and they all still pass with this 
change. The tests have been run a large number of times with different input 
splits and none of them have failed.

The use cases that passed before the fix was applied have been documented in 
the code.

> multibyte delimiters with compressed input files generate duplicate records
> ---
>
> Key: MAPREDUCE-6558
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6558.1.patch
>
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record 
> duplications as shown in different junit tests. The number of duplicated 
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445, which is the compressed file length. The 
> file is a bzip2 file with 100k blocks and a total of 11 blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2016-05-10 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6558:
-
Status: Patch Available  (was: Open)

> multibyte delimiters with compressed input files generate duplicate records
> ---
>
> Key: MAPREDUCE-6558
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6558.1.patch
>
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record 
> duplications as shown in different junit tests. The number of duplicated 
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445, which is the compressed file length. The 
> file is a bzip2 file with 100k blocks and a total of 11 blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-2398) MRBench: setting the baseDir parameter has no effect

2016-04-21 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252966#comment-15252966
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-2398:
--

[~qwertymaniac] thank you for the quick review and commit

> MRBench: setting the baseDir parameter has no effect
> 
>
> Key: MAPREDUCE-2398
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2398
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 2.3.0
>Reporter: Michael Noll
>Assignee: Wilfred Spiegelenburg
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: MAPREDUCE-2398-trunk.patch, MAPREDUCE-2398.2.patch, 
> MAPREDUCE-2398_0.20.2.patch, MAPREDUCE-2398_v2-0.20.203.0.patch, 
> MAPREDUCE-2398_v2-trunk.patch
>
>
> The optional {{-baseDir}} parameter lets user specify the base DFS path for 
> output/input of MRBench.
> However, the two private variables {{INPUT_DIR}} and {{OUTPUT_DIR}} 
> (MRBench.java) are not updated in the case that the default value of  
> {{-baseDir}} is actually overwritten by the user. Hence any input and output 
> is always written to the default locations ({{/benchmarks/MRBench/...}}), 
> even though the user-supplied location for {{-baseDir}} is created (and 
> eventually deleted again) on HDFS.
> The bug affects at least Hadoop 0.20.2 and the current trunk (r1082703) as of 
> March 21, 2011.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-2398) MRBench: setting the baseDir parameter has no effect

2016-04-21 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251679#comment-15251679
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-2398:
--

Test failures are not related to this patch:
- TestMRCJCFileOutputCommitter has been failing in multiple builds
- TestUberAM passes for me locally

> MRBench: setting the baseDir parameter has no effect
> 
>
> Key: MAPREDUCE-2398
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2398
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 2.3.0
>Reporter: Michael Noll
>Assignee: Wilfred Spiegelenburg
>Priority: Minor
> Attachments: MAPREDUCE-2398-trunk.patch, MAPREDUCE-2398.2.patch, 
> MAPREDUCE-2398_0.20.2.patch, MAPREDUCE-2398_v2-0.20.203.0.patch, 
> MAPREDUCE-2398_v2-trunk.patch
>
>
> The optional {{-baseDir}} parameter lets user specify the base DFS path for 
> output/input of MRBench.
> However, the two private variables {{INPUT_DIR}} and {{OUTPUT_DIR}} 
> (MRBench.java) are not updated in the case that the default value of  
> {{-baseDir}} is actually overwritten by the user. Hence any input and output 
> is always written to the default locations ({{/benchmarks/MRBench/...}}), 
> even though the user-supplied location for {{-baseDir}} is created (and 
> eventually deleted again) on HDFS.
> The bug affects at least Hadoop 0.20.2 and the current trunk (r1082703) as of 
> March 21, 2011.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-2398) MRBench: setting the baseDir parameter has no effect

2016-04-20 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251168#comment-15251168
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-2398:
--

[~yanghaogn] do you mind if I assign this one to me?


> MRBench: setting the baseDir parameter has no effect
> 
>
> Key: MAPREDUCE-2398
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2398
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 2.3.0
>Reporter: Michael Noll
>Assignee: Yang Hao
>Priority: Minor
> Attachments: MAPREDUCE-2398-trunk.patch, MAPREDUCE-2398.2.patch, 
> MAPREDUCE-2398_0.20.2.patch, MAPREDUCE-2398_v2-0.20.203.0.patch, 
> MAPREDUCE-2398_v2-trunk.patch
>
>
> The optional {{-baseDir}} parameter lets user specify the base DFS path for 
> output/input of MRBench.
> However, the two private variables {{INPUT_DIR}} and {{OUTPUT_DIR}} 
> (MRBench.java) are not updated in the case that the default value of  
> {{-baseDir}} is actually overwritten by the user. Hence any input and output 
> is always written to the default locations ({{/benchmarks/MRBench/...}}), 
> even though the user-supplied location for {{-baseDir}} is created (and 
> eventually deleted again) on HDFS.
> The bug affects at least Hadoop 0.20.2 and the current trunk (r1082703) as of 
> March 21, 2011.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-2398) MRBench: setting the baseDir parameter has no effect

2016-04-20 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-2398:
-
Attachment: MAPREDUCE-2398.2.patch

This seems to have gone stale for a long time; I just ran into it.

I have updated the patch and made sure it works when the value is passed in 
from the command line. As part of the change I also cleaned up the directory 
creation and cleanup.
The INPUT_DIR automatically gets created by the {{generateTextFile()}} call. 
The clean up should not delete the BASE_DIR because it could exist and have 
other data in it. Only the OUTPUT_DIR and INPUT_DIR that were created for the 
run should be removed for cleanup.
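
The essence of the change is that {{INPUT_DIR}} and {{OUTPUT_DIR}} must be recomputed from the parsed {{-baseDir}} value, and cleanup must remove only what the run created. A rough sketch of that logic (illustrative only, not the committed MRBench.java code; the subdirectory names are invented):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of the MAPREDUCE-2398 idea, not the committed code.
class BaseDirSketch {
  private static Path INPUT_DIR;
  private static Path OUTPUT_DIR;

  // Call after parsing -baseDir so the dirs follow the user's choice.
  static void setBaseDir(Path baseDir) {
    INPUT_DIR = new Path(baseDir, "mr_input");   // subdir names illustrative
    OUTPUT_DIR = new Path(baseDir, "mr_output");
  }

  // Remove only what the run created; baseDir itself may hold other data.
  static void cleanup(FileSystem fs) throws IOException {
    fs.delete(INPUT_DIR, true);
    fs.delete(OUTPUT_DIR, true);
  }
}
{code}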

> MRBench: setting the baseDir parameter has no effect
> 
>
> Key: MAPREDUCE-2398
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2398
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: benchmarks
>Affects Versions: 2.3.0
>Reporter: Michael Noll
>Assignee: Yang Hao
>Priority: Minor
> Attachments: MAPREDUCE-2398-trunk.patch, MAPREDUCE-2398.2.patch, 
> MAPREDUCE-2398_0.20.2.patch, MAPREDUCE-2398_v2-0.20.203.0.patch, 
> MAPREDUCE-2398_v2-trunk.patch
>
>
> The optional {{-baseDir}} parameter lets user specify the base DFS path for 
> output/input of MRBench.
> However, the two private variables {{INPUT_DIR}} and {{OUTPUT_DIR}} 
> (MRBench.java) are not updated in the case that the default value of  
> {{-baseDir}} is actually overwritten by the user. Hence any input and output 
> is always written to the default locations ({{/benchmarks/MRBench/...}}), 
> even though the user-supplied location for {{-baseDir}} is created (and 
> eventually deleted again) on HDFS.
> The bug affects at least Hadoop 0.20.2 and the current trunk (r1082703) as of 
> March 21, 2011.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-29 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031214#comment-15031214
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--

Should this be pulled back into 2.7.3 and 2.6.3 based on the fact that [~jlowe] 
pulled MAPREDUCE-6481 into those releases?

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch, 
> MAPREDUCE-6549.3.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2015-11-29 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031213#comment-15031213
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6558:
--

I do not see a problem with that. I am still working on the fix for this.

> multibyte delimiters with compressed input files generate duplicate records
> ---
>
> Key: MAPREDUCE-6558
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record 
> duplications as shown in different junit tests. The number of duplicated 
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445, which is the compressed file length. The 
> file is a bzip2 file with 100k blocks and a total of 11 blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-24 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15026312#comment-15026312
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--

Test failures are not related and are tracked in separate jiras:

testIpcWithReaderQueuing is tracked by HADOOP-10406
testGangliaMetrics2 is tracked in HADOOP-12588
testDeprecatedUmask is tracked in HDFS-9451

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch, 
> MAPREDUCE-6549.3.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-24 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
-
Status: Patch Available  (was: Open)

Updated the patch to fix the NPE in testUncompressedInputCustomDelimiterPosValue.

Checked the license, findbugs and other junit test failures; they are not 
related to the changes from this patch.

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch, 
> MAPREDUCE-6549.3.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-24 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
-
Attachment: MAPREDUCE-6549.3.patch

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch, 
> MAPREDUCE-6549.3.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-24 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
-
Status: Open  (was: Patch Available)

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-24 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024263#comment-15024263
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--

[~zxu] & [~jlowe] can you also please have a look at the patch for the 
uncompressed version?
I have not seen a build being triggered for the patch that was added. That 
might need to be triggered somehow (is the patch naming convention wrong?).

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-24 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024233#comment-15024233
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--

The compressed version is not as easily fixable, and I am opening a new jira 
for that one.

The compressed version does not use the split size the way the uncompressed 
version does. As far as I can tell, the effective split depends on the 
compression codec and the file's encoding/compression blocks, so the configured 
split size is not taken into account as it is in the uncompressed case.

I ran a set of similar junit tests over the compressed data and the changed 
code is not even triggered.

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2015-11-24 Thread Wilfred Spiegelenburg (JIRA)
Wilfred Spiegelenburg created MAPREDUCE-6558:


 Summary: multibyte delimiters with compressed input files generate 
duplicate records
 Key: MAPREDUCE-6558
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.7.2
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg


This is the follow up for MAPREDUCE-6549. Compressed files cause record 
duplications as shown in different junit tests. The number of duplicated 
records changes with the splitsize:

Unexpected number of records in split (splitsize = 10)
Expected: 41051
Actual: 45062

Unexpected number of records in split (splitsize = 10)
Expected: 41051
Actual: 41052

Test passes with splitsize = 147445, which is the compressed file length. The 
file is a bzip2 file with 100k blocks and a total of 11 blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-19 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013993#comment-15013993
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--

I have been able to generate a compressed file which shows the same record 
duplication as was shown in the uncompressed processing. The code, however, 
behaves completely differently in the two cases, since we do not have the same 
kind of buffer-filling process. I am still trying to fix the compressed code 
without breaking the uncompressed code.

I should have a fix for both cases in a day or two.

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-16 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008133#comment-15008133
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--

[~zxu] The change in MAPREDUCE-6481 is not to blame for the duplicate records 
as far as I can tell. It fixed things, and now that we see what is really there 
we notice the duplicates. I had not looked at the compressed input, but I think 
you are correct: compressed input uses the same steps, and we should clear the 
setting in the same way as we did for the uncompressed stream. I will try to 
generate a splittable compressed stream to get a test case, and will upload a 
new patch once I have that.

[~cotedm] An EOF will automatically terminate the record; there is no need for 
a record delimiter at the end of the file. All the tests, and the comments in 
the code, show this. The assumption is that the last record before EOF does not 
need a record terminator. That is not a new assumption; assuming that an EOF 
would not delimit a record would be counter-intuitive. Most text files, for 
instance, do not end the last line with a newline.
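
To make the EOF behaviour concrete, here is a small sketch using {{org.apache.hadoop.util.LineReader}} with a custom delimiter: the final record is returned even though no delimiter follows it.

{code}
import java.io.ByteArrayInputStream;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.util.LineReader;

public class EofDelimiterDemo {
  public static void main(String[] args) throws Exception {
    // "def" is not followed by the "+++" delimiter, but EOF still ends it.
    byte[] data = "abc+++def".getBytes("UTF-8");
    LineReader reader = new LineReader(
        new ByteArrayInputStream(data), "+++".getBytes("UTF-8"));
    Text record = new Text();
    while (reader.readLine(record) > 0) {
      System.out.println(record);  // prints "abc" then "def": two records
    }
    reader.close();
  }
}
{code}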

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-15 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
-
Component/s: mrv2
 mrv1

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-15 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
-
Attachment: MAPREDUCE-6549-2.patch

The issue is related to [MAPREDUCE-6481]. That jira changed the position 
calculation and made sure that the full records are returned by the reader as 
expected, but it did not anticipate the record duplication. The junit tests 
also did not cover the use cases well enough to discover the issue.
As far as I can trace, the problem is limited to multi-byte delimiters.

The junit tests for the multi-byte delimiter only take the best-case scenario 
into account: the input data contained the exact delimiter and no ambiguous 
characters. As soon as the test is changed, either the delimiter or the input 
data, a failure is triggered. The failure does not clearly show when and how 
things go wrong; analysis shows that only a specific combination of input data, 
split size and buffer size triggers it.

Based on testing, the record duplication occurs only if:
- the first character(s) of the delimiter are part of the record data, for 
example:
  1) the delimiter is {{\+=}} and the data contains a {{\+}} that is not 
followed by {{=}}
  2) the delimiter is {{\+=\+=}} and the data contains {{\+=\+}} that is not 
followed by {{=}}
- the delimiter character is found at the split boundary: the last character 
before the split ends
- a fill of the buffer is triggered to finish processing the record

The underlying problem is that we set a flag called {{needAdditionalRecord}} in 
the {{UncompressedSplitLineReader}} when we fill the buffer and have 
encountered part of a delimiter in combination with a split. We keep track of 
this in the ambiguous-character count. However, as it turns out, if the 
character(s) found after that point do not belong to a delimiter, we never 
unset {{needAdditionalRecord}}. This causes the next record to be read twice, 
and thus we see a duplication of records.
The solution is to unset the flag when we detect that we are not processing a 
delimiter. We currently only add the ambiguous characters to the record being 
read and set the count back to 0; at the same point we need to unset the flag.

The patch was developed based on junit tests that exercise the split and buffer 
settings in combination with multiple delimiter types using different inputs. 
All cases now provide a consistent count of records and correct position inside 
the data.
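
To make the mechanism concrete, here is a standalone model of the flag handling described above. It is a sketch of the idea only, not the actual {{UncompressedSplitLineReader}} code; all names except {{needAdditionalRecord}} are invented for illustration.

{code}
// Standalone model of the MAPREDUCE-6549 fix idea; not the Hadoop code.
class DelimiterScanSketch {
  private boolean needAdditionalRecord = false; // set when a split ends mid-delimiter
  private int ambiguousByteCount = 0;           // delimiter-prefix bytes seen so far

  // Called when a buffer fill ends with a partial delimiter match at the split.
  void onPartialDelimiterAtSplitEnd(int matchedBytes) {
    ambiguousByteCount = matchedBytes;
    needAdditionalRecord = true; // we may owe the next split an extra record
  }

  // Called when the following bytes prove we were NOT looking at a delimiter.
  void onDelimiterMatchFailed(StringBuilder record, String ambiguousBytes) {
    record.append(ambiguousBytes); // the "delimiter" bytes were really data
    ambiguousByteCount = 0;
    needAdditionalRecord = false;  // the fix: clear the flag here too,
                                   // otherwise the next record is read twice
  }

  boolean needsAdditionalRecord() {
    return needAdditionalRecord;
  }
}
{code}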

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-15 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
-
Status: Patch Available  (was: Open)

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch, MAPREDUCE-6549-2.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-15 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006080#comment-15006080
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-6549:
--

[~cotedm] I have picked up the jira and have a fully tested and working patch 
for the issue

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-15 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg reassigned MAPREDUCE-6549:


Assignee: Wilfred Spiegelenburg  (was: Dustin Cote)

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6549-1.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6549) multibyte delimiters with LineRecordReader cause duplicate records

2015-11-15 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6549:
-
Status: Open  (was: Patch Available)

I tried the change that you made in the patch and it fails the current tests.
The patch changes one test (TestLineRecordReader.java), but we have two 
versions. The mapred version is unchanged and now fails. The mapreduce version 
works, but as soon as I change the delimiter back it also fails. That means the 
change does not fix the issue.

It also brings the two tests out of sync, which is not correct.

> multibyte delimiters with LineRecordReader cause duplicate records
> --
>
> Key: MAPREDUCE-6549
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6549
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Dustin Cote
>Assignee: Dustin Cote
> Attachments: MAPREDUCE-6549-1.patch
>
>
> LineRecorderReader currently produces duplicate records under certain 
> scenarios such as:
> 1) input string: "abc+++def++ghi++" 
> delimiter string: "+++" 
> test passes with all sizes of the split 
> 2) input string: "abc++def+++ghi++" 
> delimiter string: "+++" 
> test fails with a split size of 4 
> 2) input string: "abc+++def++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 5 
> 3) input string "abc+++defg++hij++" 
> delimiter string: "++" 
> test fails with a split size of 4 
> 4) input string "abc++def+++ghi++" 
> delimiter string: "++" 
> test fails with a split size of 9 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"

2015-05-28 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563932#comment-14563932
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-5965:
--

Can someone please review the latest patch and let me know if it is OK?

> Hadoop streaming throws error if list of input files is high. Error is: 
> "error=7, Argument list too long at if number of input file is high"
> 
>
> Key: MAPREDUCE-5965
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arup Malakar
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.2.patch, 
> MAPREDUCE-5965.3.patch, MAPREDUCE-5965.patch
>
>
> Hadoop streaming exposes all the key values in job conf as environment 
> variables when it forks a process for streaming code to run. Unfortunately 
> the variable mapreduce_input_fileinputformat_inputdir contains the list of 
> input files, and Linux has a limit on size of environment variables + 
> arguments.
> Based on how long the list of files and their full path is this could be 
> pretty huge. And given all of these variables are not even used it stops user 
> from running hadoop job with large number of files, even though it could be 
> run.
> Linux throws E2BIG if the size is greater than certain size which is error 
> code 7. And java translates that to "error=7, Argument list too long". More: 
> http://man7.org/linux/man-pages/man2/execve.2.html I suggest skipping 
> variables if it is greater than certain length. That way if user code 
> requires the environment variable it would fail. It should also introduce a 
> config variable to skip long variables, and set it to false by default. That 
> way user has to specifically set it to true to invoke this feature.
> Here is the exception:
> {code}
> Error: java.lang.RuntimeException: Error in configuring object at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:415) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: 
> java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object 
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 14 
> more Caused by: java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 17 more Caused by: java.lang.RuntimeException: configuration exception at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222) at 
> org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66) ... 22 
> more Caused by: java.io.IOException: Cannot run program 
> "/data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/oo-analytics/appcache/application_1403599726264_13177/container_1403599726264_13177_01_06/./rbenv_runner.sh":
>  error=7, Argument list too long at 
> java.lang.ProcessBuilder.start(ProcessBuilder.java:1041) at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209) ... 23 
> more Caused by: java.io.IOException: error=7, Argument list too long at 
> java.lang.UNIXProcess.forkAndExec(Na

[jira] [Updated] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"

2015-05-25 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-5965:
-
Attachment: MAPREDUCE-5965.3.patch

Updated the patch using the new name and made it an integer as [~djp] proposed. 
The documentation and the usage text printed by StreamJob have been updated to 
show the new option and the values.

To answer the second question: it would be long enough to leave all but the 
problem value alone.

Three values are documented:
-1: do not truncate (default)
0: only copy the key and not the value (side effect of using substring)
2: as a safe value which should prevent the "error=7" issue
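
To illustrate the truncation semantics described above, here is a hedged sketch; the class, method, and variable names are illustrative, not the committed ones (the real change lives in the streaming code). With a negative limit nothing changes, with 0 the value is emptied (key only), and with a positive limit the value is cut with substring before the child environment is built.

{code}
import java.util.Map;

// Sketch of the truncation rule for streaming child environments.
class EnvTruncateSketch {
  static void applyLimit(Map<String, String> childEnv, int lenLimit) {
    if (lenLimit < 0) {
      return; // negative: do not truncate (the default)
    }
    for (Map.Entry<String, String> e : childEnv.entrySet()) {
      String v = e.getValue();
      if (v != null && v.length() > lenLimit) {
        // 0 empties the value; a positive limit keeps a prefix.
        e.setValue(v.substring(0, lenLimit));
      }
    }
  }
}
{code}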

> Hadoop streaming throws error if list of input files is high. Error is: 
> "error=7, Argument list too long at if number of input file is high"
> 
>
> Key: MAPREDUCE-5965
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arup Malakar
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.2.patch, 
> MAPREDUCE-5965.3.patch, MAPREDUCE-5965.patch
>
>
> Hadoop streaming exposes all the key values in job conf as environment 
> variables when it forks a process for streaming code to run. Unfortunately 
> the variable mapreduce_input_fileinputformat_inputdir contains the list of 
> input files, and Linux has a limit on size of environment variables + 
> arguments.
> Based on how long the list of files and their full path is this could be 
> pretty huge. And given all of these variables are not even used it stops user 
> from running hadoop job with large number of files, even though it could be 
> run.
> Linux throws E2BIG if the size is greater than certain size which is error 
> code 7. And java translates that to "error=7, Argument list too long". More: 
> http://man7.org/linux/man-pages/man2/execve.2.html I suggest skipping 
> variables if it is greater than certain length. That way if user code 
> requires the environment variable it would fail. It should also introduce a 
> config variable to skip long variables, and set it to false by default. That 
> way user has to specifically set it to true to invoke this feature.
> Here is the exception:
> {code}
> Error: java.lang.RuntimeException: Error in configuring object at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:415) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: 
> java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object 
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 14 
> more Caused by: java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 17 more Caused by: java.lang.RuntimeException: configuration exception at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222) at 
> org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66) ... 22 
> more Caused by: java.io.IOException: Cannot run program 
> "/data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/o

[jira] [Commented] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"

2015-05-21 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554197#comment-14554197
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-5965:
--

[~amalakar] thank you for the assignment. The comment should be added back; 
I'll do that with an updated patch. Keeping the change in the same method was 
done to make the change as simple as possible.

[~rchiang] 
The streaming configuration does not really have a *-default.xml file. There is 
markdown documentation that shows some of the settings and options; adding it 
to the FAQ would probably be the correct place. There is also help text printed 
by the main StreamJob code that shows most of the options. I will update those 
two files and explain the setting that is available. I can upload a new patch 
with that added, but before I do, let's get the other points finalised.

A white list or black list is possible, but what would we exclude or include? 
The job configuration can contain any value that is too long, since a user can 
set anything they want. It would be really difficult to filter that 
consistently and be sure the fix has limited impact.

Making the lenLimit configurable is possible. However, I do not see what we 
would gain by making the length configurable. The data is not used anywhere, 
and lowering or raising the size at which we cut it off does not give us 
anything extra. If you really want to make it configurable, the easiest way 
would be to roll the two settings into one: make 
stream.truncate.long.jobconf.values an integer, where -1 means do not truncate 
and any other value is the length to truncate at.
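
As a sketch of that rolled-together setting (using the key name as it appears 
in this discussion; the final committed name may differ), reading the integer 
and applying it could look like this:

{code}
// Hypothetical sketch only: read the single integer setting proposed above
// and decide whether a jobconf value needs truncating before it is put
// into the child process environment.
import org.apache.hadoop.conf.Configuration;

public class JobConfEnvSketch {
  // Key name as used in this thread; the final name may differ.
  static final String TRUNCATE_KEY = "stream.truncate.long.jobconf.values";

  static String envValue(Configuration conf, String value) {
    int limit = conf.getInt(TRUNCATE_KEY, -1); // -1 means do not truncate
    if (limit < 0 || value.length() <= limit) {
      return value;
    }
    return value.substring(0, limit);
  }
}
{code}

A user could then choose the limit per job, for example by passing 
-D stream.truncate.long.jobconf.values=20480 on the streaming command line.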

> Hadoop streaming throws error if list of input files is high. Error is: 
> "error=7, Argument list too long at if number of input file is high"
> 
>
> Key: MAPREDUCE-5965
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arup Malakar
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.2.patch, 
> MAPREDUCE-5965.patch
>
>
> Hadoop streaming exposes all the key-value pairs in the job conf as environment 
> variables when it forks a process for the streaming code to run. Unfortunately 
> the variable mapreduce_input_fileinputformat_inputdir contains the list of 
> input files, and Linux has a limit on the combined size of environment 
> variables and arguments.
> Depending on how long the list of files and their full paths is, this can be 
> pretty huge. And given that these variables are not even used, this stops users 
> from running a hadoop job with a large number of files, even though the job 
> could otherwise run.
> Linux returns E2BIG, which is error code 7, if the size exceeds a certain 
> limit, and Java translates that to "error=7, Argument list too long". More: 
> http://man7.org/linux/man-pages/man2/execve.2.html I suggest skipping a 
> variable if it is longer than a certain length. That way, if user code 
> requires the environment variable, it would fail. It should also introduce a 
> config variable to skip long variables, set to false by default. That way the 
> user has to specifically set it to true to invoke this feature.
> Here is the exception:
> {code}
> Error: java.lang.RuntimeException: Error in configuring object at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:415) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: 
> java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object 
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apa

[jira] [Commented] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"

2015-05-18 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549661#comment-14549661
 ] 

Wilfred Spiegelenburg commented on MAPREDUCE-5965:
--

Arup: do you mind if I assign this jira to myself? I would like to get it 
fixed in an upcoming release.

> Hadoop streaming throws error if list of input files is high. Error is: 
> "error=7, Argument list too long at if number of input file is high"
> 
>
> Key: MAPREDUCE-5965
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arup Malakar
>Assignee: Arup Malakar
> Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.2.patch, 
> MAPREDUCE-5965.patch
>
>
> Hadoop streaming exposes all the key-value pairs in the job conf as environment 
> variables when it forks a process for the streaming code to run. Unfortunately 
> the variable mapreduce_input_fileinputformat_inputdir contains the list of 
> input files, and Linux has a limit on the combined size of environment 
> variables and arguments.
> Depending on how long the list of files and their full paths is, this can be 
> pretty huge. And given that these variables are not even used, this stops users 
> from running a hadoop job with a large number of files, even though the job 
> could otherwise run.
> Linux returns E2BIG, which is error code 7, if the size exceeds a certain 
> limit, and Java translates that to "error=7, Argument list too long". More: 
> http://man7.org/linux/man-pages/man2/execve.2.html I suggest skipping a 
> variable if it is longer than a certain length. That way, if user code 
> requires the environment variable, it would fail. It should also introduce a 
> config variable to skip long variables, set to false by default. That way the 
> user has to specifically set it to true to invoke this feature.
> Here is the exception:
> {code}
> Error: java.lang.RuntimeException: Error in configuring object at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:415) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: 
> java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object 
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 14 
> more Caused by: java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 17 more Caused by: java.lang.RuntimeException: configuration exception at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222) at 
> org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66) ... 22 
> more Caused by: java.io.IOException: Cannot run program 
> "/data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/oo-analytics/appcache/application_1403599726264_13177/container_1403599726264_13177_01_06/./rbenv_runner.sh":
>  error=7, Argument list too long at 
> java.lang.ProcessBuilder.start(ProcessBuilder.java:1041) at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209) ... 23 
> more Caused by: java.io.IOException: error=7, Argument list too long at 
> java.lang.UNIXProcess.forkAndExec(Native 

[jira] [Updated] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"

2015-05-18 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-5965:
-
Status: Patch Available  (was: Open)

> Hadoop streaming throws error if list of input files is high. Error is: 
> "error=7, Argument list too long at if number of input file is high"
> 
>
> Key: MAPREDUCE-5965
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arup Malakar
>Assignee: Arup Malakar
> Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.2.patch, 
> MAPREDUCE-5965.patch
>
>
> Hadoop streaming exposes all the key-value pairs in the job conf as environment 
> variables when it forks a process for the streaming code to run. Unfortunately 
> the variable mapreduce_input_fileinputformat_inputdir contains the list of 
> input files, and Linux has a limit on the combined size of environment 
> variables and arguments.
> Depending on how long the list of files and their full paths is, this can be 
> pretty huge. And given that these variables are not even used, this stops users 
> from running a hadoop job with a large number of files, even though the job 
> could otherwise run.
> Linux returns E2BIG, which is error code 7, if the size exceeds a certain 
> limit, and Java translates that to "error=7, Argument list too long". More: 
> http://man7.org/linux/man-pages/man2/execve.2.html I suggest skipping a 
> variable if it is longer than a certain length. That way, if user code 
> requires the environment variable, it would fail. It should also introduce a 
> config variable to skip long variables, set to false by default. That way the 
> user has to specifically set it to true to invoke this feature.
> Here is the exception:
> {code}
> Error: java.lang.RuntimeException: Error in configuring object at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:415) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: 
> java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object 
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 14 
> more Caused by: java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 17 more Caused by: java.lang.RuntimeException: configuration exception at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222) at 
> org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66) ... 22 
> more Caused by: java.io.IOException: Cannot run program 
> "/data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/oo-analytics/appcache/application_1403599726264_13177/container_1403599726264_13177_01_06/./rbenv_runner.sh":
>  error=7, Argument list too long at 
> java.lang.ProcessBuilder.start(ProcessBuilder.java:1041) at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209) ... 23 
> more Caused by: java.io.IOException: error=7, Argument list too long at 
> java.lang.UNIXProcess.forkAndExec(Native Method) at 
> java.lang.UNIXProcess.(UNIXProcess.java:135) at 
> java.lang.ProcessImpl.start(ProcessImpl.java:130) at

[jira] [Updated] (MAPREDUCE-5965) Hadoop streaming throws error if list of input files is high. Error is: "error=7, Argument list too long at if number of input file is high"

2015-05-18 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-5965:
-
Attachment: MAPREDUCE-5965.2.patch

Ran into the same issue. Re-based and cleaned up the patch; it does the same 
as the Hive patch (truncates the environment value).

> Hadoop streaming throws error if list of input files is high. Error is: 
> "error=7, Argument list too long at if number of input file is high"
> 
>
> Key: MAPREDUCE-5965
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5965
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Arup Malakar
>Assignee: Arup Malakar
> Attachments: MAPREDUCE-5965.1.patch, MAPREDUCE-5965.2.patch, 
> MAPREDUCE-5965.patch
>
>
> Hadoop streaming exposes all the key-value pairs in the job conf as environment 
> variables when it forks a process for the streaming code to run. Unfortunately 
> the variable mapreduce_input_fileinputformat_inputdir contains the list of 
> input files, and Linux has a limit on the combined size of environment 
> variables and arguments.
> Depending on how long the list of files and their full paths is, this can be 
> pretty huge. And given that these variables are not even used, this stops users 
> from running a hadoop job with a large number of files, even though the job 
> could otherwise run.
> Linux returns E2BIG, which is error code 7, if the size exceeds a certain 
> limit, and Java translates that to "error=7, Argument list too long". More: 
> http://man7.org/linux/man-pages/man2/execve.2.html I suggest skipping a 
> variable if it is longer than a certain length. That way, if user code 
> requires the environment variable, it would fail. It should also introduce a 
> config variable to skip long variables, set to false by default. That way the 
> user has to specifically set it to true to invoke this feature.
> Here is the exception:
> {code}
> Error: java.lang.RuntimeException: Error in configuring object at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:415) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: 
> java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object 
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) 
> at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
> at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 14 
> more Caused by: java.lang.reflect.InvocationTargetException at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606) at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
> ... 17 more Caused by: java.lang.RuntimeException: configuration exception at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:222) at 
> org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66) ... 22 
> more Caused by: java.io.IOException: Cannot run program 
> "/data/hadoop/hadoop-yarn/cache/yarn/nm-local-dir/usercache/oo-analytics/appcache/application_1403599726264_13177/container_1403599726264_13177_01_06/./rbenv_runner.sh":
>  error=7, Argument list too long at 
> java.lang.ProcessBuilder.start(ProcessBuilder.java:1041) at 
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:209) ... 23 
> more Caused by: java.io.IOException: error=7, Argument list too long at 
> java.lang.UNIXProcess.forkAndExe