[jira] [Commented] (MAPREDUCE-7141) Allow KMS generated spill encryption keys

2021-06-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356485#comment-17356485
 ] 

Jim Brennan commented on MAPREDUCE-7141:


[~ahussein] the current PR does not apply to trunk.  Can you please update it?


> Allow KMS generated spill encryption keys
> -
>
> Key: MAPREDUCE-7141
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7141
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>Reporter: Kuhu Shukla
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Attachments: MAPREDUCE-7141-001.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, the only way an encryption key for task spills is generated is by 
> the AM's key generator. This JIRA tracks the work required to add KMS support 
> for generating this key, providing fault tolerance across AM failures/re-runs 
> and giving the client another option for how the keys are created.
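
For context, the existing AM-side path amounts to generating a random secret with javax.crypto and handing it to the tasks through the job credentials; the proposal is to let a KMS produce (and persist) that key instead. A minimal sketch of the AM-local generation, with the algorithm, key length, and class name shown as assumptions rather than the exact Hadoop code:
{code:java}
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class SpillKeySketch {
  public static void main(String[] args) throws Exception {
    // Sketch: the AM creates a random HMAC key and would place the encoded
    // bytes into the job Credentials so map/reduce tasks can encrypt and
    // decrypt their spill files. If the AM dies, this key is lost, which is
    // why a KMS-backed key is attractive for AM re-runs.
    KeyGenerator keyGen = KeyGenerator.getInstance("HmacSHA1");
    keyGen.init(128);
    SecretKey spillKey = keyGen.generateKey();
    System.out.println("Generated spill key: " + spillKey.getEncoded().length + " bytes");
  }
}
{code}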






[jira] [Updated] (MAPREDUCE-7332) Fix SpillCallBackPathsFinder to use JDK7 on branch-2.10

2021-03-26 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7332:
---
Fix Version/s: 2.10.2
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks for the fix [~ahussein]!  I've committed this to branch-2.10.

 

> Fix SpillCallBackPathsFinder to use JDK7 on branch-2.10
> ---
>
> Key: MAPREDUCE-7332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7332
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 2.10.2
>
> Attachments: MAPREDUCE-7332.branch-2.10.001.patch, 
> MAPREDUCE-7332.branch-2.10.002.patch
>
>
> I mistakenly uploaded a patch for branch-2.10 that uses JDK8.
> Yetus did not fail though. It should be investigated why it was not failing 
> if JDK8+ is used in the code.






[jira] [Commented] (MAPREDUCE-7332) Fix SpillCallBackPathsFinder to use JDK7 on branch-2.10

2021-03-26 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309454#comment-17309454
 ] 

Jim Brennan commented on MAPREDUCE-7332:


Given that this is an isolated change that affects only one unit test, I am ok 
with using the patch in this instance.  We really need to get the precommit 
build working for branch-2.10 though.

 

> Fix SpillCallBackPathsFinder to use JDK7 on branch-2.10
> ---
>
> Key: MAPREDUCE-7332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7332
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: MAPREDUCE-7332.branch-2.10.001.patch, 
> MAPREDUCE-7332.branch-2.10.002.patch
>
>
> I mistakenly uploaded a patch for branch-2.10 that uses JDK8.
> Yetus did not fail though. It should be investigated why it was not failing 
> if JDK8+ is used in the code.






[jira] [Comment Edited] (MAPREDUCE-7332) Fix SpillCallBackPathsFinder to use JDK7 on branch-2.10

2021-03-25 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308974#comment-17308974
 ] 

Jim Brennan edited comment on MAPREDUCE-7332 at 3/25/21, 9:11 PM:
--

[~aajisaka] do you know why this failure happens?
{noformat}


Re-exec mode detected. Continuing.


ERROR: Unprocessed flag(s): --spotbugs-strict-precheck {noformat}


was (Author: jim_brennan):
[~aajisaka] do you know why this failure happens?
{noformat}


Re-exec mode detected. Continuing.

ERROR:
 Unprocessed flag(s): --spotbugs-strict-precheck {noformat}

> Fix SpillCallBackPathsFinder to use JDK7 on branch-2.10
> ---
>
> Key: MAPREDUCE-7332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7332
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: MAPREDUCE-7332.branch-2.10.001.patch
>
>
> I mistakenly uploaded a patch for branch-2.10 that uses JDK8.
> Yetus did not fail though. It should be investigated why it was not failing 
> if JDK8+ is used in the code.






[jira] [Commented] (MAPREDUCE-7332) Fix SpillCallBackPathsFinder to use JDK7 on branch-2.10

2021-03-25 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308974#comment-17308974
 ] 

Jim Brennan commented on MAPREDUCE-7332:


[~aajisaka] do you know why this failure happens?
{noformat}


Re-exec mode detected. Continuing.

ERROR:
 Unprocessed flag(s): --spotbugs-strict-precheck {noformat}

> Fix SpillCallBackPathsFinder to use JDK7 on branch-2.10
> ---
>
> Key: MAPREDUCE-7332
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7332
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Attachments: MAPREDUCE-7332.branch-2.10.001.patch
>
>
> I mistakenly uploaded a patch for branch-2.10 that uses JDK8.
> Yetus did not fail though. It should be investigated why it was not failing 
> if JDK8+ is used in the code.






[jira] [Updated] (MAPREDUCE-7325) Intermediate data encryption is broken in LocalJobRunner

2021-03-22 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7325:
---
Fix Version/s: 3.2.3
   2.10.2
   3.3.1
   3.1.5
   3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks for the contribution [~ahussein]!  I have committed this to all branches 
from trunk through branch-2.10.


> Intermediate data encryption is broken in LocalJobRunner
> 
>
> Key: MAPREDUCE-7325
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7325
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3
>
> Attachments: MAPREDUCE-7325.001.patch, MAPREDUCE-7325.002.patch
>
>
> With intermediate encryption enabled, running a job using LocalJobRunner 
> with multiple reducers fails with the following error:
>  
> {code:bash}
> 2021-02-23 18:18:05,145 WARN  [Thread-2381] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1344328673_0004
> java.lang.Exception: 
> org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
> shuffle in localfetcher#5
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:559)
> Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: 
> error in shuffle in localfetcher#5
>   at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:136)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:377)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:347)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.fs.ChecksumException: Checksum Error
>   at 
> org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:229)
>   at 
> org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:153)
>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:210)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:91)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.IFileWrappedMapOutput.shuffle(IFileWrappedMapOutput.java:63)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:156)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:103)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:86)
> {code}
> The bug can be reproduced with any unit test, such as TestLocalJobSubmission.
> Another way to reproduce the bug is to run {{LargeSorter}} with multiple 
> reducers.
> {code:java}
> Configuration config = new Configuration();
> // set all the necessary configurations
> config.setInt(LargeSorter.NUM_REDUCE_TASKS,2);
> String[] args = new String[] {"output-dir"};
> int res = ToolRunner.run(config, new LargeSorter(), args);
> {code}
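
For a runnable version of the snippet above, the missing pieces are the imports and the switch that turns intermediate encryption on; a sketch, assuming {{LargeSorter}} from the jobclient test jar is on the classpath (package/import shown as assumed):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.LargeSorter;
import org.apache.hadoop.mapreduce.MRJobConfig;
import org.apache.hadoop.util.ToolRunner;

public class LargeSorterRepro {
  public static void main(String[] args) throws Exception {
    Configuration config = new Configuration();
    // Turn on intermediate (spill) data encryption; with multiple reducers
    // under LocalJobRunner this is what triggers the ChecksumException above.
    config.setBoolean(MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA, true);
    config.setInt(LargeSorter.NUM_REDUCE_TASKS, 2);
    int res = ToolRunner.run(config, new LargeSorter(), new String[] {"output-dir"});
    System.exit(res);
  }
}
{code}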






[jira] [Commented] (MAPREDUCE-7325) Intermediate data encryption is broken in LocalJobRunner

2021-03-22 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306260#comment-17306260
 ] 

Jim Brennan commented on MAPREDUCE-7325:


Thanks for the fix [~ahussein]!  +1 This change looks good to me.   I will 
commit later today.


> Intermediate data encryption is broken in LocalJobRunner
> 
>
> Key: MAPREDUCE-7325
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7325
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: MAPREDUCE-7325.001.patch, MAPREDUCE-7325.002.patch
>
>
> With intermediate encryption enabled, running a job using LocalJobRunner 
> with multiple reducers fails with the following error:
>  
> {code:bash}
> 2021-02-23 18:18:05,145 WARN  [Thread-2381] mapred.LocalJobRunner 
> (LocalJobRunner.java:run(590)) - job_local1344328673_0004
> java.lang.Exception: 
> org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
> shuffle in localfetcher#5
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:559)
> Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: 
> error in shuffle in localfetcher#5
>   at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:136)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:377)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:347)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.fs.ChecksumException: Checksum Error
>   at 
> org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:229)
>   at 
> org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:153)
>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:210)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:91)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.IFileWrappedMapOutput.shuffle(IFileWrappedMapOutput.java:63)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:156)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:103)
>   at 
> org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:86)
> {code}
> The bug can be reproduced with any unit test, such as TestLocalJobSubmission.
> Another way to reproduce the bug is to run {{LargeSorter}} with multiple 
> reducers.
> {code:java}
> Configuration config = new Configuration();
> // set all the necessary configurations
> config.setInt(LargeSorter.NUM_REDUCE_TASKS,2);
> String[] args = new String[] {"output-dir"};
> int res = ToolRunner.run(config, new LargeSorter(), args);
> {code}






[jira] [Updated] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-16 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7322:
---
Fix Version/s: 3.2.3
   2.10.2
   3.1.5
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~ahussein]!  I have committed this to all branches from trunk through 
branch-2.10.


> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3
>
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, 
> MAPREDUCE-7322.006.patch, MAPREDUCE-7322.007.patch, MAPREDUCE-7322.008.patch, 
> MAPREDUCE-7322.009.patch, MAPREDUCE-7322.branch-2.10.009.patch, 
> MAPREDUCE-7322.branch-3.2.009.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually 
> has little to do with encryption.
> I have the following conclusions:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * No spill files are generated by either the mappers or the reducers.
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-16 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302591#comment-17302591
 ] 

Jim Brennan commented on MAPREDUCE-7322:


[~ahussein],  -MAPREDUCE-7320- has already been backported all the way to 
branch-2.10.
Let me know if you want to provide additional patches for this one.


> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Fix For: 3.4.0, 3.3.1
>
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, 
> MAPREDUCE-7322.006.patch, MAPREDUCE-7322.007.patch, MAPREDUCE-7322.008.patch, 
> MAPREDUCE-7322.009.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually 
> has little to do with encryption.
> I have the following conclusions:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * No spill files are generated by either the mappers or the reducers.
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-15 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302010#comment-17302010
 ] 

Jim Brennan commented on MAPREDUCE-7322:


Thanks for the contribution [~ahussein]!  I have committed this to trunk and 
branch-3.3, but the patch does not apply to branch-3.2.    Can you please 
provide a patch for 3.2 and any earlier branches you want this pulled back to?

 

> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Fix For: 3.4.0, 3.3.1
>
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, 
> MAPREDUCE-7322.006.patch, MAPREDUCE-7322.007.patch, MAPREDUCE-7322.008.patch, 
> MAPREDUCE-7322.009.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually 
> has little to do with encryption.
> I have the following conclusions:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * No spill files are generated by either the mappers or the reducers.
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Updated] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-15 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7322:
---
Fix Version/s: 3.3.1
   3.4.0

> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Fix For: 3.4.0, 3.3.1
>
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, 
> MAPREDUCE-7322.006.patch, MAPREDUCE-7322.007.patch, MAPREDUCE-7322.008.patch, 
> MAPREDUCE-7322.009.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually 
> has little to do with encryption.
> I have the following conclusions:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * No spill files are generated by either the mappers or the reducers.
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-12 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300597#comment-17300597
 ] 

Jim Brennan commented on MAPREDUCE-7322:


Thanks for all the work on this [~ahussein]!  I am +1 on patch 009, pending 
precommit build results.

 

> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, 
> MAPREDUCE-7322.006.patch, MAPREDUCE-7322.007.patch, MAPREDUCE-7322.008.patch, 
> MAPREDUCE-7322.009.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually 
> has little to do with encryption.
> I have the following conclusions:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * No spill files are generated by either the mappers or the reducers.
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-10 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299154#comment-17299154
 ] 

Jim Brennan commented on MAPREDUCE-7322:


A couple more comments. 
* Instead of putting the new files in mapreduce/lib/spill, I think putting them 
in mapreduce/security might make more sense?
* Is it possible in TestMRIntermediateDataEncryption to run the test once 
without spill encryption enabled and verify that it produces the same output?
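
One way to express that second point is a parameterized run over both settings; a rough sketch (hypothetical test name and structure, assuming the JUnit 4 {{Parameterized}} runner):
{code:java}
import java.util.Arrays;
import java.util.Collection;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;

// Hypothetical sketch: run the same job twice, once with spill encryption
// enabled and once disabled, and compare the outputs.
@RunWith(Parameterized.class)
public class TestSpillEncryptionToggle {
  @Parameterized.Parameters(name = "encryptSpills={0}")
  public static Collection<Object[]> params() {
    return Arrays.asList(new Object[][] {{true}, {false}});
  }

  private final boolean encryptSpills;

  public TestSpillEncryptionToggle(boolean encryptSpills) {
    this.encryptSpills = encryptSpills;
  }

  @Test
  public void testJobOutputMatches() throws Exception {
    // Hypothetical: set MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA to
    // this.encryptSpills, run the job, and compare the output against a
    // known-good baseline.
  }
}
{code}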

> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, 
> MAPREDUCE-7322.006.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test has 
> actually little to do with encryption.
> I have the following conclusion:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * There are no spill files generated by either mappers/reducers
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-10 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299144#comment-17299144
 ] 

Jim Brennan commented on MAPREDUCE-7322:


{quote}
I think that I should add a configuration to set the class implementation of 
the callback injector.

This would allow starting the runtime on a cluster with the injector enabled.

I ruled out making the configuration per Job, as this would imply parsing the 
JobConf every time (not a no-op for production).
 {quote}
I'm not sure this is a good idea.  Sounds like a potential security hole.
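
For illustration, the kind of wiring being discussed would look roughly like the sketch below (the config key, interface, and class names are hypothetical, not from the patch); it also shows why a cluster-wide setting is sensitive, since whatever class the configuration names gets instantiated inside the task JVM:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical config key and types, for illustration only.
public class SpillInjectorFactory {
  public static final String SPILL_INJECTOR_CLASS_KEY =
      "mapreduce.task.spill.injector.class";

  public interface SpillCallbackInjector {
    void onSpill(String path);
  }

  public static class NoOpInjector implements SpillCallbackInjector {
    @Override
    public void onSpill(String path) {
      // production default: do nothing
    }
  }

  public static SpillCallbackInjector create(Configuration conf) {
    // Whatever class is named in the config is loaded and instantiated here,
    // which is the security concern raised above.
    Class<? extends SpillCallbackInjector> clazz = conf.getClass(
        SPILL_INJECTOR_CLASS_KEY, NoOpInjector.class, SpillCallbackInjector.class);
    return ReflectionUtils.newInstance(clazz, conf);
  }
}
{code}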

> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch, 
> MAPREDUCE-7322.006.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually 
> has little to do with encryption.
> I have the following conclusions:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * No spill files are generated by either the mappers or the reducers.
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Commented] (MAPREDUCE-7322) revisiting TestMRIntermediateDataEncryption

2021-03-10 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299031#comment-17299031
 ] 

Jim Brennan commented on MAPREDUCE-7322:


Great work [~ahussein]. Overall this looks excellent. Just a few minor comments:

GenericTestUtils
 * I’m not sure why we need convertCamelToHyphen(). I can understand that the 
directory names would be more readable, but they would no longer match the test 
case method name, making it harder to match up the test dirs with the test 
cases?  Am I missing something?
 * The methods relating to initializing spill encryption configs do not belong 
in Common. These need to be somewhere in the MAPREDUCE project. 
initEncryptedIntermediateConfigs(), setLocalDirectoriesConfig()

MapTask
 * remove TODO comment (as you noted)

TestGenericTestUtils
 * not sure about need for testConvertCamelCase() (see above)
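
On the {{convertCamelToHyphen()}} point above, the conversion in question is essentially a one-line regex pass; a hypothetical stand-in, not the patch's actual implementation:
{code:java}
// Hypothetical stand-in for convertCamelToHyphen(), for illustration only:
// "testMapSpillEncryption" -> "test-map-spill-encryption"
static String camelToHyphen(String name) {
  return name.replaceAll("([a-z0-9])([A-Z])", "$1-$2")
             .toLowerCase(java.util.Locale.ROOT);
}
{code}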

> revisiting TestMRIntermediateDataEncryption 
> 
>
> Key: MAPREDUCE-7322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7322
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: job submission, security, test
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: patch-available
> Attachments: MAPREDUCE-7322.001.patch, MAPREDUCE-7322.002.patch, 
> MAPREDUCE-7322.003.patch, MAPREDUCE-7322.004.patch, MAPREDUCE-7322.005.patch
>
>
> I was reviewing {{TestMRIntermediateDataEncryption}}. The unit test actually 
> has little to do with encryption.
> I have the following conclusions:
> * Enabling/Disabling {{MRJobConfig.MR_ENCRYPTED_INTERMEDIATE_DATA}} does not 
> change the behavior of the unit test.
> * No spill files are generated by either the mappers or the reducers.
> * Wrapping I/O streams with Crypto never happens during the execution of the 
> unit test.
> Unless I misunderstand the purpose of that unit test, I suggest that it gets 
> re-implemented so that it validates encryption in spilled intermediate data.






[jira] [Resolved] (MAPREDUCE-7320) ClusterMapReduceTestCase does not clean directories

2021-02-26 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan resolved MAPREDUCE-7320.

Fix Version/s: 3.2.3
   2.10.2
   3.3.1
   3.1.5
   3.4.0
   Resolution: Fixed

Thanks for fixing this [~ahussein]!  I have committed this to all branches 
from trunk through branch-2.10.

> ClusterMapReduceTestCase does not clean directories
> ---
>
> Key: MAPREDUCE-7320
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7320
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Running JUnit tests that extend {{ClusterMapReduceTestCase}} generates lots of 
> directories and folders without cleaning them up.
> For example:
> {code:bash}
> mvn test -Dtest=TestMRJobClient{code}
> generates the following directories:
> {code:bash}
> - target
>-+ ConfigurableMiniMRCluster_315090884
>-+ ConfigurableMiniMRCluster_1335188990
>-+ ConfigurableMiniMRCluster_1973037511
>-+ test-dir
> -+ dfs
> -+ hadoop-XYZ-01
> -+ hadoop-XYZ-02 
> -+ hadoop-XYZ-03
> {code}






[jira] [Updated] (MAPREDUCE-7320) ClusterMapReduceTestCase does not clean directories

2021-02-24 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7320:
---
Flags:   (was: Patch)

> ClusterMapReduceTestCase does not clean directories
> ---
>
> Key: MAPREDUCE-7320
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7320
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Running JUnit tests that extend {{ClusterMapReduceTestCase}} generates lots of 
> directories and folders without cleaning them up.
> For example:
> {code:bash}
> mvn test -Dtest=TestMRJobClient{code}
> generates the following directories:
> {code:bash}
> - target
>-+ ConfigurableMiniMRCluster_315090884
>-+ ConfigurableMiniMRCluster_1335188990
>-+ ConfigurableMiniMRCluster_1973037511
>-+ test-dir
> -+ dfs
> -+ hadoop-XYZ-01
> -+ hadoop-XYZ-02 
> -+ hadoop-XYZ-03
> {code}






[jira] [Updated] (MAPREDUCE-7320) ClusterMapReduceTestCase does not clean directories

2021-02-24 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7320:
---
Flags: Patch

> ClusterMapReduceTestCase does not clean directories
> ---
>
> Key: MAPREDUCE-7320
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7320
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Running JUnit tests that extend {{ClusterMapReduceTestCase}} generates lots of 
> directories and folders without cleaning them up.
> For example:
> {code:bash}
> mvn test -Dtest=TestMRJobClient{code}
> generates the following directories:
> {code:bash}
> - target
>-+ ConfigurableMiniMRCluster_315090884
>-+ ConfigurableMiniMRCluster_1335188990
>-+ ConfigurableMiniMRCluster_1973037511
>-+ test-dir
> -+ dfs
> -+ hadoop-XYZ-01
> -+ hadoop-XYZ-02 
> -+ hadoop-XYZ-03
> {code}






[jira] [Commented] (MAPREDUCE-7320) ClusterMapReduceTestCase does not clean directories

2021-02-22 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288420#comment-17288420
 ] 

Jim Brennan commented on MAPREDUCE-7320:


I would prefer to clean up at the start of the test.  I know we have had cases 
in the past where I needed to look at these logs after our automated unit test 
builds.  I wouldn't want to have to modify the code to enable that.
I am curious how others feel about this?
cc: [~epayne], [~jeagles], [~ebadger], [~jhung]

> ClusterMapReduceTestCase does not clean directories
> ---
>
> Key: MAPREDUCE-7320
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7320
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Running JUnit tests that extend {{ClusterMapReduceTestCase}} generates lots of 
> directories and folders without cleaning them up.
> For example:
> {code:bash}
> mvn test -Dtest=TestMRJobClient{code}
> generates the following directories:
> {code:bash}
> - target
>-+ ConfigurableMiniMRCluster_315090884
>-+ ConfigurableMiniMRCluster_1335188990
>-+ ConfigurableMiniMRCluster_1973037511
>-+ test-dir
> -+ dfs
> -+ hadoop-XYZ-01
> -+ hadoop-XYZ-02 
> -+ hadoop-XYZ-03
> {code}






[jira] [Commented] (MAPREDUCE-7320) ClusterMapReduceTestCase does not clean directories

2021-02-19 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287385#comment-17287385
 ] 

Jim Brennan commented on MAPREDUCE-7320:


[~ahussein] Thanks for cleaning this up.  But I always thought keeping these 
around was intentional?  If you get a failure, it is sometimes useful to be 
able to find the logs of the mini cluster (especially job/container logs).
In cases like this, you would want to make sure the top level directory is 
predictable (which you have done), but you would want to delete them at the 
beginning of the test instead of the end.
This would still leave a bunch of files around after running the unit tests, 
but a subsequent run would first delete the old stuff.  What do you think?
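
A minimal sketch of that approach (delete the previous run's directory in setup rather than teardown), assuming a deterministic test-directory name; the class and directory names here are illustrative:
{code:java}
import java.io.File;
import org.apache.hadoop.fs.FileUtil;
import org.junit.Before;

public class ExampleClusterTest {
  // Illustrative, deterministic location; the point is that cleanup happens
  // before the test, so logs survive until the next run but never accumulate.
  private static final File TEST_DIR =
      new File(System.getProperty("test.build.data", "target/test-dir"),
               ExampleClusterTest.class.getSimpleName());

  @Before
  public void cleanPreviousRun() {
    // Remove whatever the previous run left behind, then recreate the dir.
    FileUtil.fullyDelete(TEST_DIR);
    TEST_DIR.mkdirs();
  }
}
{code}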



> ClusterMapReduceTestCase does not clean directories
> ---
>
> Key: MAPREDUCE-7320
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7320
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Running JUnit tests that extend {{ClusterMapReduceTestCase}} generates lots of 
> directories and folders without cleaning them up.
> For example:
> {code:bash}
> mvn test -Dtest=TestMRJobClient{code}
> generates the following directories:
> {code:bash}
> - target
>-+ ConfigurableMiniMRCluster_315090884
>-+ ConfigurableMiniMRCluster_1335188990
>-+ ConfigurableMiniMRCluster_1973037511
>-+ test-dir
> -+ dfs
> -+ hadoop-XYZ-01
> -+ hadoop-XYZ-02 
> -+ hadoop-XYZ-03
> {code}






[jira] [Commented] (MAPREDUCE-7319) Log list of mappers at trace level in ShuffleHandler audit log

2021-02-09 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282035#comment-17282035
 ] 

Jim Brennan commented on MAPREDUCE-7319:


Thanks [~ebadger]!

> Log list of mappers at trace level in ShuffleHandler audit log
> --
>
> Key: MAPREDUCE-7319
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7319
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3
>
> Attachments: MAPREDUCE-7319.001.patch
>
>
> [MAPREDUCE-6958] added the content length to ShuffleHandler audit log, which 
> is logged at DEBUG level.  After enabling it, we found that the list of 
> mappers for large jobs was filling up our audit logs.  It would be good to 
> move the list of mappers to TRACE level to reduce the logging impact without 
> disabling the log message entirely.
> For example a log message like this:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072 mappers: 
> [attempt_1512479762132_1318600_1_00_004852_0_10003,
> attempt_1512479762132_1318600_1_00_004190_0_10003, 
> attempt_1512479762132_1318600_1_00_004393_0_10003, 
> attempt_1512479762132_1318600_1_00_005057_0_10003, 
> attempt_1512479762132_1318600_1_00_004855_0_10002,
> attempt_1512479762132_1318600_1_00_003976_0_10003, 
> attempt_1512479762132_1318600_1_00_004058_0_10003, 
> attempt_1512479762132_1318600_1_00_004355_0_10003, 
> attempt_1512479762132_1318600_1_00_004436_0_10002,
> attempt_1512479762132_1318600_1_00_004854_0_10003, 
> attempt_1512479762132_1318600_1_00_005174_0_10004, 
> attempt_1512479762132_1318600_1_00_003972_0_10002, 
> attempt_1512479762132_1318600_1_00_004853_0_10002,
> attempt_1512479762132_1318600_1_00_004856_0_10002]
> {noformat}
> Would become this with 
> {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=DEBUG}}:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072
> {noformat}
> And this with 
> {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=TRACE}}:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072
> 2018-01-25 23:43:02,669 [New I/O worker #1] TRACE ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 mappers: 
> [attempt_1512479762132_1318600_1_00_004852_0_10003,
> attempt_1512479762132_1318600_1_00_004190_0_10003, 
> attempt_1512479762132_1318600_1_00_004393_0_10003, 
> attempt_1512479762132_1318600_1_00_005057_0_10003, 
> attempt_1512479762132_1318600_1_00_004855_0_10002,
> attempt_1512479762132_1318600_1_00_003976_0_10003, 
> attempt_1512479762132_1318600_1_00_004058_0_10003, 
> attempt_1512479762132_1318600_1_00_004355_0_10003, 
> attempt_1512479762132_1318600_1_00_004436_0_10002,
> attempt_1512479762132_1318600_1_00_004854_0_10003, 
> attempt_1512479762132_1318600_1_00_005174_0_10004, 
> attempt_1512479762132_1318600_1_00_003972_0_10002, 
> attempt_1512479762132_1318600_1_00_004853_0_10002,
> attempt_1512479762132_1318600_1_00_004856_0_10002]
> {noformat}
> One question is whether there are any downstream consumers of this audit log 
> that might have a problem with this change?
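
For reference, the described change boils down to keeping the short line at DEBUG and demoting only the mapper list to TRACE; a sketch of that pattern (variable names assumed, not the exact patch):
{code:java}
// Sketch of the DEBUG/TRACE split described above; AUDITLOG, jobId, reduceId,
// contentLength and mapIds stand in for the real ShuffleHandler fields.
if (AUDITLOG.isDebugEnabled()) {
  AUDITLOG.debug("shuffle for " + jobId + " reducer " + reduceId
      + " length " + contentLength);
  if (AUDITLOG.isTraceEnabled()) {
    AUDITLOG.trace("shuffle for " + jobId + " mappers: " + mapIds);
  }
}
{code}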






[jira] [Commented] (MAPREDUCE-7319) Log list of mappers at trace level in ShuffleHandler audit log

2021-02-09 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281905#comment-17281905
 ] 

Jim Brennan commented on MAPREDUCE-7319:


I don't think a unit test is needed for this log message change.
[~ebadger] can you please review?


> Log list of mappers at trace level in ShuffleHandler audit log
> --
>
> Key: MAPREDUCE-7319
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7319
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: MAPREDUCE-7319.001.patch
>
>
> [MAPREDUCE-6958] added the content length to ShuffleHandler audit log, which 
> is logged at DEBUG level.  After enabling it, we found that the list of 
> mappers for large jobs was filling up our audit logs.  It would be good to 
> move the list of mappers to TRACE level to reduce the logging impact without 
> disabling the log message entirely.
> For example a log message like this:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072 mappers: 
> [attempt_1512479762132_1318600_1_00_004852_0_10003,
> attempt_1512479762132_1318600_1_00_004190_0_10003, 
> attempt_1512479762132_1318600_1_00_004393_0_10003, 
> attempt_1512479762132_1318600_1_00_005057_0_10003, 
> attempt_1512479762132_1318600_1_00_004855_0_10002,
> attempt_1512479762132_1318600_1_00_003976_0_10003, 
> attempt_1512479762132_1318600_1_00_004058_0_10003, 
> attempt_1512479762132_1318600_1_00_004355_0_10003, 
> attempt_1512479762132_1318600_1_00_004436_0_10002,
> attempt_1512479762132_1318600_1_00_004854_0_10003, 
> attempt_1512479762132_1318600_1_00_005174_0_10004, 
> attempt_1512479762132_1318600_1_00_003972_0_10002, 
> attempt_1512479762132_1318600_1_00_004853_0_10002,
> attempt_1512479762132_1318600_1_00_004856_0_10002]
> {noformat}
> Would become this with 
> {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=DEBUG}}:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072
> {noformat}
> And this with 
> {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=TRACE}}:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072
> 2018-01-25 23:43:02,669 [New I/O worker #1] TRACE ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 mappers: 
> [attempt_1512479762132_1318600_1_00_004852_0_10003,
> attempt_1512479762132_1318600_1_00_004190_0_10003, 
> attempt_1512479762132_1318600_1_00_004393_0_10003, 
> attempt_1512479762132_1318600_1_00_005057_0_10003, 
> attempt_1512479762132_1318600_1_00_004855_0_10002,
> attempt_1512479762132_1318600_1_00_003976_0_10003, 
> attempt_1512479762132_1318600_1_00_004058_0_10003, 
> attempt_1512479762132_1318600_1_00_004355_0_10003, 
> attempt_1512479762132_1318600_1_00_004436_0_10002,
> attempt_1512479762132_1318600_1_00_004854_0_10003, 
> attempt_1512479762132_1318600_1_00_005174_0_10004, 
> attempt_1512479762132_1318600_1_00_003972_0_10002, 
> attempt_1512479762132_1318600_1_00_004853_0_10002,
> attempt_1512479762132_1318600_1_00_004856_0_10002]
> {noformat}
> One question is whether there are any downstream consumers of this audit log 
> that might have a problem with this change?






[jira] [Updated] (MAPREDUCE-7319) Log list of mappers at trace level in ShuffleHandler audit log

2021-02-09 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7319:
---
Status: Patch Available  (was: Open)

> Log list of mappers at trace level in ShuffleHandler audit log
> --
>
> Key: MAPREDUCE-7319
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7319
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: MAPREDUCE-7319.001.patch
>
>
> [MAPREDUCE-6958] added the content length to ShuffleHandler audit log, which 
> is logged at DEBUG level.  After enabling it, we found that the list of 
> mappers for large jobs was filling up our audit logs.  It would be good to 
> move the list of mappers to TRACE level to reduce the logging impact without 
> disabling the log message entirely.
> For example a log message like this:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072 mappers: 
> [attempt_1512479762132_1318600_1_00_004852_0_10003,
> attempt_1512479762132_1318600_1_00_004190_0_10003, 
> attempt_1512479762132_1318600_1_00_004393_0_10003, 
> attempt_1512479762132_1318600_1_00_005057_0_10003, 
> attempt_1512479762132_1318600_1_00_004855_0_10002,
> attempt_1512479762132_1318600_1_00_003976_0_10003, 
> attempt_1512479762132_1318600_1_00_004058_0_10003, 
> attempt_1512479762132_1318600_1_00_004355_0_10003, 
> attempt_1512479762132_1318600_1_00_004436_0_10002,
> attempt_1512479762132_1318600_1_00_004854_0_10003, 
> attempt_1512479762132_1318600_1_00_005174_0_10004, 
> attempt_1512479762132_1318600_1_00_003972_0_10002, 
> attempt_1512479762132_1318600_1_00_004853_0_10002,
> attempt_1512479762132_1318600_1_00_004856_0_10002]
> {noformat}
> Would become this with 
> {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=DEBUG}}:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072
> {noformat}
> And this with 
> {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=TRACE}}:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072
> 2018-01-25 23:43:02,669 [New I/O worker #1] TRACE ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 mappers: 
> [attempt_1512479762132_1318600_1_00_004852_0_10003,
> attempt_1512479762132_1318600_1_00_004190_0_10003, 
> attempt_1512479762132_1318600_1_00_004393_0_10003, 
> attempt_1512479762132_1318600_1_00_005057_0_10003, 
> attempt_1512479762132_1318600_1_00_004855_0_10002,
> attempt_1512479762132_1318600_1_00_003976_0_10003, 
> attempt_1512479762132_1318600_1_00_004058_0_10003, 
> attempt_1512479762132_1318600_1_00_004355_0_10003, 
> attempt_1512479762132_1318600_1_00_004436_0_10002,
> attempt_1512479762132_1318600_1_00_004854_0_10003, 
> attempt_1512479762132_1318600_1_00_005174_0_10004, 
> attempt_1512479762132_1318600_1_00_003972_0_10002, 
> attempt_1512479762132_1318600_1_00_004853_0_10002,
> attempt_1512479762132_1318600_1_00_004856_0_10002]
> {noformat}
> One question is whether there are any downstream consumers of this audit log 
> that might have a problem with this change?






[jira] [Updated] (MAPREDUCE-7319) Log list of mappers at trace level in ShuffleHandler audit log

2021-02-08 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7319:
---
Attachment: MAPREDUCE-7319.001.patch

> Log list of mappers at trace level in ShuffleHandler audit log
> --
>
> Key: MAPREDUCE-7319
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7319
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: MAPREDUCE-7319.001.patch
>
>
> [MAPREDUCE-6958] added the content length to ShuffleHandler audit log, which 
> is logged at DEBUG level.  After enabling it, we found that the list of 
> mappers for large jobs was filling up our audit logs.  It would be good to 
> move the list of mappers to TRACE level to reduce the logging impact without 
> disabling the log message entirely.
> For example a log message like this:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072 mappers: 
> [attempt_1512479762132_1318600_1_00_004852_0_10003,
> attempt_1512479762132_1318600_1_00_004190_0_10003, 
> attempt_1512479762132_1318600_1_00_004393_0_10003, 
> attempt_1512479762132_1318600_1_00_005057_0_10003, 
> attempt_1512479762132_1318600_1_00_004855_0_10002,
> attempt_1512479762132_1318600_1_00_003976_0_10003, 
> attempt_1512479762132_1318600_1_00_004058_0_10003, 
> attempt_1512479762132_1318600_1_00_004355_0_10003, 
> attempt_1512479762132_1318600_1_00_004436_0_10002,
> attempt_1512479762132_1318600_1_00_004854_0_10003, 
> attempt_1512479762132_1318600_1_00_005174_0_10004, 
> attempt_1512479762132_1318600_1_00_003972_0_10002, 
> attempt_1512479762132_1318600_1_00_004853_0_10002,
> attempt_1512479762132_1318600_1_00_004856_0_10002]
> {noformat}
> Would become this with 
> {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=DEBUG}}:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072
> {noformat}
> And this with 
> {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=TRACE}}:
> {noformat}
> 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 reducer 241 length 482072
> 2018-01-25 23:43:02,669 [New I/O worker #1] TRACE ShuffleHandler.audit: 
> shuffle for job_1512479762132_1318600 mappers: 
> [attempt_1512479762132_1318600_1_00_004852_0_10003,
> attempt_1512479762132_1318600_1_00_004190_0_10003, 
> attempt_1512479762132_1318600_1_00_004393_0_10003, 
> attempt_1512479762132_1318600_1_00_005057_0_10003, 
> attempt_1512479762132_1318600_1_00_004855_0_10002,
> attempt_1512479762132_1318600_1_00_003976_0_10003, 
> attempt_1512479762132_1318600_1_00_004058_0_10003, 
> attempt_1512479762132_1318600_1_00_004355_0_10003, 
> attempt_1512479762132_1318600_1_00_004436_0_10002,
> attempt_1512479762132_1318600_1_00_004854_0_10003, 
> attempt_1512479762132_1318600_1_00_005174_0_10004, 
> attempt_1512479762132_1318600_1_00_003972_0_10002, 
> attempt_1512479762132_1318600_1_00_004853_0_10002,
> attempt_1512479762132_1318600_1_00_004856_0_10002]
> {noformat}
> One question is whether there are any downstream consumers of this audit log 
> that might have a problem with this change?






[jira] [Created] (MAPREDUCE-7319) Log list of mappers at trace level in ShuffleHandler audit log

2021-02-08 Thread Jim Brennan (Jira)
Jim Brennan created MAPREDUCE-7319:
--

 Summary: Log list of mappers at trace level in ShuffleHandler 
audit log
 Key: MAPREDUCE-7319
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7319
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: yarn
Affects Versions: 3.4.0
Reporter: Jim Brennan
Assignee: Jim Brennan


[MAPREDUCE-6958] added the content length to ShuffleHandler audit log, which is 
logged at DEBUG level.  After enabling it, we found that the list of mappers 
for large jobs was filling up our audit logs.  It would be good to move the 
list of mappers to TRACE level to reduce the logging impact without disabling 
the log message entirely.

For example a log message like this:
{noformat}
2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: shuffle 
for job_1512479762132_1318600 reducer 241 length 482072 mappers: 
[attempt_1512479762132_1318600_1_00_004852_0_10003,
attempt_1512479762132_1318600_1_00_004190_0_10003, 
attempt_1512479762132_1318600_1_00_004393_0_10003, 
attempt_1512479762132_1318600_1_00_005057_0_10003, 
attempt_1512479762132_1318600_1_00_004855_0_10002,
attempt_1512479762132_1318600_1_00_003976_0_10003, 
attempt_1512479762132_1318600_1_00_004058_0_10003, 
attempt_1512479762132_1318600_1_00_004355_0_10003, 
attempt_1512479762132_1318600_1_00_004436_0_10002,
attempt_1512479762132_1318600_1_00_004854_0_10003, 
attempt_1512479762132_1318600_1_00_005174_0_10004, 
attempt_1512479762132_1318600_1_00_003972_0_10002, 
attempt_1512479762132_1318600_1_00_004853_0_10002,
attempt_1512479762132_1318600_1_00_004856_0_10002]
{noformat}
Would become this with 
{{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=DEBUG}}:
{noformat}
2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: shuffle 
for job_1512479762132_1318600 reducer 241 length 482072
{noformat}
And this with 
{{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=TRACE}}:
{noformat}
2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: shuffle 
for job_1512479762132_1318600 reducer 241 length 482072
2018-01-25 23:43:02,669 [New I/O worker #1] TRACE ShuffleHandler.audit: shuffle 
for job_1512479762132_1318600 mappers: 
[attempt_1512479762132_1318600_1_00_004852_0_10003,
attempt_1512479762132_1318600_1_00_004190_0_10003, 
attempt_1512479762132_1318600_1_00_004393_0_10003, 
attempt_1512479762132_1318600_1_00_005057_0_10003, 
attempt_1512479762132_1318600_1_00_004855_0_10002,
attempt_1512479762132_1318600_1_00_003976_0_10003, 
attempt_1512479762132_1318600_1_00_004058_0_10003, 
attempt_1512479762132_1318600_1_00_004355_0_10003, 
attempt_1512479762132_1318600_1_00_004436_0_10002,
attempt_1512479762132_1318600_1_00_004854_0_10003, 
attempt_1512479762132_1318600_1_00_005174_0_10004, 
attempt_1512479762132_1318600_1_00_003972_0_10002, 
attempt_1512479762132_1318600_1_00_004853_0_10002,
attempt_1512479762132_1318600_1_00_004856_0_10002]
{noformat}
One question is whether there are any downstream consumers of this audit log 
that might have a problem with this change?







[jira] [Commented] (MAPREDUCE-7282) MR v2 commit algorithm should be deprecated and not the default

2020-09-23 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201046#comment-17201046
 ] 

Jim Brennan commented on MAPREDUCE-7282:


I am -1 on the proposal to remove the v2 algorithm.   Please see [~jlowe]'s 
 
[comment|https://issues.apache.org/jira/browse/MAPREDUCE-4815?focusedCommentId=14271115&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14271115]
 from the original discussion on [MAPREDUCE-4815].

We have been running with the v2 algorithm in production on large clusters for 
years at Verizon Media (Yahoo).   I don't think it is appropriate to remove it.

> MR v2 commit algorithm should be deprecated and not the default
> ---
>
> Key: MAPREDUCE-7282
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7282
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.3.0, 3.2.1, 3.1.3, 3.3.1
>Reporter: Steve Loughran
>Priority: Major
>
> The v2 MR commit algorithm moves files from the task attempt dir into the 
> dest dir on task commit, one by one.
> It is therefore not atomic:
> # if a task commit fails partway through and another task attempt commits 
> -unless exactly the same filenames are used, output of the first attempt may 
> be included in the final result
> # if a worker partitions partway through task commit, and then continues 
> after another attempt has committed, it may partially overwrite the output 
> -even when the filenames are the same
> Both MR and Spark assume that task commits are atomic. Either the engines 
> need to account for the fact that this is not the case, or we add a way to 
> probe for a committer that supports atomic task commit and both engines add 
> handling for task commit failures (probably failing the job).
> Better: we remove this as the default, and maybe also warn when it is being 
> used
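
To make the non-atomicity concrete, here is a rough sketch of the difference, 
not the actual FileOutputCommitter code: a v2-style task commit promotes each 
file into the destination one rename at a time, so a failure partway through 
leaves a partial mix of files visible, while a v1-style task commit is a single 
directory rename whose results only become final at job commit. All names 
below are illustrative.
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Rough illustration only; the real algorithms live in FileOutputCommitter.
public class CommitSketch {

  // v2-style: move files one by one into the destination at task commit.
  // A failure between renames leaves some files promoted and some not.
  static void commitTaskV2Style(FileSystem fs, Path taskAttemptDir, Path destDir)
      throws IOException {
    for (FileStatus st : fs.listStatus(taskAttemptDir)) {
      fs.rename(st.getPath(), new Path(destDir, st.getPath().getName()));
    }
  }

  // v1-style: one directory rename at task commit; the committed task dirs
  // are merged into the destination later, at job commit.
  static void commitTaskV1Style(FileSystem fs, Path taskAttemptDir,
      Path committedTaskDir) throws IOException {
    fs.rename(taskAttemptDir, committedTaskDir);
  }
}
{code}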



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7277) IndexCache totalMemoryUsed differs from cache contents.

2020-04-27 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093811#comment-17093811
 ] 

Jim Brennan commented on MAPREDUCE-7277:


Thanks for the update [~jeagles]!  I'm +1 (non-binding) on patch 004.

 

> IndexCache totalMemoryUsed differs from cache contents.
> ---
>
> Key: MAPREDUCE-7277
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7277
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
> Attachments: IndexCacheActualSize.png, MAPREDUCE-7277.001.patch, 
> MAPREDUCE-7277.002.patch, MAPREDUCE-7277.003.patch, MAPREDUCE-7277.004.patch
>
>
> It was observed recently in a nodemanager OOM that the memory was filled with 
> SpillRecords. However, the IndexCache was only 15% full (1.5MB used on a 10MB 
> configured cache size). In particular, it was noted that the bookkeeping 
> variable totalMemoryUsed was out of sync with the contents of the cache, 
> showing 96% full, thereby drastically reducing the effectiveness of the cache.
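
The failure mode is plain bookkeeping drift: the cache adds an entry's size to 
totalMemoryUsed when the entry is loaded and must subtract exactly the same 
amount when the entry is removed; any path that skips or repeats one side 
leaves the counter inflated and the cache evicting while mostly empty. A 
stripped-down sketch of that pattern (not the real IndexCache code):
{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Stripped-down bookkeeping sketch; not the real IndexCache.
public class IndexCacheSketch {
  private final ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();
  private final AtomicLong totalMemoryUsed = new AtomicLong();

  void put(String mapId, byte[] spillRecordBytes) {
    cache.put(mapId, spillRecordBytes);
    totalMemoryUsed.addAndGet(spillRecordBytes.length);
  }

  void remove(String mapId) {
    byte[] removed = cache.remove(mapId);
    if (removed != null) {
      // The decrement must mirror the increment exactly. If any removal or
      // failed-load path skips this, totalMemoryUsed drifts upward even
      // though the cache itself holds very little.
      totalMemoryUsed.addAndGet(-removed.length);
    }
  }

  long used() {
    return totalMemoryUsed.get();
  }
}
{code}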



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7277) IndexCache totalMemoryUsed differs from cache contents.

2020-04-27 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093658#comment-17093658
 ] 

Jim Brennan commented on MAPREDUCE-7277:


Thanks for the update [~jeagles]! Patch 03 looks good to me.
 One nit:
 Typo in comment:
{code:java}
  /** This method should only be called upon successful removal of mapId from
   * the queue. The mapId will the be removed from the queue and
   * totalUsedMemory will be decremented.
{code}
I think that should be "The mapid will *-the-* be removed from the *cache* and 
..."

I am +1 (non-binding) on patch 003.

 

> IndexCache totalMemoryUsed differs from cache contents.
> ---
>
> Key: MAPREDUCE-7277
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7277
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
> Attachments: IndexCacheActualSize.png, MAPREDUCE-7277.001.patch, 
> MAPREDUCE-7277.002.patch, MAPREDUCE-7277.003.patch
>
>
> It was observed recently in a nodemanager OOM that the memory was filled with 
> SpillRecords. However, the IndexCache was only 15% full (1.5MB used on a 10MB 
> configured cache size). In particular was noted that the booking variable 
> totalMemoryUsed, was out of sync with the contents of the cache showing 96% 
> full, thereby drastically reducing the effectiveness of the cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7277) IndexCache totalMemoryUsed differs from cache contents.

2020-04-22 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089839#comment-17089839
 ] 

Jim Brennan commented on MAPREDUCE-7277:


Thanks for the patch [~jeagles]!

Question about readIndexFileToCache() in the case where we throw IOException if 
we fail to construct the SpillRecord.  In the old code, we just executed the 
finally clause and then threw the IOException without doing the queue.add() and 
total memory update.  In the new code, we are still doing the queue.add() in 
this case.  Was this intended?

Also, I don't understand why you call checkTotalMemoryUsed() here - is this 
left over debug code?


> IndexCache totalMemoryUsed differs from cache contents.
> ---
>
> Key: MAPREDUCE-7277
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7277
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
> Attachments: IndexCacheActualSize.png, MAPREDUCE-7277.001.patch
>
>
> It was observed recently in a nodemanager OOM that the memory was filled with 
> SpillRecords. However, the IndexCache was only 15% full (1.5MB used on a 10MB 
> configured cache size). In particular was noted that the booking variable 
> totalMemoryUsed, was out of sync with the contents of the cache showing 96% 
> full, thereby drastically reducing the effectiveness of the cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7172) Wildcard functionality of -libjar is broken when jars are located in same remote FS

2020-02-19 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040460#comment-17040460
 ] 

Jim Brennan commented on MAPREDUCE-7172:


[~leftnoteasy], [~templedf] any update on this?  While running regression tests 
on branch-2.10 with our grid data management tool, it looks like we hit a 
similar issue.  I've found that setting 
{{mapreduce.client.libjars.wildcard=false}} fixes it.

In the case that was failing for us, tmpjars had:
{noformat}
/user/dfsload/.gdm/gq1/replication/opengdm/shipjars/replication-core.jar,
/user/dfsload/.gdm/gq1/replication/opengdm/jobjars/replication-distcopy.jar
/user/dfsload/.gdm/gq1/replication/opengdm/jobjars/acquisition-loader.jar,
etc...
{noformat}
But mapreduce.job.cache.files was just 
"hdfs://n2:8020/user/dfsload/.staging/job_1581958257895_0005/libjars/*"
and we were failing with a ClassNotFoundException.
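
For reference, the workaround can be applied per job either with 
-Dmapreduce.client.libjars.wildcard=false on the command line or 
programmatically; a minimal sketch (the property name is the one mentioned 
above, everything else is illustrative):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Illustrative: disable the libjars wildcard so each jar is added to the
// task classpath individually instead of relying on the wildcard entry.
public class LibjarsWildcardWorkaround {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("mapreduce.client.libjars.wildcard", false);
    Job job = Job.getInstance(conf, "example-job");
    // ... set jar, mapper, reducer, input/output paths as usual ...
  }
}
{code}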


> Wildcard functionality of -libjar is broken when jars are located in same 
> remote FS
> ---
>
> Key: MAPREDUCE-7172
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7172
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
>
> We recently found that when -libjars specifies jars on the same remote FS, 
> the jars are not properly added to the classpath. 
> The reason is that MAPREDUCE-6719 added the wildcard functionality, but the 
> following logic assumes all files are placed under the job's submission 
> directory (inside JobResourceUploader):
> {code:java}
> if (useWildcard && !foundFragment) {
>   // Add the whole directory to the cache using a wild card
>   Path libJarsDirWildcard =
>   jtFs.makeQualified(new Path(libjarsDir, DistributedCache.WILDCARD));
>   DistributedCache.addCacheFile(libJarsDirWildcard.toUri(), conf);
> }{code}
> However, in the same method, the specified resources are only uploaded when 
> the two FSes are different; see copyRemoteFiles:
> {code:java}
> if (FileUtil.compareFs(remoteFs, jtFs)) {
>   return originalPath;
> } {code}
> The workaround for this issue is to pass
> mapreduce.client.libjars.wildcard = false
> when the MR job is launched. 
> An example command line to reproduce this issue: 
> {code:java}
> hadoop jar abc.jar org.ABC -libjars 
> "wasb://host/path1/jar1,wasb://host/path2/jar2..."{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers

2019-02-04 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16759996#comment-16759996
 ] 

Jim Brennan commented on MAPREDUCE-7180:


[~belugabehr] I agree that there are use cases where this would be useful.  But 
there are also cases where the automatic growth would not be desirable.   For 
example:


 # On a large multi-tenant cluster that serves a large number of users
 # Hourly MapReduce job with 100 Mappers @ 1GB/container (for a total of 100GB)
 # All mappers fail first attempt
 # All 100 mappers are retried with 2 GB containers
 # Job completes successfully, still well within required time limits for the 
job.
 # Because the job did not fail and was not late, nobody complains.

In this scenario, we waste 100 GB of cluster capacity for some period of time 
every hour.  Possibly more, if the maps really only needed an additional 512MB, 
for example.    Ideally, the user responsible for this job would notice the 
large number of map failures, and follow-up, but this does not always happen in 
a timely fashion.  If the job fails the first time it goes over its memory 
limit, the problem will more likely be addressed sooner and avoid wasting 
cluster resources.

 
{quote}I could see a new configuration called growth-factor which specifies how 
large to grow the container each time it is re-tried. This would be a 
percentage, therefore, a growth-factor of 1.0f (100%) would preserve the 
current behavior.
{quote}
I think something like this would work.
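
A sketch of how such a growth factor might be applied when re-requesting a 
failed attempt's container. The method, parameter names, and capping policy 
are hypothetical, not existing MapReduce AM code; a growth factor of 1.0 
preserves today's fixed-size behavior.
{code:java}
// Hypothetical sizing helper; not existing MapReduce AM code.
public class RetryMemorySketch {
  /**
   * @param baseMemMb    memory originally requested for the task, in MB
   * @param attempt      0 for the first attempt, 1 for the first retry, ...
   * @param growthFactor 1.0 keeps the current fixed-size behavior
   * @param maxMemMb     cap on a single container, in MB
   */
  static int memoryForAttempt(int baseMemMb, int attempt,
      double growthFactor, int maxMemMb) {
    double mem = baseMemMb * Math.pow(growthFactor, attempt);
    return (int) Math.min(maxMemMb, Math.ceil(mem));
  }

  public static void main(String[] args) {
    // 1 GB mapper, 50% growth per retry, capped at 8 GB:
    // attempt 0 -> 1024 MB, attempt 1 -> 1536 MB, attempt 2 -> 2304 MB.
    for (int attempt = 0; attempt < 3; attempt++) {
      System.out.println(memoryForAttempt(1024, attempt, 1.5, 8192));
    }
  }
}
{code}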

 

> Relaunching Failed Containers
> -
>
> Key: MAPREDUCE-7180
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Reporter: BELUGA BEHR
>Priority: Major
>
> In my experience, it is very common that a MR job completely fails because a 
> single Mapper/Reducer container is using more memory than has been reserved 
> in YARN.  The following message is logged by the MapReduce 
> ApplicationMaster:
> {code}
> Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] 
> is running beyond physical memory limits. 
> Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual 
> memory used. Killing container.
> {code}
> In this case, the container is re-launched on another node, and of course, it 
> is killed again for the same reason.  This process happens three (maybe 
> four?) times before the entire MapReduce job fails.  It's often said that the 
> definition of insanity is doing the same thing over and over and expecting 
> different results.
> For all intents and purposes, the amount of resources requested by Mappers 
> and Reducers is a fixed amount; based on the default configuration values.  
> Users can set the memory on a per-job basis, but it's a pain, not exact, and 
> requires intimate knowledge of the MapReduce framework and its memory usage 
> patterns.
> I propose that if the MR ApplicationMaster detects that a container is killed 
> because of this specific memory resource constraint, that it requests a 
> larger container for the subsequent task attempt.
> For example, increase the requested memory size by 50% each time the 
> container fails and the task is retried.  This will prevent many Job failures 
> and allow for additional memory tuning, per-Job, after the fact, to get 
> better performance (v.s. fail/succeed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7180) Relaunching Failed Containers

2019-02-01 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758408#comment-16758408
 ] 

Jim Brennan commented on MAPREDUCE-7180:


[~belugabehr] While this sounds like a good idea, my concern is that this will 
allow undersized jobs to continue to silently work, using additional cluster 
resources every time they run.  In our experience, users often don't fix their 
jobs in the presence of failures that succeed on retries, so problems like this 
can go on for a long time.   I think if we do a feature like this, it should be 
an opt-in configuration option.

> Relaunching Failed Containers
> -
>
> Key: MAPREDUCE-7180
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7180
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mrv1, mrv2
>Reporter: BELUGA BEHR
>Priority: Major
>
> In my experience, it is very common that a MR job completely fails because a 
> single Mapper/Reducer container is using more memory than has been reserved 
> in YARN.  The following message is logged by the MapReduce 
> ApplicationMaster:
> {code}
> Container [pid=46028,containerID=container_e54_1435155934213_16721_01_003666] 
> is running beyond physical memory limits. 
> Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual 
> memory used. Killing container.
> {code}
> In this case, the container is re-launched on another node, and of course, it 
> is killed again for the same reason.  This process happens three (maybe 
> four?) times before the entire MapReduce job fails.  It's often said that the 
> definition of insanity is doing the same thing over and over and expecting 
> different results.
> For all intents and purposes, the amount of resources requested by Mappers 
> and Reducers is a fixed amount; based on the default configuration values.  
> Users can set the memory on a per-job basis, but it's a pain, not exact, and 
> requires intimate knowledge of the MapReduce framework and its memory usage 
> patterns.
> I propose that if the MR ApplicationMaster detects that a container is killed 
> because of this specific memory resource constraint, that it requests a 
> larger container for the subsequent task attempt.
> For example, increase the requested memory size by 50% each time the 
> container fails and the task is retried.  This will prevent many Job failures 
> and allow for additional memory tuning, per-Job, after the fact, to get 
> better performance (v.s. fail/succeed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6948) TestJobImpl.testUnusableNodeTransition failed

2018-07-17 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16546675#comment-16546675
 ] 

Jim Brennan commented on MAPREDUCE-6948:


[~haibochen], [~jlowe], based on my last comment from May, I propose we close 
this and re-open if it recurs.

 

> TestJobImpl.testUnusableNodeTransition failed
> -
>
> Key: MAPREDUCE-6948
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6948
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Jim Brennan
>Priority: Major
>  Labels: unit-test
>
> *Error Message*
> expected: but was:
> *Stacktrace*
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:1041)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition(TestJobImpl.java:615)
> *Standard out*
> {code}
> 2017-08-30 10:12:21,928 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2017-08-30 10:12:21,939 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$StubbedJob
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.jobhistory.EventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,941 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1534)) - Adding job token for job_123456789_0001 to 
> jobTokenSecretManager
> 2017-08-30 10:12:21,941 WARN  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1540)) - Shuffle secret key missing from job credentials. 
> Using job token secret as shuffle secret.
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:makeUberDecision(1305)) - Not uberizing job_123456789_0001 
> because: not enabled;
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createMapTasks(1562)) - Input size for job 
> job_123456789_0001 = 0. Number of splits = 2
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createReduceTasks(1579)) - Number of reduces for job 
> job_123456789_0001 = 1
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from NEW 
> to INITED
> 2017-08-30 10:12:21,946 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> INITED to SETUP
> 2017-08-30 10:12:21,954 INFO  [CommitterEvent Processor #0] 
> commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - 
> Processing the event EventType: JOB_SETUP
> 2017-08-30 10:12:21,978 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> SETUP to RUNNING
> 2017-08-30 10:12:21,983 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$5
> 2017-08-30 10:12:22,000 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 1
> 2017-08-30 10:12:22,029 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 2
> 2017-08-30 10:12:22,032 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:actOnUnusableNode(1354)) - TaskAttempt killed because it ran on 
> unusable node Mock for NodeId, hashCode: 1280187896. 
> AttemptId:attempt_123456789_0001

[jira] [Commented] (MAPREDUCE-6948) TestJobImpl.testUnusableNodeTransition failed

2018-05-01 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459808#comment-16459808
 ] 

Jim Brennan commented on MAPREDUCE-6948:


[~jlowe], [~haibochen], I tried to reproduce this again in branch-2.7 and 
branch-3.0.   I ran this test in a loop for more than an hour on each, and was 
not able to get it to fail.

In branch-2.7, I can get it to fail as reported by inserting a sleep into the 
taskAttemptEventHandler right before it decrements the succeededMapperCount().  
 If I do the same thing in branch-3.0, it does not fail (due to the 
aforementioned fixes in [MAPREDUCE-6675]).

I searched for this test in apache issues, and the only reference I found was 
this comment by [~haibochen] in [MAPREDUCE-6937]:

{quote}
+1 on the branch-2.7-v5 patch. I have filed MAPREDUCE-6948 for the unit test 
failure TestJobImpl.testUnusableNodeTransition. 
{quote}

That comment looks like it is in response to branch-2.7 test failures.  
hadoop.mapreduce.v2.app.job.impl.TestJobImpl is only listed as failing in the 
branch-2.7 tests.

[~haibochen], did you reproduce the failure on 3.0.0-alpha4?  Is it possible 
this was intended to be filed against branch 2.7 as a result of the test 
failures from [MAPREDUCE-6937]?

The race does still appear to exist on branch 2.7, so we may want to consider 
pulling [MAPREDUCE-6675] back to that branch.
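
For anyone trying to reproduce this, the technique was simply to widen the 
race window with an injected delay in the test's event handler, so that the 
job-state assertion runs before the completed-task counter is adjusted. The 
sketch below is a generic, hypothetical illustration of that pattern; the 
handler and counter names are not the real TestJobImpl ones.
{code:java}
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of widening a race window in a unit test.
public class RaceWindowSketch {
  static final AtomicInteger completedTasks = new AtomicInteger(2);

  // Stands in for the test's task-attempt event handler.
  static void onTaskAttemptKilled() {
    try {
      // Injected delay: gives the assertion thread time to observe the stale
      // count, which is the ordering that makes the branch-2.7 test fail.
      Thread.sleep(500);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    completedTasks.decrementAndGet();
  }

  public static void main(String[] args) throws Exception {
    Thread handler = new Thread(RaceWindowSketch::onTaskAttemptKilled);
    handler.start();
    // The check that races with the handler in the real test.
    System.out.println("completed tasks seen by the assert: "
        + completedTasks.get());
    handler.join();
  }
}
{code}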


> TestJobImpl.testUnusableNodeTransition failed
> -
>
> Key: MAPREDUCE-6948
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6948
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Jim Brennan
>Priority: Major
>  Labels: unit-test
>
> *Error Message*
> expected: but was:
> *Stacktrace*
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:1041)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition(TestJobImpl.java:615)
> *Standard out*
> {code}
> 2017-08-30 10:12:21,928 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2017-08-30 10:12:21,939 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$StubbedJob
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.jobhistory.EventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,941 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1534)) - Adding job token for job_123456789_0001 to 
> jobTokenSecretManager
> 2017-08-30 10:12:21,941 WARN  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1540)) - Shuffle secret key missing from job credentials. 
> Using job token secret as shuffle secret.
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:makeUberDecision(1305)) - Not uberizing job_123456789_0001 
> because: not enabled;
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createMapTasks(1562)) - Input size for job 
> job_123456789_0001 = 0. Number of splits = 2
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createReduceTasks(1579)) - Number of reduces for job 
> job_123456789_0001 = 1
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from NEW 
> to INITED
> 2017-08-30 10:12:21,946 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> INI

[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-11 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434641#comment-16434641
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe], unit test is unrelated.  This is ready for review.

 

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch, 
> MAPREDUCE-7069.006.patch, MAPREDUCE-7069.007.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}
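
As a quick illustration of the proposed syntax (the variable names and values 
below are made up), the per-variable form lets a value contain commas, which 
the legacy comma-separated form cannot express:
{code:java}
import org.apache.hadoop.conf.Configuration;

// Made-up variables and values, shown only to contrast the two forms.
public class PerVarEnvExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // Legacy form: the comma separates variables, so a comma inside a value
    // cannot be expressed.
    conf.set("mapreduce.map.env", "MODE=batch,TMPDIR=/tmp");

    // Per-variable form proposed here: one property per variable, so commas
    // inside the value are kept literally.
    conf.set("mapreduce.map.env.MY_LIST", "a,b,c");

    System.out.println(conf.get("mapreduce.map.env.MY_LIST")); // prints a,b,c
  }
}
{code}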



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-11 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434225#comment-16434225
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe], [~shaneku...@gmail.com], thanks again for the reviews, and thanks 
[~shaneku...@gmail.com] for testing the patch.  I have put up a new patch that 
I believe addresses the documentation issues.


> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch, 
> MAPREDUCE-7069.006.patch, MAPREDUCE-7069.007.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-11 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7069:
---
Attachment: MAPREDUCE-7069.007.patch

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch, 
> MAPREDUCE-7069.006.patch, MAPREDUCE-7069.007.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-11 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434026#comment-16434026
 ] 

Jim Brennan commented on MAPREDUCE-7069:


I can do this for some of them, but the base properties for 
{{mapreduce.map.env}} and {{mapreduce.reduce.env}} are already commented out so 
that they don't override {{mapreduce.child.env}}.  Do you think it is 
appropriate to put the documentation for those in the description for 
{{mapreduce.child.env}}?



> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch, 
> MAPREDUCE-7069.006.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-11 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433948#comment-16433948
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe], [~shaneku...@gmail.com], thanks for the reviews!   I am working to 
update the documentation and I'll put up another patch soon.
{quote}Looking at the docs generated at 
[http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml]
 I'm wondering if it would be more useful to add the description for the .[var] 
form of the properties to the base property instead of having separate, 
commented-out properties. Users probably won't ever notice the commented-out 
text in that file, especially since it's buried in a .jar when deployed, but 
many would notice the docs generated for properties that aren't commented out. 
Thoughts?
{quote}
This is what I originally started to do, but switched to the commented out 
option because it felt like the descriptions were too cumbersome. But your 
point is well taken. I'll take another stab at combining them. I don't see any 
precedent in this file for using formatting tags inside the descriptions - do 
you know if that is allowed?

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch, 
> MAPREDUCE-7069.006.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-10 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16432873#comment-16432873
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe], this is ready for review. 

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch, 
> MAPREDUCE-7069.006.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-10 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7069:
---
Attachment: MAPREDUCE-7069.006.patch

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch, 
> MAPREDUCE-7069.006.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-10 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16432416#comment-16432416
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe] thanks for the review!  I've addressed your concerns and put up a new 
patch.

 

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch, 
> MAPREDUCE-7069.006.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-09 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430549#comment-16430549
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe], I think this is ready for review - the unit test failure is unrelated 
- not sure if there is a bug for that jobclient test failure?

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-06 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7069:
---
Attachment: MAPREDUCE-7069.005.patch

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch, MAPREDUCE-7069.005.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-06 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428382#comment-16428382
 ] 

Jim Brennan commented on MAPREDUCE-7069:


I put up a new patch that fixes the checkstyle/whitespace issues.  I believe 
the unit test failure is unrelated.  [~jlowe], I think this is ready for 
another review, unless you'd prefer to wait for the latest patch to finish 
automated testing.

For setVMEnv(), I was able to rearrange the code to avoid calling 
MRApps.setEnvFromInputProperty twice.

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-06 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7069:
---
Attachment: MAPREDUCE-7069.004.patch

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch, MAPREDUCE-7069.004.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-05 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7069:
---
Attachment: MAPREDUCE-7069.003.patch

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch, 
> MAPREDUCE-7069.003.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-04 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425908#comment-16425908
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe] thanks for the thorough review.  Much appreciated!
{quote}It's a bit odd and inefficient that setVMEnv calls 
MRApps.setEnvFromInputProperty twice. I think it would be clearer and more 
efficient to call it once, place the results in a temporary map (like it 
already does in the second call), then only set HADOOP_ROOT_LOGGER and 
HADOOP_CLIENT_OPTS in the environment if they are not set in the temporary map. 
Then at the end we can simply call addAll to dump the contents of the temporary 
map into the environment map.
{quote}
The reason it was done with two calls is the way environment variables are 
handled when they are already defined in the environment map. If an 
environment variable that you are updating already exists in the environment, 
the setEnvFromInput* functions append the new value to the existing value, 
using the appropriate separator. The special handling for 
HADOOP_ROOT_LOGGER and HADOOP_CLIENT_OPTS is to overwrite them instead of 
appending.  That said, I can definitely change it to do it the way you suggest, 
except I can't just use addAll() - you ultimately need to use 
Apps.addToEnvironment on each k/v pair. I could expose an 
Apps.setEnvFromInputStringMapNoExpand() (or add a noExpand boolean to the 
existing one) to handle this though.  

Thanks for the documentation/comment recommendations - I was going to ask about 
that - I'll clean those up.
{quote}Nit: setEnvFromInputStringMap does not need to be public.
{quote}
Will fix. In an earlier iteration I was calling this directly.
{quote}Would it be easier to call tmpEnv.addAll(inputMap) and pass tmpEnv 
instead of inputMap? Then we don't need to explicitly iterate the map.
{quote}
Yes.  I will make this change.
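
A compact sketch of the merge behavior described above: append with the 
platform separator for variables that are already present, but overwrite a 
small allowlist such as HADOOP_ROOT_LOGGER and HADOOP_CLIENT_OPTS. This is 
illustrative only, not the Apps/MRApps code itself.
{code:java}
import java.io.File;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative merge only; the real logic lives in the Apps/MRApps helpers.
public class EnvMergeSketch {
  static final Set<String> OVERWRITE_KEYS =
      new HashSet<>(Arrays.asList("HADOOP_ROOT_LOGGER", "HADOOP_CLIENT_OPTS"));

  static void merge(Map<String, String> env, Map<String, String> updates) {
    for (Map.Entry<String, String> e : updates.entrySet()) {
      String key = e.getKey();
      String existing = env.get(key);
      if (existing == null || OVERWRITE_KEYS.contains(key)) {
        env.put(key, e.getValue());                                  // set or overwrite
      } else {
        env.put(key, existing + File.pathSeparator + e.getValue());  // append
      }
    }
  }

  public static void main(String[] args) {
    Map<String, String> env = new HashMap<>();
    env.put("LD_LIBRARY_PATH", "/usr/lib");
    env.put("HADOOP_ROOT_LOGGER", "INFO,console");

    Map<String, String> updates = new HashMap<>();
    updates.put("LD_LIBRARY_PATH", "/opt/native");       // appended
    updates.put("HADOOP_ROOT_LOGGER", "DEBUG,console");  // overwritten

    merge(env, updates);
    System.out.println(env);
  }
}
{code}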

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-03 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424557#comment-16424557
 ] 

Jim Brennan commented on MAPREDUCE-7069:


[~jlowe], I believe this is ready for review.  The failing unit test is 
unrelated.

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-03 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424178#comment-16424178
 ] 

Jim Brennan commented on MAPREDUCE-7069:


Fixed the checkstyle issues and put up a new patch.  The 
{{hadoop-mapreduce-client-jobclient}} test failure seems unrelated.

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-03 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7069:
---
Attachment: MAPREDUCE-7069.002.patch

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-02 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422913#comment-16422913
 ] 

Jim Brennan commented on MAPREDUCE-7069:


I am currently using Configuration.getPropsWithPrefix() for this, but that 
function does not expand variables.  I filed a separate Jira for that issue: 
[HADOOP-15357]
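
For context on how the per-variable properties get collected, here is a 
minimal sketch of the prefix scan. It is illustrative only, and the values it 
returns are not variable-expanded at this point, which is exactly what 
HADOOP-15357 tracks.
{code:java}
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

// Illustrative: gather individually specified map-task env vars by prefix.
public class PrefixScanSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("mapreduce.map.env.FOO", "bar");
    conf.set("mapreduce.map.env.PATH_LIST", "/a,/b,/c");

    // Returns the matching properties with the prefix stripped,
    // e.g. {FOO=bar, PATH_LIST=/a,/b,/c}.
    Map<String, String> env = conf.getPropsWithPrefix("mapreduce.map.env.");
    for (Map.Entry<String, String> e : env.entrySet()) {
      System.out.println(e.getKey() + "=" + e.getValue());
    }
  }
}
{code}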


> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-02 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7069:
---
Status: Patch Available  (was: Open)

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-04-02 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated MAPREDUCE-7069:
---
Attachment: MAPREDUCE-7069.001.patch

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: MAPREDUCE-7069.001.patch
>
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> mapreduce.reduce.env, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Assigned] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-03-28 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned MAPREDUCE-7069:
--

Assignee: Jim Brennan

> Add ability to specify user environment variables individually
> --
>
> Key: MAPREDUCE-7069
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>
> As reported in YARN-6830, it is currently not possible to specify an 
> environment variable that contains commas via {{mapreduce.map.env}}, 
> {{mapreduce.reduce.env}}, or {{mapreduce.admin.user.env}}.
> To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
> specify environment variables individually:
> {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-7069) Add ability to specify user environment variables individually

2018-03-28 Thread Jim Brennan (JIRA)
Jim Brennan created MAPREDUCE-7069:
--

 Summary: Add ability to specify user environment variables 
individually
 Key: MAPREDUCE-7069
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Jim Brennan


As reported in YARN-6830, it is currently not possible to specify an 
environment variable that contains commas via {{mapreduce.map.env}}, 
{{mapreduce.reduce.env}}, or {{mapreduce.admin.user.env}}.

To address this, [~aw] proposed in [YARN-6830] that we add the ability to 
specify environment variables individually:
{quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar
{quote}
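
As a hedged illustration of the difference (the variable names and values below are made up, not taken from any patch):
{code:java}
import org.apache.hadoop.conf.Configuration;

// Illustration only, with made-up values.
public class MapEnvFormsSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);

    // Current comma-separated form: the value of FOO cannot itself contain a
    // comma, because the whole string is split on ','.
    conf.set("mapreduce.map.env", "FOO=bar,GREETING=hello");

    // Proposed per-variable form: one property per variable, so commas in the
    // value are just ordinary characters.
    conf.set("mapreduce.map.env.FOO", "bar1,bar2");
    conf.set("mapreduce.map.env.GREETING", "hello");
  }
}
{code}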



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Commented] (MAPREDUCE-6948) TestJobImpl.testUnusableNodeTransition failed

2018-03-21 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408630#comment-16408630
 ] 

Jim Brennan commented on MAPREDUCE-6948:


I propose we close this.

[~jlowe], [~haibochen], do you agree?

> TestJobImpl.testUnusableNodeTransition failed
> -
>
> Key: MAPREDUCE-6948
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6948
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Jim Brennan
>Priority: Major
>  Labels: unit-test
>
> *Error Message*
> expected: but was:
> *Stacktrace*
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:1041)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition(TestJobImpl.java:615)
> *Standard out*
> {code}
> 2017-08-30 10:12:21,928 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2017-08-30 10:12:21,939 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$StubbedJob
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.jobhistory.EventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,941 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1534)) - Adding job token for job_123456789_0001 to 
> jobTokenSecretManager
> 2017-08-30 10:12:21,941 WARN  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1540)) - Shuffle secret key missing from job credentials. 
> Using job token secret as shuffle secret.
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:makeUberDecision(1305)) - Not uberizing job_123456789_0001 
> because: not enabled;
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createMapTasks(1562)) - Input size for job 
> job_123456789_0001 = 0. Number of splits = 2
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createReduceTasks(1579)) - Number of reduces for job 
> job_123456789_0001 = 1
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from NEW 
> to INITED
> 2017-08-30 10:12:21,946 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> INITED to SETUP
> 2017-08-30 10:12:21,954 INFO  [CommitterEvent Processor #0] 
> commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - 
> Processing the event EventType: JOB_SETUP
> 2017-08-30 10:12:21,978 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> SETUP to RUNNING
> 2017-08-30 10:12:21,983 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$5
> 2017-08-30 10:12:22,000 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 1
> 2017-08-30 10:12:22,029 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 2
> 2017-08-30 10:12:22,032 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:actOnUnusableNode(1354)) - TaskAttempt killed because it ran on 
> unusable node Mock for NodeId, hashCode: 1280187896. 
> AttemptId:attempt_123456789_0001_m_00_0
> 2017-08-30 10:12:22,032 INFO  [Thre

[jira] [Commented] (MAPREDUCE-6948) TestJobImpl.testUnusableNodeTransition failed

2017-12-15 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293374#comment-16293374
 ] 

Jim Brennan commented on MAPREDUCE-6948:


I have been unable to reproduce this problem in trunk or branch-2.8. I was able to reproduce it in branch-2.7, but only by adding a sleep to exacerbate the race condition.

Analysis:
The key point in the failure case is here:
{noformat}
2017-08-30 10:12:22,000 INFO  [Thread-49] impl.JobImpl (JobImpl.java:transition(1953)) - Num completed Tasks: 1
2017-08-30 10:12:22,029 INFO  [Thread-49] impl.JobImpl (JobImpl.java:transition(1953)) - Num completed Tasks: 2
2017-08-30 10:12:22,032 INFO  [Thread-49] impl.JobImpl (JobImpl.java:actOnUnusableNode(1354)) - TaskAttempt killed because it ran on unusable node Mock for NodeId, hashCode: 1280187896. AttemptId:attempt_123456789_0001_m_00_0
2017-08-30 10:12:22,032 INFO  [Thread-49] impl.JobImpl (JobImpl.java:transition(1953)) - Num completed Tasks: 3
{noformat}
At this point, Num completed Tasks should be 2. Since it is 3, we start moving to the COMMITTED state too early and trip the failure.
In the successful case, the log looks like this:
{noformat}
2017-12-15 16:16:54,253 INFO  [Thread-0] impl.JobImpl (JobImpl.java:transition(1979)) - Num completed Tasks: 1
2017-12-15 16:16:54,258 INFO  [Thread-0] impl.JobImpl (JobImpl.java:transition(1979)) - Num completed Tasks: 2
2017-12-15 16:16:54,260 INFO  [Thread-0] impl.JobImpl (JobImpl.java:actOnUnusableNode(1359)) - TaskAttempt killed because it ran on unusable node Mock for NodeId, hashCode: 131679889. AttemptId:attempt_123456789_0001_m_00_0
2017-12-15 16:16:54,261 INFO  [Thread-0] impl.JobImpl (JobImpl.java:transition(1979)) - Num completed Tasks: 2
2017-12-15 16:16:54,262 INFO  [Thread-0] impl.JobImpl (JobImpl.java:checkReadyForCompletionWhenAllReducersDone(2103)) - Killing map task task_123456789_0001_m_00
2017-12-15 16:16:54,263 INFO  [Thread-0] impl.JobImpl (JobImpl.java:checkReadyForCompletionWhenAllReducersDone(2103)) - Killing map task task_123456789_0001_m_01
2017-12-15 16:16:54,263 INFO  [Thread-0] impl.JobImpl (JobImpl.java:transition(1979)) - Num completed Tasks: 3
{noformat}

The second "Num completed Tasks: 2" line corresponds to when we mark the reducer task as SUCCEEDED. At this point, the count of succeeded map tasks should be 1, because it was just decremented due to the unusable node. It is incremented to 2 before printing.

The difference between branch-2.7, which fails, and trunk/branch-2.8 is the fix in MAPREDUCE-6675, which switched the test to use a DrainDispatcher and added a dispatcher.await() call before the reducer is completed.

Another possible factor is YARN-5436, which fixed a very similar race in DrainDispatcher. That fix is present in trunk but not in branch-2.8, so the race it addresses may account for intermittent failures in branch-2.8, although I was not able to reproduce one there.

So as far as I can tell, this appears to be fixed already.

[~haibo.chen], can you provide any insight?  Any chance this failure was seen 
on branch-2.8 or branch-2.7?



> TestJobImpl.testUnusableNodeTransition failed
> -
>
> Key: MAPREDUCE-6948
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6948
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Jim Brennan
>  Labels: unit-test
>
> *Error Message*
> expected: but was:
> *Stacktrace*
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:1041)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition(TestJobImpl.java:615)
> *Standard out*
> {code}
> 2017-08-30 10:12:21,928 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2017-08-30 10:12:21,939 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$StubbedJob
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2

[jira] [Assigned] (MAPREDUCE-6948) TestJobImpl.testUnusableNodeTransition failed

2017-12-11 Thread Jim Brennan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reassigned MAPREDUCE-6948:
--

Assignee: Jim Brennan

> TestJobImpl.testUnusableNodeTransition failed
> -
>
> Key: MAPREDUCE-6948
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6948
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Jim Brennan
>  Labels: unit-test
>
> *Error Message*
> expected: but was:
> *Stacktrace*
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:1041)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition(TestJobImpl.java:615)
> *Standard out*
> {code}
> 2017-08-30 10:12:21,928 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2017-08-30 10:12:21,939 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$StubbedJob
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.jobhistory.EventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,941 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1534)) - Adding job token for job_123456789_0001 to 
> jobTokenSecretManager
> 2017-08-30 10:12:21,941 WARN  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1540)) - Shuffle secret key missing from job credentials. 
> Using job token secret as shuffle secret.
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:makeUberDecision(1305)) - Not uberizing job_123456789_0001 
> because: not enabled;
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createMapTasks(1562)) - Input size for job 
> job_123456789_0001 = 0. Number of splits = 2
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createReduceTasks(1579)) - Number of reduces for job 
> job_123456789_0001 = 1
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from NEW 
> to INITED
> 2017-08-30 10:12:21,946 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> INITED to SETUP
> 2017-08-30 10:12:21,954 INFO  [CommitterEvent Processor #0] 
> commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - 
> Processing the event EventType: JOB_SETUP
> 2017-08-30 10:12:21,978 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> SETUP to RUNNING
> 2017-08-30 10:12:21,983 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$5
> 2017-08-30 10:12:22,000 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 1
> 2017-08-30 10:12:22,029 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 2
> 2017-08-30 10:12:22,032 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:actOnUnusableNode(1354)) - TaskAttempt killed because it ran on 
> unusable node Mock for NodeId, hashCode: 1280187896. 
> AttemptId:attempt_123456789_0001_m_00_0
> 2017-08-30 10:12:22,032 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 3
> 2017-08-30 10:12:22,032 INFO  [Thread