[jira] [Commented] (MAPREDUCE-5364) Deadlock between RenewalTimerTask methods cancel() and run()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697540#comment-13697540 ] Hadoop QA commented on MAPREDUCE-5364: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590296/mr-5364-1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3820//console This message is automatically generated. Deadlock between RenewalTimerTask methods cancel() and run() Key: MAPREDUCE-5364 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5364 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: mr-5364-1.patch MAPREDUCE-4860 introduced a local variable {{cancelled}} in {{RenewalTimerTask}} to fix the race where {{DelegationTokenRenewal}} attempts to renew a token even after the job is removed. However, the patch also makes {{run()}} and {{cancel()}} synchronized methods leading to a potential deadlock against {{run()}}'s catch-block (error-path). 
The deadlock stacks are below:
{noformat}
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.cancel() @bci=0, line=240 (Interpreted frame)
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.removeDelegationTokenRenewalForJob(org.apache.hadoop.mapreduce.JobID) @bci=109, line=319 (Interpreted frame)
{noformat}
{noformat}
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.removeFailedDelegationToken(org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$DelegationTokenToRenew) @bci=62, line=297 (Interpreted frame)
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.access$300(org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$DelegationTokenToRenew) @bci=1, line=47 (Interpreted frame)
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.run() @bci=148, line=234 (Interpreted frame)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
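For readers, the lock-free alternative the description hints at can be sketched generically. This is a hypothetical illustration (the class and method names are invented, not the attached patch): cancellation is recorded in an atomic flag, so cancel() never blocks on the same monitor that run()'s error path may be holding.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a renewal timer task that avoids making both
// run() and cancel() synchronized. cancelTask() only flips an atomic
// flag, so it cannot take part in a lock cycle with run()'s error path.
public class RenewalTaskSketch {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    // Never blocks; returns true only for the first caller.
    public boolean cancelTask() {
        return cancelled.compareAndSet(false, true);
    }

    public boolean isCancelled() {
        return cancelled.get();
    }

    // Stand-in for TimerTask.run(): skips work once cancelled.
    public void run() {
        if (cancelled.get()) {
            return; // the job was removed; do not renew
        }
        // ... renew the token here; on failure the error path may take
        // other locks without deadlocking against cancelTask().
    }

    public static void main(String[] args) {
        RenewalTaskSketch task = new RenewalTaskSketch();
        task.run();                              // normal renewal pass
        System.out.println(task.cancelTask());   // true: first cancel wins
        task.run();                              // no-op after cancel
        System.out.println(task.isCancelled());  // true
    }
}
```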
[jira] [Commented] (MAPREDUCE-5365) Set mapreduce.job.classloader to true by default
[ https://issues.apache.org/jira/browse/MAPREDUCE-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697559#comment-13697559 ] Devaraj K commented on MAPREDUCE-5365: -- +1, Changes look good to me. Set mapreduce.job.classloader to true by default Key: MAPREDUCE-5365 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5365 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: MAPREDUCE-5365.patch MAPREDUCE-1700 introduced the mapreduce.job.classloader option, which uses a custom classloader to separate system classes from user classes. It seems like there are only rare cases when a user would not want this on, and that it should be enabled by default. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
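Until the default flips, the option has to be set explicitly. A minimal configuration sketch (setting it per job via {{-D}} on the command line works too); the property name is the one in the issue title:

```xml
<!-- mapred-site.xml: enable the isolating job classloader so user
     classes are resolved separately from Hadoop's system classes -->
<property>
  <name>mapreduce.job.classloader</name>
  <value>true</value>
</property>
```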
[jira] [Created] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress
zhaoyunjiong created MAPREDUCE-5368: --- Summary: Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress Key: MAPREDUCE-5368 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.2.0 Reporter: zhaoyunjiong Fix For: 1.2.1 Below is a histo from our JobTracker:
{noformat}
 num   #instances       #bytes  class name
------------------------------------------
   1:   136048824  11347237456  [C
   2:   124156992   5959535616  java.util.concurrent.locks.ReentrantLock$NonfairSync
   3:   124156973   5959534704  java.util.concurrent.ConcurrentHashMap$Segment
   4:   135887753   5435510120  java.lang.String
   5:   124213692   3975044400  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
   6:    63777311   3061310928  java.util.HashMap$Entry
   7:    35038252   2803060160  java.util.TreeMap
   8:    16921110   2712480072  [Ljava.util.HashMap$Entry;
   9:     4803617   2420449192  [Ljava.lang.Object;
  10:    50392816   2015712640  org.apache.hadoop.mapred.Counters$Counter
  11:     7775438   1181866576  [Ljava.util.concurrent.ConcurrentHashMap$Segment;
  12:     3882847   1118259936  org.apache.hadoop.mapred.TaskInProgress
{noformat}
ConcurrentHashMap takes more than 14G (5959535616 + 5959534704 + 3975044400). The trouble makers are these lines in TaskInProgress.java:
{code}
Map<TaskAttemptID, Locality> taskLocality = new ConcurrentHashMap<TaskAttemptID, Locality>();
Map<TaskAttemptID, Avataar> taskAvataar = new ConcurrentHashMap<TaskAttemptID, Avataar>();
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoyunjiong updated MAPREDUCE-5368: Attachment: MAPREDUCE-5368.patch This simple patch can save more than 10GB when there are 4m TaskInProgress instances. Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress --- Key: MAPREDUCE-5368 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.2.0 Reporter: zhaoyunjiong Fix For: 1.2.1 Attachments: MAPREDUCE-5368.patch Below is a histo from our JobTracker:
{noformat}
 num   #instances       #bytes  class name
------------------------------------------
   1:   136048824  11347237456  [C
   2:   124156992   5959535616  java.util.concurrent.locks.ReentrantLock$NonfairSync
   3:   124156973   5959534704  java.util.concurrent.ConcurrentHashMap$Segment
   4:   135887753   5435510120  java.lang.String
   5:   124213692   3975044400  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
   6:    63777311   3061310928  java.util.HashMap$Entry
   7:    35038252   2803060160  java.util.TreeMap
   8:    16921110   2712480072  [Ljava.util.HashMap$Entry;
   9:     4803617   2420449192  [Ljava.lang.Object;
  10:    50392816   2015712640  org.apache.hadoop.mapred.Counters$Counter
  11:     7775438   1181866576  [Ljava.util.concurrent.ConcurrentHashMap$Segment;
  12:     3882847   1118259936  org.apache.hadoop.mapred.TaskInProgress
{noformat}
ConcurrentHashMap takes more than 14G (5959535616 + 5959534704 + 3975044400). The trouble makers are these lines in TaskInProgress.java:
{code}
Map<TaskAttemptID, Locality> taskLocality = new ConcurrentHashMap<TaskAttemptID, Locality>();
Map<TaskAttemptID, Avataar> taskAvataar = new ConcurrentHashMap<TaskAttemptID, Avataar>();
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
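For readers, the shape of the fix: a pre-Java-8 ConcurrentHashMap allocates one Segment (with its own lock and hash table) per unit of concurrency level, 16 by default, which is what the histogram above is paying for millions of times over. A minimal sketch with illustrative values; the attached patch's exact numbers may differ:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TipMapSizing {
    // Default-sized map: on pre-Java-8 JVMs this allocates 16 segments,
    // each with its own ReentrantLock and HashEntry[] table -- heavy
    // when millions of TaskInProgress objects each hold two of these.
    static Map<String, String> defaults() {
        return new ConcurrentHashMap<String, String>();
    }

    // Sized-down variant: a TaskInProgress tracks only a handful of
    // task attempts and sees little contention, so a tiny capacity and
    // a concurrency level of 1 suffice. (Illustrative values.)
    static Map<String, String> sized() {
        return new ConcurrentHashMap<String, String>(
                4,      // initial capacity: a few task attempts at most
                0.75f,  // load factor: the default is fine
                1);     // concurrency level: one segment instead of 16
    }

    public static void main(String[] args) {
        Map<String, String> m = sized();
        m.put("attempt_0", "NODE_LOCAL"); // behaves identically to the default map
        System.out.println(m.get("attempt_0"));
    }
}
```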
[jira] [Assigned] (MAPREDUCE-2036) Enable Erasure Code in Tool similar to Hadoop Archive
[ https://issues.apache.org/jira/browse/MAPREDUCE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 李志然 reassigned MAPREDUCE-2036: -- Assignee: 李志然 (was: Wittawat Tantisiriroj) Enable Erasure Code in Tool similar to Hadoop Archive - Key: MAPREDUCE-2036 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2036 Project: Hadoop Map/Reduce Issue Type: Improvement Components: contrib/raid, harchive Reporter: Wittawat Tantisiriroj Assignee: 李志然 Priority: Minor Attachments: hdfs-raid.tar.gz, MAPREDUCE-2036.patch, RaidTool.pdf Features:
1) HAR-like tool
2) RAID5/RAID6 pluggable interface to implement additional coding
3) Ability to group blocks across files
4) Portable across clusters, since all necessary metadata is embedded
While it was developed separately from HAR or RAID due to time constraints, it would make sense to integrate it with either of them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-3193: - Status: Open (was: Patch Available) FileInputFormat doesn't read files recursively in the input path dir Key: MAPREDUCE-3193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 2.0.0-alpha, 0.23.2, 3.0.0 Reporter: Ramgopal N Assignee: Devaraj K Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails. Example: input file is /r1/r2/input.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-3193: - Attachment: MAPREDUCE-3193-5.patch FileInputFormat doesn't read files recursively in the input path dir Key: MAPREDUCE-3193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0 Reporter: Ramgopal N Assignee: Devaraj K Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails. Example: input file is /r1/r2/input.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-3193: - Status: Patch Available (was: Open) Thanks Amar and Jason. I have updated the patch with the suggested change. Please review it. FileInputFormat doesn't read files recursively in the input path dir Key: MAPREDUCE-3193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 2.0.0-alpha, 0.23.2, 3.0.0 Reporter: Ramgopal N Assignee: Devaraj K Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails. Example: input file is /r1/r2/input.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
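For context, the behaviour being fixed is plain directory recursion during input listing. A generic sketch using java.io.File (the real code walks org.apache.hadoop.fs.FileSystem, which is not shown here):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class RecursiveLister {
    // Collect all regular files under root, descending into
    // subdirectories -- the behaviour MAPREDUCE-3193 adds to input
    // listing (shown here with java.io.File for brevity).
    static List<File> listFiles(File root) {
        List<File> out = new ArrayList<File>();
        File[] children = root.listFiles();
        if (children == null) {
            return out; // not a directory, or unreadable
        }
        for (File c : children) {
            if (c.isDirectory()) {
                out.addAll(listFiles(c)); // recurse instead of failing
            } else {
                out.add(c);
            }
        }
        return out;
    }

    // Build a tree like the report's /r1/r2/input.txt in a temp dir
    // and count the files found; returns -1 on I/O error.
    static int demo() {
        try {
            java.nio.file.Path root = java.nio.file.Files.createTempDirectory("mr3193");
            java.nio.file.Path nested = java.nio.file.Files.createDirectories(root.resolve("r1/r2"));
            java.nio.file.Files.createFile(nested.resolve("input.txt"));
            return listFiles(root.toFile()).size();
        } catch (java.io.IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // input.txt is found even though it is two directory levels deep
        System.out.println(demo()); // 1
    }
}
```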
[jira] [Created] (MAPREDUCE-5369) Progress for jobs with multiple splits in local mode is wrong
Johannes Zillmann created MAPREDUCE-5369: Summary: Progress for jobs with multiple splits in local mode is wrong Key: MAPREDUCE-5369 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5369 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Reporter: Johannes Zillmann In case a job with multiple splits is executed in local mode (LocalJobRunner), its progress calculation is wrong. After the first split is processed it jumps to 100%, then back to 50%, and so on. The reason lies in the progress calculation in LocalJobRunner:
{code}
float taskIndex = mapIds.indexOf(taskId);
if (taskIndex >= 0) { // mapping
  float numTasks = mapIds.size();
  status.setMapProgress(taskIndex/numTasks + taskStatus.getProgress()/numTasks);
} else {
  status.setReduceProgress(taskStatus.getProgress());
}
{code}
The problem is that {{mapIds}} is filled lazily in run(). There is a loop over all splits. In the loop, the split's task id is added to {{mapIds}}, then the split is processed. That means {{numTasks}} is 1 while the first split is processed, 2 while the second split is processed, and so on... I tried Hadoop 0.20.2, 1.0.3, 1.1.2 and cdh-4.1. All show the same behaviour! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
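The arithmetic in the report is easy to reproduce outside Hadoop. A minimal sketch of the buggy formula next to a fixed variant that divides by the total split count (the fixed method is illustrative, not LocalJobRunner's actual code):

```java
import java.util.ArrayList;
import java.util.List;

public class ProgressSketch {
    // Buggy: numTasks is the number of map ids added SO FAR, so while
    // the first of N splits runs, the denominator is 1 rather than N.
    static float buggyProgress(List<String> mapIds, String taskId, float taskProgress) {
        float taskIndex = mapIds.indexOf(taskId);
        float numTasks = mapIds.size();
        return taskIndex / numTasks + taskProgress / numTasks;
    }

    // Fixed (illustrative): divide by the total split count, which is
    // known before the loop over splits begins.
    static float fixedProgress(int totalSplits, int taskIndex, float taskProgress) {
        return (taskIndex + taskProgress) / totalSplits;
    }

    // Reproduce the report: two splits, but only the first id has been
    // added to mapIds when split 0 finishes.
    static float demoBuggy() {
        List<String> mapIds = new ArrayList<String>();
        mapIds.add("task_0");
        return buggyProgress(mapIds, "task_0", 1.0f);
    }

    public static void main(String[] args) {
        System.out.println(demoBuggy());            // 1.0 -- jumps to 100%
        System.out.println(fixedProgress(2, 0, 1.0f)); // 0.5 -- the correct 50%
    }
}
```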
[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697621#comment-13697621 ] Hadoop QA commented on MAPREDUCE-3193: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590385/MAPREDUCE-3193-5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3821//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3821//console This message is automatically generated. 
FileInputFormat doesn't read files recursively in the input path dir Key: MAPREDUCE-3193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0 Reporter: Ramgopal N Assignee: Devaraj K Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails. Example: input file is /r1/r2/input.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5369) Progress for jobs with multiple splits in local mode is wrong
[ https://issues.apache.org/jira/browse/MAPREDUCE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697626#comment-13697626 ] Johannes Zillmann commented on MAPREDUCE-5369: -- Update: tested this with cdh4.3 and the problem does not exist anymore in that version. So it might have been fixed somewhere along the 2.x branch. Progress for jobs with multiple splits in local mode is wrong - Key: MAPREDUCE-5369 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5369 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Reporter: Johannes Zillmann In case a job with multiple splits is executed in local mode (LocalJobRunner), its progress calculation is wrong. After the first split is processed it jumps to 100%, then back to 50%, and so on. The reason lies in the progress calculation in LocalJobRunner:
{code}
float taskIndex = mapIds.indexOf(taskId);
if (taskIndex >= 0) { // mapping
  float numTasks = mapIds.size();
  status.setMapProgress(taskIndex/numTasks + taskStatus.getProgress()/numTasks);
} else {
  status.setReduceProgress(taskStatus.getProgress());
}
{code}
The problem is that {{mapIds}} is filled lazily in run(). There is a loop over all splits. In the loop, the split's task id is added to {{mapIds}}, then the split is processed. That means {{numTasks}} is 1 while the first split is processed, 2 while the second split is processed, and so on... I tried Hadoop 0.20.2, 1.0.3, 1.1.2 and cdh-4.1. All show the same behaviour! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5364) Deadlock between RenewalTimerTask methods cancel() and run()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697799#comment-13697799 ] Karthik Kambatla commented on MAPREDUCE-5364: - The patch doesn't apply because it is for branch-1. Deadlock between RenewalTimerTask methods cancel() and run() Key: MAPREDUCE-5364 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5364 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: mr-5364-1.patch MAPREDUCE-4860 introduced a local variable {{cancelled}} in {{RenewalTimerTask}} to fix the race where {{DelegationTokenRenewal}} attempts to renew a token even after the job is removed. However, the patch also makes {{run()}} and {{cancel()}} synchronized methods, leading to a potential deadlock against {{run()}}'s catch-block (error-path). The deadlock stacks are below:
{noformat}
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.cancel() @bci=0, line=240 (Interpreted frame)
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.removeDelegationTokenRenewalForJob(org.apache.hadoop.mapreduce.JobID) @bci=109, line=319 (Interpreted frame)
{noformat}
{noformat}
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.removeFailedDelegationToken(org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$DelegationTokenToRenew) @bci=62, line=297 (Interpreted frame)
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.access$300(org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$DelegationTokenToRenew) @bci=1, line=47 (Interpreted frame)
- org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.run() @bci=148, line=234 (Interpreted frame)
{noformat}
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697850#comment-13697850 ] Karthik Kambatla commented on MAPREDUCE-5368: - The patch is a definite improvement over the current situation. As in branch-2, can we get rid of the ConcurrentHashMaps altogether, and move Locality and Avataar to TaskAttemptID itself? Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress --- Key: MAPREDUCE-5368 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.2.0 Reporter: zhaoyunjiong Fix For: 1.2.1 Attachments: MAPREDUCE-5368.patch Below is a histo from our JobTracker:
{noformat}
 num   #instances       #bytes  class name
------------------------------------------
   1:   136048824  11347237456  [C
   2:   124156992   5959535616  java.util.concurrent.locks.ReentrantLock$NonfairSync
   3:   124156973   5959534704  java.util.concurrent.ConcurrentHashMap$Segment
   4:   135887753   5435510120  java.lang.String
   5:   124213692   3975044400  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
   6:    63777311   3061310928  java.util.HashMap$Entry
   7:    35038252   2803060160  java.util.TreeMap
   8:    16921110   2712480072  [Ljava.util.HashMap$Entry;
   9:     4803617   2420449192  [Ljava.lang.Object;
  10:    50392816   2015712640  org.apache.hadoop.mapred.Counters$Counter
  11:     7775438   1181866576  [Ljava.util.concurrent.ConcurrentHashMap$Segment;
  12:     3882847   1118259936  org.apache.hadoop.mapred.TaskInProgress
{noformat}
ConcurrentHashMap takes more than 14G (5959535616 + 5959534704 + 3975044400). The trouble makers are these lines in TaskInProgress.java:
{code}
Map<TaskAttemptID, Locality> taskLocality = new ConcurrentHashMap<TaskAttemptID, Locality>();
Map<TaskAttemptID, Avataar> taskAvataar = new ConcurrentHashMap<TaskAttemptID, Avataar>();
{code}
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697951#comment-13697951 ] Jason Lowe commented on MAPREDUCE-3193: --- +1, will commit later today FileInputFormat doesn't read files recursively in the input path dir Key: MAPREDUCE-3193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0 Reporter: Ramgopal N Assignee: Devaraj K Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails. Example: input file is /r1/r2/input.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta reopened MAPREDUCE-5351: Reopening, as with this fix we are seeing jobs fail with the following exception:
{code}
13/07/02 16:06:57 DEBUG mapred.JobClient: Printing tokens for job: job_201307020820_0012
13/07/02 16:06:57 DEBUG ipc.Client: IPC Client (47) connection to host/ip:50300 from hortonar sending #32
13/07/02 16:06:57 DEBUG ipc.Client: IPC Client (47) connection to host/ip:50300 from hortonar got value #32
13/07/02 16:06:57 DEBUG retry.RetryUtils: RETRY 0) policy=TryOnceThenFail, exception=org.apache.hadoop.ipc.RemoteException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1438)
13/07/02 16:06:57 INFO mapred.JobClient: Cleaning up the staging area hdfs://host:8020/user/hortonar/.staging/job_201307020820_0012
13/07/02 16:06:57 ERROR security.UserGroupInformation: PriviledgedActionException as:hortonar cause:org.apache.hadoop.ipc.RemoteException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1438)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
    at
[jira] [Commented] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress
[ https://issues.apache.org/jira/browse/MAPREDUCE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698042#comment-13698042 ] Mayank Bansal commented on MAPREDUCE-5368: -- The default value of the load factor is .75 anyway. The default concurrency level is 16, which I think is reasonable for the jobs. The default initial capacity is also 16, which is also reasonable. I am not sure how we are saving memory here. Can you please explain a bit? Moreover, I really don't think we should set the concurrency level so low, as it will increase contention across the threads a lot. Thoughts? Thanks, Mayank Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress --- Key: MAPREDUCE-5368 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.2.0 Reporter: zhaoyunjiong Fix For: 1.2.1 Attachments: MAPREDUCE-5368.patch Below is a histo from our JobTracker:
{noformat}
 num   #instances       #bytes  class name
------------------------------------------
   1:   136048824  11347237456  [C
   2:   124156992   5959535616  java.util.concurrent.locks.ReentrantLock$NonfairSync
   3:   124156973   5959534704  java.util.concurrent.ConcurrentHashMap$Segment
   4:   135887753   5435510120  java.lang.String
   5:   124213692   3975044400  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
   6:    63777311   3061310928  java.util.HashMap$Entry
   7:    35038252   2803060160  java.util.TreeMap
   8:    16921110   2712480072  [Ljava.util.HashMap$Entry;
   9:     4803617   2420449192  [Ljava.lang.Object;
  10:    50392816   2015712640  org.apache.hadoop.mapred.Counters$Counter
  11:     7775438   1181866576  [Ljava.util.concurrent.ConcurrentHashMap$Segment;
  12:     3882847   1118259936  org.apache.hadoop.mapred.TaskInProgress
{noformat}
ConcurrentHashMap takes more than 14G (5959535616 + 5959534704 + 3975044400). 
The trouble makers are these lines in TaskInProgress.java:
{code}
Map<TaskAttemptID, Locality> taskLocality = new ConcurrentHashMap<TaskAttemptID, Locality>();
Map<TaskAttemptID, Avataar> taskAvataar = new ConcurrentHashMap<TaskAttemptID, Avataar>();
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
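The histogram can be cross-checked by arithmetic: with two default maps per TaskInProgress and 16 segments per map, the expected segment count nearly matches the observed 124,156,973, so the savings come from per-segment overhead rather than from the load factor. A back-of-envelope sketch (the factor of 2 assumes only the two maps quoted above):

```java
public class SegmentMath {
    // Each default pre-Java-8 ConcurrentHashMap allocates one Segment
    // (plus a ReentrantLock$NonfairSync and a HashEntry[] table) per
    // unit of concurrency level.
    static long expectedSegments(long tipCount, int mapsPerTip, int segmentsPerMap) {
        return tipCount * mapsPerTip * segmentsPerMap;
    }

    public static void main(String[] args) {
        long tips = 3882847L; // TaskInProgress instances in the histogram
        // 3882847 * 2 * 16 = 124251104, close to the 124156973 Segment
        // instances observed; dropping concurrency level from 16 to 1
        // would remove roughly 15/16 of those segments and their locks.
        System.out.println(expectedSegments(tips, 2, 16));
    }
}
```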
[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698043#comment-13698043 ] Sandy Ryza commented on MAPREDUCE-5351: --- [~arpitgupta], how often is this occurring for you? JobTracker memory leak caused by CleanupQueue reopening FileSystem -- Key: MAPREDUCE-5351 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1.1.2 Reporter: Sandy Ryza Assignee: Sandy Ryza Priority: Critical Fix For: 1.2.1 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, MAPREDUCE-5351.patch When a job is completed, closeAllForUGI is called to close all the cached FileSystems in the FileSystem cache. However, the CleanupQueue may run after this occurs and call FileSystem.get() to delete the staging directory, adding a FileSystem to the cache that will never be closed. People on the user-list have reported this causing their JobTrackers to OOME every two weeks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698055#comment-13698055 ] Arpit Gupta commented on MAPREDUCE-5351: It's happening very frequently; quite a few MR jobs have failed because of this exception. Various example jobs, and jobs through Pig, Oozie, etc. also failed because of this. JobTracker memory leak caused by CleanupQueue reopening FileSystem -- Key: MAPREDUCE-5351 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1.1.2 Reporter: Sandy Ryza Assignee: Sandy Ryza Priority: Critical Fix For: 1.2.1 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, MAPREDUCE-5351.patch When a job is completed, closeAllForUGI is called to close all the cached FileSystems in the FileSystem cache. However, the CleanupQueue may run after this occurs and call FileSystem.get() to delete the staging directory, adding a FileSystem to the cache that will never be closed. People on the user-list have reported this causing their JobTrackers to OOME every two weeks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698069#comment-13698069 ] Sandy Ryza commented on MAPREDUCE-5351: --- Thanks. I'll look into this.
[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698168#comment-13698168 ] Sandy Ryza commented on MAPREDUCE-5351: --- Attaching an addendum patch that should fix the issue. I was able to reproduce the issue by running pi jobs repeatedly. With the patch, the exception no longer occurs.
[jira] [Updated] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5351: -- Attachment: MAPREDUCE-5351-addendum.patch
[jira] [Updated] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5351: -- Status: Patch Available (was: Reopened)
[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698177#comment-13698177 ] Hadoop QA commented on MAPREDUCE-5351: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590494/MAPREDUCE-5351-addendum.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3822//console This message is automatically generated.
[jira] [Resolved] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth resolved MAPREDUCE-5330. -- Resolution: Fixed Fix Version/s: 1-win I committed this to branch-1-win. Xi, thank you for contributing this patch. JVM manager should not forcefully kill the process on Signal.TERM on Windows Key: MAPREDUCE-5330 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1-win Environment: Windows Reporter: Xi Fang Assignee: Xi Fang Fix For: 1-win Attachments: MAPREDUCE-5330.patch In MapReduce, we sometimes kill a task's JVM before it naturally shuts down if we want to launch other tasks (look in JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map task process is in the middle of doing some cleanup/finalization after the task is done, it might be interrupted/killed without being given a chance to finish. In Microsoft's Hadoop Service, after a Map/Reduce task is done and while file systems are being closed in a special shutdown hook, we typically upload storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if this kill happens, these metrics get lost. The impact is that for many MR jobs we don't see accurate metrics reported most of the time.
[jira] [Commented] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698205#comment-13698205 ] Xi Fang commented on MAPREDUCE-5330: Thanks Ivan and Chris!
[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698218#comment-13698218 ] Arun C Murthy commented on MAPREDUCE-5351: -- [~sandyr] - JIP.cleanupJob always results in PathDeletionContext with ugi being null... how does this fix the original problem? I'm missing something? Tx.
[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698228#comment-13698228 ] Sandy Ryza commented on MAPREDUCE-5351: --- In JIP.cleanupJob: {code}
Path tempDir = jobtracker.getSystemDirectoryForJob(getJobID());
CleanupQueue.getInstance().addToQueue(
    new PathDeletionContext(tempDir, conf));

// delete the staging area for the job and cancel delegation token
String jobTempDir = conf.get("mapreduce.job.dir");
if (jobTempDir != null && conf.getKeepTaskFilesPattern() == null &&
    !conf.getKeepFailedTaskFiles()) {
  Path jobTempDirPath = new Path(jobTempDir);
  tempDirFs = jobTempDirPath.getFileSystem(conf);
  CleanupQueue.getInstance().addToQueue(
      new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
}
{code} The CleanupQueue is used twice, once with the UGI set and once without.
[jira] [Created] (MAPREDUCE-5370) Rolling Restart Tasktrackers from JobTracker
Cindy Li created MAPREDUCE-5370: --- Summary: Rolling Restart Tasktrackers from JobTracker Key: MAPREDUCE-5370 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5370 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Cindy Li Priority: Minor For near-real-time jobs running on Hadoop, we want to minimize the impact on them when rolling-restarting tasktrackers. The idea is to restart tasktrackers from the jobtracker, selectively choosing tasktrackers according to the status of the tasks running on them, so that the impact on total job running time is minimal.
[jira] [Created] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
Xi Fang created MAPREDUCE-5371: -- Summary: TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users Key: MAPREDUCE-5371 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1-win Environment: Windows Reporter: Xi Fang Assignee: Xi Fang Priority: Minor Fix For: 1-win The error message was: Error Message expected:[sijenkins-vm2]jenkins but was:[]jenkins Stacktrace at org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45) The root cause of this failure is the domain used on Windows.
[jira] [Updated] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
[ https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated MAPREDUCE-5371: --- Attachment: MAPREDUCE-5371.patch
[jira] [Work started] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
[ https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAPREDUCE-5371 started by Xi Fang.
[jira] [Commented] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users
[ https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698240#comment-13698240 ] Xi Fang commented on MAPREDUCE-5371: The attached patch removed the domains from user names.
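The shape of such a normalization can be sketched in plain Java. This is a hypothetical helper for illustration only, not the code in the attached patch: on Windows, account names can carry a DOMAIN\ prefix (as in the sijenkins-vm2 example above), and stripping everything up to the last backslash leaves the bare user name that the test compares.

```java
public class UserNames {
    // Strip an optional "DOMAIN\" prefix from a Windows account name.
    // Hypothetical helper for illustration; not the MAPREDUCE-5371 patch itself.
    static String stripDomain(String name) {
        int i = name.lastIndexOf('\\');
        return i >= 0 ? name.substring(i + 1) : name;
    }

    public static void main(String[] args) {
        System.out.println(stripDomain("sijenkins-vm2\\jenkins")); // jenkins
        System.out.println(stripDomain("jenkins"));                // jenkins
    }
}
```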
[jira] [Commented] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698284#comment-13698284 ] Chris Nauroth commented on MAPREDUCE-5355: -- Thank you, Chuan. Nice find! I verified these tests on Mac and Windows. The patch has a couple of long lines. Can you please change the patch so that lines wrap at 80 characters? Otherwise, it looks good. MiniMRYarnCluster with localFs does not work on Windows --- Key: MAPREDUCE-5355 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5355-branch-2.patch, MAPREDUCE-5355-trunk.patch When MiniMRYarnCluster is configured to run on localFs instead of remoteFs, i.e. MiniDFSCluster, the job will fail on Windows. The error message looks like the following. {noformat} java.io.IOException: Job status not available {noformat} In my testing, the following unit tests hit this exception. * TestMRJobsWithHistoryService * TestClusterMRNotification * TestJobCleanup * TestJobCounters * TestMiniMRClientCluster * TestJobOutputCommitter * TestMRAppWithCombiner * TestMROldApiJobs * TestSpeculativeExecution
[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698305#comment-13698305 ] Hudson commented on MAPREDUCE-3193: --- Integrated in Hadoop-trunk-Commit #4033 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4033/]) MAPREDUCE-3193. FileInputFormat doesn't read files recursively in the input path dir. Contributed by Devaraj K (Revision 1499125) Result = SUCCESS jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1499125 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ConfigUtil.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestFileInputFormat.java FileInputFormat doesn't read files recursively in the input path dir Key: MAPREDUCE-3193 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0 Reporter: Ramgopal N Assignee: Devaraj K Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, 
MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails. Example: input file is /r1/r2/input.txt
[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir
[ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-3193: -- Resolution: Fixed Fix Version/s: 0.23.10 2.3.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Devaraj, and to all others who contributed to the review! I committed this to trunk, branch-2, and branch-0.23.
[jira] [Updated] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chuan Liu updated MAPREDUCE-5355: - Attachment: MAPREDUCE-5355-trunk.2.patch Thanks for reviewing, Chris! Attaching a new patch that fixes the long lines.
[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698321#comment-13698321 ] Chris Nauroth commented on MAPREDUCE-5357: -- Hi, Chuan. Can you list a sample of the tests that were fixed by this patch in your environment? I'd like to take it for a test run. Job staging directory owner checking could fail on Windows -- Key: MAPREDUCE-5357 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5357-trunk.patch In {{JobSubmissionFiles.getStagingDir()}}, we have the following code that will throw an exception if the directory owner is not the current user. {code:java}
String owner = fsStatus.getOwner();
if (!(owner.equals(currentUser) || owner.equals(realUser))) {
  throw new IOException("The ownership on the staging directory " +
      stagingArea + " is not as expected. " +
      "It is owned by " + owner + ". The directory must " +
      "be owned by the submitter " + currentUser + " or " +
      "by " + realUser);
}
{code} This check will fail on Windows when the underlying file system is LocalFileSystem, because on Windows the default file or directory owner could be the Administrators group if the user belongs to the Administrators group. Quite a few MR unit tests that run an MR mini cluster with localFs as the underlying file system fail because of this.
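The boolean that trips on Windows can be exercised in isolation with plain strings. A minimal sketch, assuming an admin user whose staging directory the local filesystem reports as owned by the Administrators group; the user names are illustrative, and this is not the actual JobSubmissionFiles code.

```java
public class OwnerCheck {
    // The same owner test the staging-directory check throws on,
    // extracted for illustration with plain strings.
    static boolean ownerOk(String owner, String currentUser, String realUser) {
        return owner.equals(currentUser) || owner.equals(realUser);
    }

    public static void main(String[] args) {
        // On Linux, the local filesystem reports the submitting user as owner.
        System.out.println(ownerOk("alice", "alice", "alice"));          // true
        // On Windows, an admin user's directory may be owned by the group,
        // so neither the submitter nor the real user matches.
        System.out.println(ownerOk("Administrators", "alice", "alice")); // false
    }
}
```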
[jira] [Updated] (MAPREDUCE-5367) Local jobs all use same local working directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5367: -- Affects Version/s: (was: 2.0.5-alpha) 1.2.0 Local jobs all use same local working directory --- Key: MAPREDUCE-5367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5367 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza This means that local jobs, even in different JVMs, can't run concurrently because they might delete each other's files during work directory setup.
[jira] [Assigned] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA reassigned MAPREDUCE-5363: Assignee: Akira AJISAKA Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED
[jira] [Updated] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5355: - Hadoop Flags: Reviewed +1 for the patch. Thanks for addressing the formatting. I'll commit this.
[jira] [Commented] (MAPREDUCE-5317) Stale files left behind for failed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698366#comment-13698366 ] Jason Lowe commented on MAPREDUCE-5317: --- Thanks for the update, Ravi. Are we pushing the JOB_WAIT_TIMEOUT to another JIRA? I didn't see that addressed. A few more comments: * Why does FAIL_WAIT ignore the JOB_COMMIT_COMPLETED/JOB_COMMIT_FAILED events? I don't see how those events could arrive in this state, as it would require the committer to have been invoked sometime before entering this state. Maybe I'm missing a scenario where that does occur? KILL_WAIT doesn't do this, for example, so it seems we should either not need this in FAIL_WAIT or KILL_WAIT also needs it. * In the testcase, it's using AsyncDispatcher yet checking immediately after handling an event that the committer has not been invoked. This is inherently racy due to the nature of AsyncDispatcher. Couple of options to fix it: ** Use InlineDispatcher or DrainDispatcher and call drain() (the latter is still technically a bit racy but the window is much smaller) ** Rather than checking the committer directly, spy/mock the event handler and verify after the event was handled that we didn't try to dispatch a committer event * Nit: rather than explicitly waiting a hardcoded duration in the test case, we might be able to use verify with a timeout so we don't have to wait the full duration under normal test conditions. Stale files left behind for failed jobs --- Key: MAPREDUCE-5317 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5317 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.8 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: MAPREDUCE-5317.branch-0.23.patch, MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, MAPREDUCE-5317.patch Courtesy [~amar_kamat]! {quote} We are seeing _temporary files left behind in the output folder if the job fails. 
The jobs failed due to hitting a quota issue. I simply ran the randomwriter (from hadoop examples) with the default setting. That failed and left behind some stray files. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
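The last review point above — using verify with a timeout instead of sleeping for a hardcoded duration — can be sketched in plain Java. This is a minimal stand-in for Mockito's {{verify(mock, timeout(...))}}; the class and method names below are hypothetical, not from the patch.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class WaitForDemo {
    // Polls a condition until it holds or the deadline passes, instead of
    // sleeping for the full hardcoded duration on every test run.
    static boolean waitFor(BooleanSupplier condition, long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(10);  // short poll interval; we return early once the condition holds
        }
        return condition.getAsBoolean();
    }

    public static void main(String[] args) throws Exception {
        AtomicBoolean committerInvoked = new AtomicBoolean(false);
        // Simulate an async dispatcher flipping the flag shortly after the event is handled.
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            committerInvoked.set(true);
        }).start();
        // Returns as soon as the flag flips rather than waiting the full 5 seconds.
        System.out.println(waitFor(committerInvoked::get, 5000));
    }
}
```

Under normal conditions the wait ends in tens of milliseconds; the full timeout is paid only when the test is actually failing.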
[jira] [Updated] (MAPREDUCE-5359) JobHistory should not use File.separator to match timestamp in path
[ https://issues.apache.org/jira/browse/MAPREDUCE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5359: - Hadoop Flags: Reviewed +1 for the patch. I'll commit this. JobHistory should not use File.separator to match timestamp in path --- Key: MAPREDUCE-5359 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5359 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5359-trunk.2.patch, MAPREDUCE-5359-trunk.patch In the {{HistoryFileManager.getTimestampPartFromPath()}} method, we use the following regular expression to match the timestamp in a Path object. {code:java} "\\d{4}" + "\\" + File.separator + "\\d{2}" + "\\" + File.separator + "\\d{2}" {code} This is incorrect because Path uses a forward slash even for Windows paths, while File.separator is platform dependent and is a backslash on Windows. This leads to a failure to match the timestamp on Windows. One consequence is that {{addDirectoryToSerialNumberIndex()}} also fails. Later, {{getFileInfo()}} will fail if the job info is not in the cache or the intermediate directory. The test case {{TestJobHistoryParsing.testScanningOldDirs()}} tests exactly the above scenario and fails on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
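A minimal sketch of the fix described above: since Hadoop's Path always uses forward slashes, the timestamp pattern can hardcode {{/}} instead of splicing in the platform-dependent {{File.separator}}. The class, constant, and sample path below are illustrative, not the actual {{HistoryFileManager}} code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimestampRegexDemo {
    // Hadoop Path strings use "/" on every platform, so the pattern hardcodes "/"
    // rather than File.separator (which is "\" on Windows and would never match).
    static final Pattern TIMESTAMP_PART =
        Pattern.compile("\\d{4}" + "/" + "\\d{2}" + "/" + "\\d{2}");

    // Returns the yyyy/MM/dd portion of a history path, or null if absent.
    static String extractTimestampPart(String path) {
        Matcher m = TIMESTAMP_PART.matcher(path);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Matches on Windows and Unix alike, because Path separators are identical.
        System.out.println(extractTimestampPart("/mapred/history/done/2013/07/02/000000"));
    }
}
```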
[jira] [Commented] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698378#comment-13698378 ] Hadoop QA commented on MAPREDUCE-5355: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590530/MAPREDUCE-5355-trunk.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3823//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3823//console This message is automatically generated. MiniMRYarnCluster with localFs does not work on Windows --- Key: MAPREDUCE-5355 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5355-branch-2.patch, MAPREDUCE-5355-trunk.2.patch, MAPREDUCE-5355-trunk.patch When MiniMRYarnCluster configured to run on localFs instead of remoteFs, i.e. 
MiniDFSCluster, the job will fail on Windows. The error message looks like the following. {noformat} java.io.IOException: Job status not available {noformat} In my testing, the following unit tests hit this exception. * TestMRJobsWithHistoryService * TestClusterMRNotification * TestJobCleanup * TestJobCounters * TestMiniMRClientCluster * TestJobOutputCommitter * TestMRAppWithCombiner * TestMROldApiJobs * TestSpeculativeExecution -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-5372) ControlledJob#getMapredJobID capitalization is inconsistent between MR1 and MR2
Sandy Ryza created MAPREDUCE-5372: - Summary: ControlledJob#getMapredJobID capitalization is inconsistent between MR1 and MR2 Key: MAPREDUCE-5372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5372 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Sandy Ryza In MR2, the 'd' in Id is lowercase, but in MR1, it is capitalized. While ControlledJob is marked as Evolving, there is no reason to be inconsistent here. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698388#comment-13698388 ] Hudson commented on MAPREDUCE-5355: --- Integrated in Hadoop-trunk-Commit #4035 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4035/]) MAPREDUCE-5355. MiniMRYarnCluster with localFs does not work on Windows. Contributed by Chuan Liu. (Revision 1499148) Result = SUCCESS cnauroth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1499148 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java MiniMRYarnCluster with localFs does not work on Windows --- Key: MAPREDUCE-5355 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5355-branch-2.patch, MAPREDUCE-5355-trunk.2.patch, MAPREDUCE-5355-trunk.patch When MiniMRYarnCluster is configured to run on localFs instead of remoteFs, i.e. MiniDFSCluster, the job will fail on Windows. The error message looks like the following. {noformat} java.io.IOException: Job status not available {noformat} In my testing, the following unit tests hit this exception. * TestMRJobsWithHistoryService * TestClusterMRNotification * TestJobCleanup * TestJobCounters * TestMiniMRClientCluster * TestJobOutputCommitter * TestMRAppWithCombiner * TestMROldApiJobs * TestSpeculativeExecution -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698391#comment-13698391 ] Chuan Liu commented on MAPREDUCE-5357: -- Hi Chris, the following tests fail on my machine. * TestClusterMRNotification * TestJobCleanup * TestJobCounters * TestJobOutputCommitter * TestMROldApiJobs * TestSpeculativeExecution You may need to delete the staging directory on your local drive to repro the failure. The user running the tests needs to be an Administrators group user. The error message looks like the following for me. {noformat} testSpeculativeExecution(org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution) Time elapsed: 22109 sec ERROR! java.io.IOException: The ownership on the staging directory E:/tmp/hadoop-yarn/staging/chuanliu/.staging is not as expected. It is owned by Administrators. The directory must be owned by the submitter chuanliu or by chuanliu {noformat} Job staging directory owner checking could fail on Windows -- Key: MAPREDUCE-5357 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5357-trunk.patch In {{JobSubmissionFiles.getStagingDir()}}, we have the following code that will throw an exception if the directory owner is not the current user. {code:java} String owner = fsStatus.getOwner(); if (!(owner.equals(currentUser) || owner.equals(realUser))) { throw new IOException("The ownership on the staging directory " + stagingArea + " is not as expected. " + "It is owned by " + owner + ". The directory must " + "be owned by the submitter " + currentUser + " or " + "by " + realUser); } {code} This check will fail on Windows when the underlying file system is LocalFileSystem, because on Windows the default file or directory owner could be the Administrators group if the user belongs to the Administrators group. Quite a few MR unit tests that run an MR mini cluster with localFs as the underlying file system fail because of this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
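The ownership check quoted above can be isolated as a small sketch; calling it with an owner of Administrators reproduces the failure mode described. The class and method names are hypothetical stand-ins for {{JobSubmissionFiles.getStagingDir()}}, not the actual Hadoop code.

```java
import java.io.IOException;

public class StagingDirOwnerCheck {
    // Reject the staging directory unless it is owned by the submitting user
    // or the real user. On a Windows LocalFileSystem the reported owner can be
    // the "Administrators" group, which makes this check fail.
    static void checkOwner(String owner, String currentUser, String realUser) throws IOException {
        if (!(owner.equals(currentUser) || owner.equals(realUser))) {
            throw new IOException("The ownership on the staging directory is not as expected. "
                + "It is owned by " + owner + ". The directory must be owned by the submitter "
                + currentUser + " or by " + realUser);
        }
    }

    public static void main(String[] args) {
        try {
            // Simulates the Windows case: owner is a group, not the submitter.
            checkOwner("Administrators", "chuanliu", "chuanliu");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```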
[jira] [Commented] (MAPREDUCE-5359) JobHistory should not use File.separator to match timestamp in path
[ https://issues.apache.org/jira/browse/MAPREDUCE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698394#comment-13698394 ] Chris Nauroth commented on MAPREDUCE-5359: -- {quote} -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {quote} This patch fixes existing tests on Windows, so no new tests are needed. JobHistory should not use File.separator to match timestamp in path --- Key: MAPREDUCE-5359 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5359 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5359-trunk.2.patch, MAPREDUCE-5359-trunk.patch In the {{HistoryFileManager.getTimestampPartFromPath()}} method, we use the following regular expression to match the timestamp in a Path object. {code:java} "\\d{4}" + "\\" + File.separator + "\\d{2}" + "\\" + File.separator + "\\d{2}" {code} This is incorrect because Path uses a forward slash even for Windows paths, while File.separator is platform dependent and is a backslash on Windows. This leads to a failure to match the timestamp on Windows. One consequence is that {{addDirectoryToSerialNumberIndex()}} also fails. Later, {{getFileInfo()}} will fail if the job info is not in the cache or the intermediate directory. The test case {{TestJobHistoryParsing.testScanningOldDirs()}} tests exactly the above scenario and fails on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5355: - Resolution: Fixed Fix Version/s: 2.1.0-beta 3.0.0 Target Version/s: 3.0.0, 2.1.0-beta Status: Resolved (was: Patch Available) I committed this to trunk, branch-2, and branch-2.1-beta. Chuan, thank you for your contribution. MiniMRYarnCluster with localFs does not work on Windows --- Key: MAPREDUCE-5355 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Fix For: 3.0.0, 2.1.0-beta Attachments: MAPREDUCE-5355-branch-2.patch, MAPREDUCE-5355-trunk.2.patch, MAPREDUCE-5355-trunk.patch When MiniMRYarnCluster is configured to run on localFs instead of remoteFs, i.e. MiniDFSCluster, the job will fail on Windows. The error message looks like the following. {noformat} java.io.IOException: Job status not available {noformat} In my testing, the following unit tests hit this exception. * TestMRJobsWithHistoryService * TestClusterMRNotification * TestJobCleanup * TestJobCounters * TestMiniMRClientCluster * TestJobOutputCommitter * TestMRAppWithCombiner * TestMROldApiJobs * TestSpeculativeExecution -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5359) JobHistory should not use File.separator to match timestamp in path
[ https://issues.apache.org/jira/browse/MAPREDUCE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698422#comment-13698422 ] Hudson commented on MAPREDUCE-5359: --- Integrated in Hadoop-trunk-Commit #4037 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4037/]) MAPREDUCE-5359. JobHistory should not use File.separator to match timestamp in path. Contributed by Chuan Liu. (Revision 1499153) Result = SUCCESS cnauroth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1499153 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/jobhistory/JobHistoryUtils.java JobHistory should not use File.separator to match timestamp in path --- Key: MAPREDUCE-5359 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5359 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5359-trunk.2.patch, MAPREDUCE-5359-trunk.patch In the {{HistoryFileManager.getTimestampPartFromPath()}} method, we use the following regular expression to match the timestamp in a Path object. {code:java} "\\d{4}" + "\\" + File.separator + "\\d{2}" + "\\" + File.separator + "\\d{2}" {code} This is incorrect because Path uses a forward slash even for Windows paths, while File.separator is platform dependent and is a backslash on Windows. This leads to a failure to match the timestamp on Windows. One consequence is that {{addDirectoryToSerialNumberIndex()}} also fails. Later, {{getFileInfo()}} will fail if the job info is not in the cache or the intermediate directory. The test case {{TestJobHistoryParsing.testScanningOldDirs()}} tests exactly the above scenario and fails on Windows. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5359) JobHistory should not use File.separator to match timestamp in path
[ https://issues.apache.org/jira/browse/MAPREDUCE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5359: - Resolution: Fixed Fix Version/s: 2.1.0-beta 3.0.0 Target Version/s: 3.0.0, 2.1.0-beta Status: Resolved (was: Patch Available) I committed this to trunk, branch-2, and branch-2.1-beta. Thank you to Chuan for contributing this patch. JobHistory should not use File.separator to match timestamp in path --- Key: MAPREDUCE-5359 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5359 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Fix For: 3.0.0, 2.1.0-beta Attachments: MAPREDUCE-5359-trunk.2.patch, MAPREDUCE-5359-trunk.patch In the {{HistoryFileManager.getTimestampPartFromPath()}} method, we use the following regular expression to match the timestamp in a Path object. {code:java} "\\d{4}" + "\\" + File.separator + "\\d{2}" + "\\" + File.separator + "\\d{2}" {code} This is incorrect because Path uses a forward slash even for Windows paths, while File.separator is platform dependent and is a backslash on Windows. This leads to a failure to match the timestamp on Windows. One consequence is that {{addDirectoryToSerialNumberIndex()}} also fails. Later, {{getFileInfo()}} will fail if the job info is not in the cache or the intermediate directory. The test case {{TestJobHistoryParsing.testScanningOldDirs()}} tests exactly the above scenario and fails on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5363: - Status: Patch Available (was: Open) Wrote all values Status enum can take and fixed spelling. Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5363: - Attachment: MAPREDUCE-5363-1.patch Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie Attachments: MAPREDUCE-5363-1.patch The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5363: - Status: Open (was: Patch Available) I forgot to add a patch. Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie Attachments: MAPREDUCE-5363-1.patch The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5363: - Status: Patch Available (was: Open) I added a patch. Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie Attachments: MAPREDUCE-5363-1.patch The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698438#comment-13698438 ] Arun C Murthy commented on MAPREDUCE-5351: -- Duh, good point. Can you please add a comment to the addendum patch explaining the rationale for the check? Also, a test case specifically covering this bug would be good. Thanks! JobTracker memory leak caused by CleanupQueue reopening FileSystem -- Key: MAPREDUCE-5351 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 1.1.2 Reporter: Sandy Ryza Assignee: Sandy Ryza Priority: Critical Fix For: 1.2.1 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, MAPREDUCE-5351-addendum.patch, MAPREDUCE-5351.patch When a job is completed, closeAllForUGI is called to close all the cached FileSystems in the FileSystem cache. However, the CleanupQueue may run after this occurs and call FileSystem.get() to delete the staging directory, adding a FileSystem to the cache that will never be closed. People on the user-list have reported this causing their JobTrackers to OOME every two weeks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
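The leak described above — a FileSystem cached again after {{closeAllForUGI}} has already run — can be modeled with a toy per-user cache. The real {{FileSystem.CACHE}} is keyed by scheme, authority, and UGI rather than a bare user string, but the re-population pattern is the same; everything below is an illustrative sketch, not the Hadoop implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class FsCacheLeakDemo {
    // Toy model of the FileSystem cache: get() caches one instance per user,
    // closeAllForUser() evicts that user's entries on job completion.
    static final Map<String, Object> CACHE = new HashMap<>();

    static Object get(String user) {
        return CACHE.computeIfAbsent(user, u -> new Object());
    }

    static void closeAllForUser(String user) {
        CACHE.remove(user);
    }

    public static void main(String[] args) {
        get("jobUser");              // the job opens a FileSystem
        closeAllForUser("jobUser");  // job completion closes the cached entries
        get("jobUser");              // a late CleanupQueue call re-populates the cache...
        // ...and nothing ever closes this entry again: one leaked entry per job,
        // which is how the JobTracker heap grows until it OOMEs.
        System.out.println(CACHE.size());
    }
}
```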
[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698445#comment-13698445 ] Hadoop QA commented on MAPREDUCE-5363: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590549/MAPREDUCE-5363-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3824//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3824//console This message is automatically generated. 
Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie Attachments: MAPREDUCE-5363-1.patch The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698448#comment-13698448 ] Sandy Ryza commented on MAPREDUCE-5363: --- Thanks for taking this up Akira. Those changes look good to me. Sorry I didn't mention this before, but I think it would also be clearer to replace task tracker status with task completion status. Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie Attachments: MAPREDUCE-5363-1.patch The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5363: - Attachment: MAPREDUCE-5363-2.patch Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698467#comment-13698467 ] Akira AJISAKA commented on MAPREDUCE-5363: -- I agree with your proposal. I attached a patch. Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (MAPREDUCE-4192) the TaskMemoryManager thread is not interrupted when the TaskTracker is ordered to reinit by JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hua xu reassigned MAPREDUCE-4192: - Assignee: Hua xu the TaskMemoryManager thread is not interrupted when the TaskTracker is ordered to reinit by JobTracker - Key: MAPREDUCE-4192 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4192 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.20.2 Reporter: Hua xu Assignee: Hua xu When the TaskTracker is ordered to reinit by the JobTracker, it interrupts some threads and then reinits them, but the TaskTracker does not interrupt the TaskMemoryManager thread before creating a new TaskMemoryManager thread. I used the jstack tool to confirm this (I reinitialized the TaskTracker 3 times by having the JobTracker send TaskTrackerAction.ActionType.REINIT_TRACKER). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
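The reinit behavior described above can be sketched with a plain monitor thread: interrupt the old thread and join it before starting a replacement, so that repeated reinits do not accumulate live threads. The names below are illustrative stand-ins, not the actual TaskTracker or TaskMemoryManager code.

```java
public class ReinitDemo {
    // Stand-in for the TaskMemoryManager monitor loop: runs until interrupted.
    static Thread startMonitor() {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(100);  // periodic memory-check placeholder
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();  // restore status; loop exits
                }
            }
        }, "monitor");
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread monitor = startMonitor();
        // On reinit: interrupt the old monitor and wait for it to die BEFORE
        // creating the replacement. Skipping this step (the reported bug) leaves
        // one extra live monitor thread behind per reinit, visible in jstack.
        monitor.interrupt();
        monitor.join(1000);
        System.out.println(monitor.isAlive());
        Thread replacement = startMonitor();
        replacement.interrupt();
        replacement.join(1000);
    }
}
```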
[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698474#comment-13698474 ] Hadoop QA commented on MAPREDUCE-5363: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590554/MAPREDUCE-5363-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3825//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3825//console This message is automatically generated. 
Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus Key: MAPREDUCE-5363 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1, mrv2 Affects Versions: 1.1.2, 2.1.0-beta Reporter: Sandy Ryza Assignee: Akira AJISAKA Labels: newbie Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is {code} Returns enum Status.SUCESS or Status.FAILURE. @return task tracker status {code} The actual values that the Status enum can take are FAILED, KILLED, SUCCEEDED, OBSOLETE, and TIPFAILED.
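For concreteness, the mismatch can be shown with a local stand-in enum carrying the values the report lists; this is not the actual org.apache.hadoop.mapred.TaskCompletionEvent.Status class, just a self-contained copy of the listed values.

```java
// Stand-in enum with the Status values the report lists as the real ones;
// note that neither SUCESS nor FAILURE (the documented pair) appears.
public class StatusValuesSketch {
    enum Status { FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED }

    public static void main(String[] args) {
        for (Status s : Status.values()) {
            System.out.println(s);
        }
    }
}
```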
[jira] [Updated] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5357: - Target Version/s: 3.0.0, 2.1.0-beta Hadoop Flags: Reviewed Job staging directory owner checking could fail on Windows -- Key: MAPREDUCE-5357 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: MAPREDUCE-5357-trunk.patch In {{JobSubmissionFiles.getStagingDir()}}, we have the following code, which throws an exception if the directory owner is not the current user. {code:java} String owner = fsStatus.getOwner(); if (!(owner.equals(currentUser) || owner.equals(realUser))) { throw new IOException("The ownership on the staging directory " + stagingArea + " is not as expected. " + "It is owned by " + owner + ". The directory must " + "be owned by the submitter " + currentUser + " or " + "by " + realUser); } {code} This check can fail on Windows when the underlying file system is LocalFileSystem, because on Windows the default owner of a file or directory can be the Administrators group when the user belongs to that group. Quite a few MR unit tests that run an MR mini cluster with localFs as the underlying file system fail because of this.
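A minimal sketch (not the committed fix) isolating the ownership predicate quoted above makes the Windows failure mode concrete: when LocalFileSystem reports the Administrators group as the owner, the equality checks fail even for a legitimate admin user. The user names below are made up for illustration.

```java
// Isolates the ownership predicate from getStagingDir() as a pure function
// so the two cases are easy to compare.
public class OwnerCheckSketch {
    static boolean ownerAcceptable(String owner, String currentUser, String realUser) {
        return owner.equals(currentUser) || owner.equals(realUser);
    }

    public static void main(String[] args) {
        // Typical Unix case: the directory owner is the submitting user.
        System.out.println(ownerAcceptable("alice", "alice", "alice"));
        // Windows admin case from the report: ownership resolves to the group.
        System.out.println(ownerAcceptable("Administrators", "alice", "alice"));
    }
}
```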
[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698610#comment-13698610 ] Chris Nauroth commented on MAPREDUCE-5357: -- {quote} The user running the tests need to be an Administrators group user. {quote} This explains why I wasn't seeing the problem earlier. I've been running as a non-admin user. I've verified that the tests still pass when running as a non-admin user. +1 for the patch. I'll commit this.
[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698615#comment-13698615 ] Chris Nauroth commented on MAPREDUCE-5357: -- {quote} -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {quote} The patch fixes multiple existing tests, so no new tests are required.
[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698617#comment-13698617 ] Zhijie Shen commented on MAPREDUCE-5363: The doc fix looks good, but is it more concise to refer to the Status enum as follows: {code} * Returns {@link Status} {code} instead of listing all enum values?
[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698622#comment-13698622 ] Hudson commented on MAPREDUCE-5357: --- Integrated in Hadoop-trunk-Commit #4038 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4038/]) MAPREDUCE-5357. Job staging directory owner checking could fail on Windows. (Revision 1499210) Result = SUCCESS cnauroth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1499210 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmissionFiles.java
[jira] [Updated] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5357: - Resolution: Fixed Status: Resolved (was: Patch Available) I committed this to trunk, branch-2, and branch-2.1-beta. Thank you for contributing the patch, Chuan.
[jira] [Updated] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5357: - Fix Version/s: 2.1.0-beta, 3.0.0