[jira] [Commented] (MAPREDUCE-5364) Deadlock between RenewalTimerTask methods cancel() and run()

2013-07-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697540#comment-13697540
 ] 

Hadoop QA commented on MAPREDUCE-5364:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12590296/mr-5364-1.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3820//console

This message is automatically generated.

 Deadlock between RenewalTimerTask methods cancel() and run()
 

 Key: MAPREDUCE-5364
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5364
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: mr-5364-1.patch


 MAPREDUCE-4860 introduced a local variable {{cancelled}} in 
 {{RenewalTimerTask}} to fix the race where {{DelegationTokenRenewal}} 
 attempts to renew a token even after the job is removed. However, the patch 
 also makes {{run()}} and {{cancel()}} synchronized methods, leading to a 
 potential deadlock against {{run()}}'s catch block (error path).
 The deadlock stack traces are below:
 {noformat}
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.cancel()
  @bci=0, line=240 (Interpreted frame)
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.removeDelegationTokenRenewalForJob(org.apache.hadoop.mapreduce.JobID)
  @bci=109, line=319 (Interpreted frame)
 {noformat}
 {noformat}
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.removeFailedDelegationToken(org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$DelegationTokenToRenew)
  @bci=62, line=297 (Interpreted frame)
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.access$300(org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$DelegationTokenToRenew)
  @bci=1, line=47 (Interpreted frame)
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.run()
  @bci=148, line=234 (Interpreted frame)
 {noformat}
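 A minimal, hypothetical sketch of the lock-ordering problem described above; the class and method names loosely mirror {{DelegationTokenRenewal}}, but this is not the actual Hadoop code:
 {code}
 // Thread A runs the timer task; thread B removes the job's tokens.
 class RenewalTimerTask extends java.util.TimerTask {
   @Override
   public synchronized void run() {          // A: holds this task's monitor
     try {
       renew();                              // assume renewal throws here
     } catch (Exception e) {
       Renewal.removeFailedToken(this);      // A: now waits for Renewal's monitor
     }
   }

   @Override
   public synchronized boolean cancel() {    // needs this task's monitor
     return super.cancel();
   }

   private void renew() throws Exception { throw new Exception("renewal failed"); }
 }

 class Renewal {
   static synchronized void removeFailedToken(RenewalTimerTask t) {
     // cleanup of the failed token would happen here
   }

   static synchronized void removeForJob(RenewalTimerTask t) {  // B: holds Renewal's monitor
     t.cancel();                             // B: waits for the task's monitor -> deadlock
   }
 }
 {code}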

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5365) Set mapreduce.job.classloader to true by default

2013-07-02 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697559#comment-13697559
 ] 

Devaraj K commented on MAPREDUCE-5365:
--

+1, Changes look good to me.

 Set mapreduce.job.classloader to true by default
 

 Key: MAPREDUCE-5365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5365
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.0.5-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: MAPREDUCE-5365.patch


 MAPREDUCE-1700 introduced the mapreduce.job.classloader option, which uses a 
 custom classloader to separate system classes from user classes.  It seems 
 like there are only rare cases when a user would not want this on, and that 
 it should be enabled by default.
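 For reference, a hedged sketch of how a job opts in to the option today; the property name is taken from the issue title, and the job name here is just illustrative:
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.mapreduce.Job;

 Configuration conf = new Configuration();
 // Isolate the job's user classes from the framework's classes.
 conf.setBoolean("mapreduce.job.classloader", true);
 Job job = Job.getInstance(conf, "classloader-isolated-job");  // throws IOException
 {code}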

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress

2013-07-02 Thread zhaoyunjiong (JIRA)
zhaoyunjiong created MAPREDUCE-5368:
---

 Summary: Save memory by  set capacity, load factor and concurrency 
level for ConcurrentHashMap in TaskInProgress
 Key: MAPREDUCE-5368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.2.0
Reporter: zhaoyunjiong
 Fix For: 1.2.1


Below is histo from our JobTracker:

 num #instances #bytes  class name
--
   1: 136048824 11347237456  [C
   2: 124156992 5959535616  
java.util.concurrent.locks.ReentrantLock$NonfairSync
   3: 124156973 5959534704  
java.util.concurrent.ConcurrentHashMap$Segment
   4: 135887753 5435510120  java.lang.String
   5: 124213692 3975044400  
[Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
   6:  63777311 3061310928  java.util.HashMap$Entry
   7:  35038252 2803060160  java.util.TreeMap
   8:  16921110 2712480072  [Ljava.util.HashMap$Entry;
   9:   4803617 2420449192  [Ljava.lang.Object;
  10:  50392816 2015712640  org.apache.hadoop.mapred.Counters$Counter
  11:   7775438 1181866576  
[Ljava.util.concurrent.ConcurrentHashMap$Segment;
  12:   3882847 1118259936  org.apache.hadoop.mapred.TaskInProgress


ConcurrentHashMap takes more than 14G (5959535616 + 5959534704 + 3975044400).
The trouble makers are the following lines in TaskInProgress.java:
  Map<TaskAttemptID, Locality> taskLocality = 
      new ConcurrentHashMap<TaskAttemptID, Locality>();
  Map<TaskAttemptID, Avataar> taskAvataar = 
      new ConcurrentHashMap<TaskAttemptID, Avataar>();
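
A hedged sketch of the kind of change proposed (the exact values are assumptions, not necessarily what the attached patch uses): a TaskInProgress only ever tracks a handful of attempts, so sizing the maps for that instead of the ConcurrentHashMap defaults (initial capacity 16, load factor 0.75, concurrency level 16) avoids allocating 16 segments and a 16-slot table per map:
{code}
// Hypothetical sizing; the actual values in MAPREDUCE-5368.patch may differ.
Map<TaskAttemptID, Locality> taskLocality =
    new ConcurrentHashMap<TaskAttemptID, Locality>(4, 0.75f, 1);
Map<TaskAttemptID, Avataar> taskAvataar =
    new ConcurrentHashMap<TaskAttemptID, Avataar>(4, 0.75f, 1);
{code}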


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress

2013-07-02 Thread zhaoyunjiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaoyunjiong updated MAPREDUCE-5368:


Attachment: MAPREDUCE-5368.patch

This simple patch can save more than 10GB when there are 4m TaskInProgress 
instances.


 Save memory by  set capacity, load factor and concurrency level for 
 ConcurrentHashMap in TaskInProgress
 ---

 Key: MAPREDUCE-5368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.2.0
Reporter: zhaoyunjiong
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5368.patch


 Below is histo from our JobTracker:
  num #instances #bytes  class name
 --
1: 136048824 11347237456  [C
2: 124156992 5959535616  
 java.util.concurrent.locks.ReentrantLock$NonfairSync
3: 124156973 5959534704  
 java.util.concurrent.ConcurrentHashMap$Segment
4: 135887753 5435510120  java.lang.String
5: 124213692 3975044400  
 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
6:  63777311 3061310928  java.util.HashMap$Entry
7:  35038252 2803060160  java.util.TreeMap
8:  16921110 2712480072  [Ljava.util.HashMap$Entry;
9:   4803617 2420449192  [Ljava.lang.Object;
   10:  50392816 2015712640  org.apache.hadoop.mapred.Counters$Counter
   11:   7775438 1181866576  
 [Ljava.util.concurrent.ConcurrentHashMap$Segment;
   12:   3882847 1118259936  org.apache.hadoop.mapred.TaskInProgress
 ConcurrentHashMap takes more than 14G (5959535616 + 5959534704 + 3975044400).
 The trouble makers are the following lines in TaskInProgress.java:
   Map<TaskAttemptID, Locality> taskLocality = 
       new ConcurrentHashMap<TaskAttemptID, Locality>();
   Map<TaskAttemptID, Avataar> taskAvataar = 
       new ConcurrentHashMap<TaskAttemptID, Avataar>();

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (MAPREDUCE-2036) Enable Erasure Code in Tool similar to Hadoop Archive

2013-07-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

李志然 reassigned MAPREDUCE-2036:
--

Assignee: 李志然  (was: Wittawat Tantisiriroj)

 Enable Erasure Code in Tool similar to Hadoop Archive
 -

 Key: MAPREDUCE-2036
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2036
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/raid, harchive
Reporter: Wittawat Tantisiriroj
Assignee: 李志然
Priority: Minor
 Attachments: hdfs-raid.tar.gz, MAPREDUCE-2036.patch, RaidTool.pdf


 Features:
 1) HAR-like Tool
 2) RAID5/RAID6 & pluggable interface to implement additional coding
 3) Enable to group blocks across files
 4) Portable across cluster since all necessary metadata is embedded
 While it was developed separately from HAR or RAID due to time constraints, 
 it would make sense to integrate with either of them.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2013-07-02 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-3193:
-

Status: Open  (was: Patch Available)

 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.0.0-alpha, 0.23.2, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, 
 MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch


 java.io.FileNotFoundException is thrown if the input file is more than one folder 
 level deep, and the job fails.
 Example: Input file is /r1/r2/input.txt
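 For context, a hedged sketch of a job that hits this layout; the recursive-listing property name is an assumption based on the patch and may differ from what was ultimately committed:
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.Job;
 import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

 Configuration conf = new Configuration();
 Job job = Job.getInstance(conf, "nested-input");
 // The input root contains a nested file at /r1/r2/input.txt.
 FileInputFormat.addInputPath(job, new Path("/r1"));
 // Assumed property added by the fix to opt in to recursive directory listing.
 job.getConfiguration().setBoolean(
     "mapreduce.input.fileinputformat.input.dir.recursive", true);
 {code}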

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2013-07-02 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-3193:
-

Attachment: MAPREDUCE-3193-5.patch

 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, 
 MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch


 java.io.FileNotFoundException is thrown if the input file is more than one folder 
 level deep, and the job fails.
 Example: Input file is /r1/r2/input.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2013-07-02 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-3193:
-

Status: Patch Available  (was: Open)

Thanks Amar and Jason.

I have updated the patch with the suggested change, Please review this.

 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 2.0.0-alpha, 0.23.2, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, 
 MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch


 java.io.FileNotFoundException is thrown if the input file is more than one folder 
 level deep, and the job fails.
 Example: Input file is /r1/r2/input.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5369) Progress for jobs with multiple splits in local mode is wrong

2013-07-02 Thread Johannes Zillmann (JIRA)
Johannes Zillmann created MAPREDUCE-5369:


 Summary: Progress for jobs with multiple splits in local mode is 
wrong
 Key: MAPREDUCE-5369
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5369
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2
Reporter: Johannes Zillmann


In case a job with multiple splits is executed in local mode (LocalJobRunner) 
its progress calculation is wrong.
After the first split is processed it jumps to 100%, then back to 50% and so 
on. 

The reason lies in the progress calculation in LocalJobRunner:
{code}
  float taskIndex = mapIds.indexOf(taskId);
  if (taskIndex >= 0) {   // mapping
float numTasks = mapIds.size();
status.setMapProgress(taskIndex/numTasks + 
taskStatus.getProgress()/numTasks);
  } else {
status.setReduceProgress(taskStatus.getProgress());
  }
{code}

The problem is that {{mapIds}} is filled lazily in run(). There is a loop over 
all splits. In the loop, the split's task id is added to {{mapIds}}, then the 
split is processed. That means {{numTasks}} is 1 while the first split is 
processed, 2 while the second split is processed, and so on...

I tried Hadoop 0.20.2, 1.0.3, 1.1.2 and cdh-4.1. All the same behaviour!
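
A hedged sketch of one way to make the calculation stable (not the actual fix): divide by the total number of map tasks, which is known once the splits have been computed, instead of by the lazily-growing {{mapIds}} list:
{code}
// numMapTasks is assumed to be set from the number of splits before the loop starts.
float numMapTasks = totalNumberOfSplits;
float taskIndex = mapIds.indexOf(taskId);
if (taskIndex >= 0) {   // mapping
  status.setMapProgress(taskIndex / numMapTasks
      + taskStatus.getProgress() / numMapTasks);
} else {
  status.setReduceProgress(taskStatus.getProgress());
}
{code}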

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2013-07-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697621#comment-13697621
 ] 

Hadoop QA commented on MAPREDUCE-3193:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12590385/MAPREDUCE-3193-5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3821//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3821//console

This message is automatically generated.

 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, 
 MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch


 java.io.FileNotFoundException is thrown if the input file is more than one folder 
 level deep, and the job fails.
 Example: Input file is /r1/r2/input.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5369) Progress for jobs with multiple splits in local mode is wrong

2013-07-02 Thread Johannes Zillmann (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697626#comment-13697626
 ] 

Johannes Zillmann commented on MAPREDUCE-5369:
--

Update: tested this with cdh4.3 and the problem does not exist anymore in that 
version. So it might have been fixed somewhere along the 2.x branch.

 Progress for jobs with multiple splits in local mode is wrong
 -

 Key: MAPREDUCE-5369
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5369
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2
Reporter: Johannes Zillmann

 In case a job with multiple splits is executed in local mode (LocalJobRunner) 
 its progress calculation is wrong.
 After the first split is processed it jumps to 100%, then back to 50% and so 
 on. 
 The reason lies in the progress calculation in LocalJobRunner:
 {code}
   float taskIndex = mapIds.indexOf(taskId);
   if (taskIndex >= 0) {   // mapping
 float numTasks = mapIds.size();
 status.setMapProgress(taskIndex/numTasks + 
 taskStatus.getProgress()/numTasks);
   } else {
 status.setReduceProgress(taskStatus.getProgress());
   }
 {code}
 The problem is that {{mapIds}} is filled lazily in run(). There is a loop 
 over all splits. In the loop, the split's task id is added to {{mapIds}}, then 
 the split is processed. That means {{numTasks}} is 1 while the first split is 
 processed, 2 while the second split is processed, and so on...
 I tried Hadoop 0.20.2, 1.0.3, 1.1.2 and cdh-4.1. All the same behaviour!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5364) Deadlock between RenewalTimerTask methods cancel() and run()

2013-07-02 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697799#comment-13697799
 ] 

Karthik Kambatla commented on MAPREDUCE-5364:
-

The patch doesn't apply because it is for branch-1.

 Deadlock between RenewalTimerTask methods cancel() and run()
 

 Key: MAPREDUCE-5364
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5364
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: mr-5364-1.patch


 MAPREDUCE-4860 introduced a local variable {{cancelled}} in 
 {{RenewalTimerTask}} to fix the race where {{DelegationTokenRenewal}} 
 attempts to renew a token even after the job is removed. However, the patch 
 also makes {{run()}} and {{cancel()}} synchronized methods, leading to a 
 potential deadlock against {{run()}}'s catch block (error path).
 The deadlock stack traces are below:
 {noformat}
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.cancel()
  @bci=0, line=240 (Interpreted frame)
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.removeDelegationTokenRenewalForJob(org.apache.hadoop.mapreduce.JobID)
  @bci=109, line=319 (Interpreted frame)
 {noformat}
 {noformat}
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.removeFailedDelegationToken(org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$DelegationTokenToRenew)
  @bci=62, line=297 (Interpreted frame)
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal.access$300(org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$DelegationTokenToRenew)
  @bci=1, line=47 (Interpreted frame)
  - 
 org.apache.hadoop.mapreduce.security.token.DelegationTokenRenewal$RenewalTimerTask.run()
  @bci=148, line=234 (Interpreted frame)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress

2013-07-02 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697850#comment-13697850
 ] 

Karthik Kambatla commented on MAPREDUCE-5368:
-

The patch is a definite improvement over the current situation. As in branch-2, 
can we get rid of the ConcurrentHashMaps altogether, and move Locality and 
Avataar to TaskAttemptID itself? 

 Save memory by  set capacity, load factor and concurrency level for 
 ConcurrentHashMap in TaskInProgress
 ---

 Key: MAPREDUCE-5368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.2.0
Reporter: zhaoyunjiong
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5368.patch


 Below is histo from our JobTracker:
  num #instances #bytes  class name
 --
1: 136048824 11347237456  [C
2: 124156992 5959535616  
 java.util.concurrent.locks.ReentrantLock$NonfairSync
3: 124156973 5959534704  
 java.util.concurrent.ConcurrentHashMap$Segment
4: 135887753 5435510120  java.lang.String
5: 124213692 3975044400  
 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
6:  63777311 3061310928  java.util.HashMap$Entry
7:  35038252 2803060160  java.util.TreeMap
8:  16921110 2712480072  [Ljava.util.HashMap$Entry;
9:   4803617 2420449192  [Ljava.lang.Object;
   10:  50392816 2015712640  org.apache.hadoop.mapred.Counters$Counter
   11:   7775438 1181866576  
 [Ljava.util.concurrent.ConcurrentHashMap$Segment;
   12:   3882847 1118259936  org.apache.hadoop.mapred.TaskInProgress
 ConcurrentHashMap takes more than 14G (5959535616 + 5959534704 + 3975044400).
 The trouble makers are the following lines in TaskInProgress.java:
   Map<TaskAttemptID, Locality> taskLocality = 
       new ConcurrentHashMap<TaskAttemptID, Locality>();
   Map<TaskAttemptID, Avataar> taskAvataar = 
       new ConcurrentHashMap<TaskAttemptID, Avataar>();

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2013-07-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697951#comment-13697951
 ] 

Jason Lowe commented on MAPREDUCE-3193:
---

+1, will commit later today

 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, 
 MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch


 java.io.FileNotFoundException is thrown if the input file is more than one folder 
 level deep, and the job fails.
 Example: Input file is /r1/r2/input.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Arpit Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Gupta reopened MAPREDUCE-5351:



Reopening, as with this fix we are seeing jobs fail with the following exception:

{code}
13/07/02 16:06:57 DEBUG mapred.JobClient: Printing tokens for job: 
job_201307020820_0012
13/07/02 16:06:57 DEBUG ipc.Client: IPC Client (47) connection to host/ip:50300 
from hortonar sending #32
13/07/02 16:06:57 DEBUG ipc.Client: IPC Client (47) connection to host/ip:50300 
from hortonar got value #32
13/07/02 16:06:57 DEBUG retry.RetryUtils: RETRY 0) policy=TryOnceThenFail, 
exception=org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1438)

13/07/02 16:06:57 INFO mapred.JobClient: Cleaning up the staging area 
hdfs://host:8020/user/hortonar/.staging/job_201307020820_0012
13/07/02 16:06:57 ERROR security.UserGroupInformation: 
PriviledgedActionException as:hortonar 
cause:org.apache.hadoop.ipc.RemoteException: java.io.IOException: Filesystem 
closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1438)

org.apache.hadoop.ipc.RemoteException: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
at 

[jira] [Commented] (MAPREDUCE-5368) Save memory by set capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress

2013-07-02 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698042#comment-13698042
 ] 

Mayank Bansal commented on MAPREDUCE-5368:
--

The default value of the load factor is .75 anyway.
The default concurrency level is 16, which I think is reasonable for the jobs.
The default initial capacity is also 16, which is also reasonable.

I am not sure how we are saving memory here. Can you please explain a bit?

Moreover, I really don't think we should set the concurrency level so low, as it 
will increase contention among the threads a lot.

Thoughts?

Thanks,
Mayank


 Save memory by  set capacity, load factor and concurrency level for 
 ConcurrentHashMap in TaskInProgress
 ---

 Key: MAPREDUCE-5368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5368
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.2.0
Reporter: zhaoyunjiong
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5368.patch


 Below is histo from our JobTracker:
  num #instances #bytes  class name
 --
1: 136048824 11347237456  [C
2: 124156992 5959535616  
 java.util.concurrent.locks.ReentrantLock$NonfairSync
3: 124156973 5959534704  
 java.util.concurrent.ConcurrentHashMap$Segment
4: 135887753 5435510120  java.lang.String
5: 124213692 3975044400  
 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
6:  63777311 3061310928  java.util.HashMap$Entry
7:  35038252 2803060160  java.util.TreeMap
8:  16921110 2712480072  [Ljava.util.HashMap$Entry;
9:   4803617 2420449192  [Ljava.lang.Object;
   10:  50392816 2015712640  org.apache.hadoop.mapred.Counters$Counter
   11:   7775438 1181866576  
 [Ljava.util.concurrent.ConcurrentHashMap$Segment;
   12:   3882847 1118259936  org.apache.hadoop.mapred.TaskInProgress
 ConcurrentHashMap takes more than 14G (5959535616 + 5959534704 + 3975044400).
 The trouble makers are the following lines in TaskInProgress.java:
   Map<TaskAttemptID, Locality> taskLocality = 
       new ConcurrentHashMap<TaskAttemptID, Locality>();
   Map<TaskAttemptID, Avataar> taskAvataar = 
       new ConcurrentHashMap<TaskAttemptID, Avataar>();

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698043#comment-13698043
 ] 

Sandy Ryza commented on MAPREDUCE-5351:
---

[~arpitgupta], how often is this occurring for you?

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.
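 A hypothetical, simplified sketch of the sequence described above (not the actual JobTracker code; variable names like userUGI, stagingDirPath, and conf are illustrative):
 {code}
 // 1. Job finishes: all FileSystems cached for the job's user are closed.
 FileSystem.closeAllForUGI(userUGI);

 // 2. Later, the asynchronous CleanupQueue deletes the staging directory under
 //    the same UGI. getFileSystem() finds no cached instance, creates a new one,
 //    and caches it again; nothing ever closes it, so the cache grows per job.
 userUGI.doAs(new PrivilegedExceptionAction<Void>() {
   public Void run() throws IOException {
     FileSystem fs = stagingDirPath.getFileSystem(conf);  // re-populates the cache
     fs.delete(stagingDirPath, true);
     return null;
   }
 });
 {code}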

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Arpit Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698055#comment-13698055
 ] 

Arpit Gupta commented on MAPREDUCE-5351:


It's happening very frequently; quite a few MR jobs have failed because of this 
exception. Various example jobs, as well as jobs run through Pig, Oozie, etc., 
have also failed because of it.

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698069#comment-13698069
 ] 

Sandy Ryza commented on MAPREDUCE-5351:
---

Thanks.  I'll look into this.

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698168#comment-13698168
 ] 

Sandy Ryza commented on MAPREDUCE-5351:
---

Attaching an addendum patch that should fix the issue.  I was able to reproduce 
the issue by running pi jobs repeatedly.  With the patch, the exception no 
longer occurs.

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5351:
--

Attachment: MAPREDUCE-5351-addendum.patch

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351-addendum.patch, MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5351:
--

Status: Patch Available  (was: Reopened)

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351-addendum.patch, MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698177#comment-13698177
 ] 

Hadoop QA commented on MAPREDUCE-5351:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12590494/MAPREDUCE-5351-addendum.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3822//console

This message is automatically generated.

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351-addendum.patch, MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth resolved MAPREDUCE-5330.
--

   Resolution: Fixed
Fix Version/s: 1-win

I committed this to branch-1-win.  Xi, thank you for contributing this patch.

 JVM manager should not forcefully kill the process on Signal.TERM on Windows
 

 Key: MAPREDUCE-5330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
 Fix For: 1-win

 Attachments: MAPREDUCE-5330.patch


 In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
 if we want to launch other tasks (look in 
 JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
 task process is in the middle of doing some cleanup/finalization after the 
 task is done, it might be interrupted/killed without being given a chance to finish. 
 In Microsoft's Hadoop service, after a Map/Reduce task is done, while file 
 systems are being closed in a special shutdown hook, we typically upload 
 storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
 this kill happens these metrics get lost. The impact is that for many MR jobs 
 we don't see accurate metrics reported most of the time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows

2013-07-02 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698205#comment-13698205
 ] 

Xi Fang commented on MAPREDUCE-5330:


Thanks Ivan and Chris!

 JVM manager should not forcefully kill the process on Signal.TERM on Windows
 

 Key: MAPREDUCE-5330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
 Fix For: 1-win

 Attachments: MAPREDUCE-5330.patch


 In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
 if we want to launch other tasks (look in 
 JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
 task process is in the middle of doing some cleanup/finalization after the 
 task is done, it might be interrupted/killed without being given a chance to finish. 
 In Microsoft's Hadoop service, after a Map/Reduce task is done, while file 
 systems are being closed in a special shutdown hook, we typically upload 
 storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
 this kill happens these metrics get lost. The impact is that for many MR jobs 
 we don't see accurate metrics reported most of the time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698218#comment-13698218
 ] 

Arun C Murthy commented on MAPREDUCE-5351:
--

[~sandyr] - JIP.cleanupJob always results in PathDeletionContext with ugi being 
null... how does this fix the original problem? I'm missing something? Tx.

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351-addendum.patch, MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698228#comment-13698228
 ] 

Sandy Ryza commented on MAPREDUCE-5351:
---

In JIP.cleanupJob:
{code}
Path tempDir = jobtracker.getSystemDirectoryForJob(getJobID());
CleanupQueue.getInstance().addToQueue(
    new PathDeletionContext(tempDir, conf));

// delete the staging area for the job and cancel delegation token
String jobTempDir = conf.get("mapreduce.job.dir");
if (jobTempDir != null && conf.getKeepTaskFilesPattern() == null &&
    !conf.getKeepFailedTaskFiles()) {
  Path jobTempDirPath = new Path(jobTempDir);
  tempDirFs = jobTempDirPath.getFileSystem(conf);
  CleanupQueue.getInstance().addToQueue(
      new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
}
{code}
The CleanupQueue is used twice, once with the UGI set and once without.

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351-addendum.patch, MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5370) Rolling Restart Tasktrackers from JobTracker

2013-07-02 Thread Cindy Li (JIRA)
Cindy Li created MAPREDUCE-5370:
---

 Summary: Rolling Restart Tasktrackers from JobTracker
 Key: MAPREDUCE-5370
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5370
 Project: Hadoop Map/Reduce
  Issue Type: Task
Reporter: Cindy Li
Priority: Minor


For near real-time jobs running on Hadoop, we want to minimize the impact on them 
when doing a rolling restart of tasktrackers. The idea here is to restart tasktrackers 
from the jobtracker, selecting tasktrackers for the rolling restart according to the 
status of the tasks running on them, so that the impact on total job running time 
is minimal.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-02 Thread Xi Fang (JIRA)
Xi Fang created MAPREDUCE-5371:
--

 Summary: TestProxyUserFromEnv#testProxyUserFromEnvironment failed 
caused by domains of windows users
 Key: MAPREDUCE-5371
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win


The error message was:
Error Message
expected:[sijenkins-vm2]jenkins but was:[]jenkins
Stacktrace
at 
org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)

The root cause of this failure is the domain used on Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-02 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated MAPREDUCE-5371:
---

Attachment: MAPREDUCE-5371.patch

 TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of 
 windows users
 ---

 Key: MAPREDUCE-5371
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5371.patch


 The error message was:
 Error Message
 expected:[sijenkins-vm2]jenkins but was:[]jenkins
 Stacktrace
 at 
 org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)
 The root cause of this failure is the domain used on Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Work started] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-02 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAPREDUCE-5371 started by Xi Fang.

 TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of 
 windows users
 ---

 Key: MAPREDUCE-5371
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5371.patch


 The error message was:
 Error Message
 expected:[sijenkins-vm2]jenkins but was:[]jenkins
 Stacktrace
 at 
 org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)
 The root cause of this failure is the domain used on Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5371) TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of windows users

2013-07-02 Thread Xi Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698240#comment-13698240
 ] 

Xi Fang commented on MAPREDUCE-5371:


The attached patch removed the domains from user names.

 TestProxyUserFromEnv#testProxyUserFromEnvironment failed caused by domains of 
 windows users
 ---

 Key: MAPREDUCE-5371
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5371
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5371.patch


 The error message was:
 Error Message
 expected:[sijenkins-vm2]jenkins but was:[]jenkins
 Stacktrace
 at 
 org.apache.hadoop.security.TestProxyUserFromEnv.testProxyUserFromEnvironment(TestProxyUserFromEnv.java:45)
 The root cause of this failure is the domain used on Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698284#comment-13698284
 ] 

Chris Nauroth commented on MAPREDUCE-5355:
--

Thank you, Chuan.  Nice find!  I verified these tests on Mac and Windows.

The patch has a couple of long lines.  Can you please change the patch so that 
lines wrap at 80 characters?  Otherwise, it looks good.

 MiniMRYarnCluster with localFs does not work on Windows
 ---

 Key: MAPREDUCE-5355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5355-branch-2.patch, MAPREDUCE-5355-trunk.patch


 When MiniMRYarnCluster configured to run on localFs instead of remoteFs, i.e. 
 MiniDFSCluster, the job will fail on Windows. The error message looks like 
 the following.
 {noformat}
 java.io.IOException: Job status not available
 {noformat}
 In my testing, the following unit tests hit this exception.
 * TestMRJobsWithHistoryService
 * TestClusterMRNotification
 * TestJobCleanup
 * TestJobCounters
 * TestMiniMRClientCluster
 * TestJobOutputCommitter
 * TestMRAppWithCombiner
 * TestMROldApiJobs
 * TestSpeculativeExecution

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2013-07-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698305#comment-13698305
 ] 

Hudson commented on MAPREDUCE-3193:
---

Integrated in Hadoop-trunk-Commit #4033 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4033/])
MAPREDUCE-3193. FileInputFormat doesn't read files recursively in the input 
path dir. Contributed by Devaraj K (Revision 1499125)

 Result = SUCCESS
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1499125
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ConfigUtil.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestFileInputFormat.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestFileInputFormat.java


 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, 
 MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch


 java.io.FileNotFoundException is thrown if the input file is more than one folder 
 level deep, and the job fails.
 Example: the input file is /r1/r2/input.txt
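 As a usage illustration, a sketch of how a job might opt into recursive input 
 scanning once this change is in; the property name below is an assumption and 
 should be checked against the committed patch.
 {code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class RecursiveInputSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed property name; verify against the committed FileInputFormat change.
    conf.setBoolean("mapreduce.input.fileinputformat.input.dir.recursive", true);
    Job job = Job.getInstance(conf, "recursive-input-sketch");
    // With the flag on, files under nested folders such as /r1/r2/input.txt are picked up.
    FileInputFormat.addInputPath(job, new Path("/r1"));
  }
}
 {code}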

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

2013-07-02 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-3193:
--

   Resolution: Fixed
Fix Version/s: 0.23.10
   2.3.0
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Devaraj, and to all others who contributed to the review!  I committed 
this to trunk, branch-2, and branch-0.23.

 FileInputFormat doesn't read files recursively in the input path dir
 

 Key: MAPREDUCE-3193
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 0.23.2, 2.0.0-alpha, 3.0.0
Reporter: Ramgopal N
Assignee: Devaraj K
 Fix For: 3.0.0, 2.3.0, 0.23.10

 Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, 
 MAPREDUCE-3193-2.patch, MAPREDUCE-3193-3.patch, MAPREDUCE-3193-4.patch, 
 MAPREDUCE-3193-5.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch


 java.io.FileNotFoundException is thrown,if input file is more than one folder 
 level deep and the job is getting failed.
 Example:Input file is /r1/r2/input.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows

2013-07-02 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated MAPREDUCE-5355:
-

Attachment: MAPREDUCE-5355-trunk.2.patch

Thanks for reviewing, Chris! Attaching a new patch that fixes the long lines.

 MiniMRYarnCluster with localFs does not work on Windows
 ---

 Key: MAPREDUCE-5355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5355-branch-2.patch, 
 MAPREDUCE-5355-trunk.2.patch, MAPREDUCE-5355-trunk.patch


 When MiniMRYarnCluster is configured to run on localFs instead of remoteFs, i.e. 
 MiniDFSCluster, the job will fail on Windows. The error message looks like 
 the following.
 {noformat}
 java.io.IOException: Job status not available
 {noformat}
 In my testing, the following unit tests hit this exception.
 * TestMRJobsWithHistoryService
 * TestClusterMRNotification
 * TestJobCleanup
 * TestJobCounters
 * TestMiniMRClientCluster
 * TestJobOutputCommitter
 * TestMRAppWithCombiner
 * TestMROldApiJobs
 * TestSpeculativeExecution

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698321#comment-13698321
 ] 

Chris Nauroth commented on MAPREDUCE-5357:
--

Hi, Chuan.  Can you list a sample of the tests that were fixed by this patch in 
your environment?  I'd like to take it for a test run.

 Job staging directory owner checking could fail on Windows
 --

 Key: MAPREDUCE-5357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5357-trunk.patch


 In {{JobSubmissionFiles.getStagingDir()}}, we have the following code that throws 
 an exception if the directory owner is not the current user.
 {code:java}
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
      throw new IOException("The ownership on the staging directory " +
          stagingArea + " is not as expected. " +
          "It is owned by " + owner + ". The directory must " +
          "be owned by the submitter " + currentUser + " or " +
          "by " + realUser);
   }
 {code}
 This check will fail on Windows when the underlying file system is 
 LocalFileSystem, because on Windows the default file or directory owner can be 
 the Administrators group if the user belongs to the Administrators group.
 Quite a few MR unit tests that run an MR mini cluster with localFs as the 
 underlying file system fail because of this.
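 For context, a minimal sketch of the kind of relaxation such a fix could apply, 
 assuming the goal is to tolerate the Administrators group as the owner on Windows; 
 the constant and the exact condition below are assumptions, not the contents of 
 the attached patch.
 {code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.util.Shell;

/** Hypothetical sketch only; not the attached patch. */
final class StagingDirOwnerCheckSketch {
  // Assumption: the owner reported by localFs for admin-created directories on Windows.
  private static final String ADMINISTRATORS_GROUP = "Administrators";

  static void check(FileStatus fsStatus, String currentUser, String realUser)
      throws IOException {
    String owner = fsStatus.getOwner();
    boolean ownedBySubmitter = owner.equals(currentUser) || owner.equals(realUser);
    // On Windows the default owner can be the Administrators group when the
    // submitter is an administrator, so tolerate that case in this sketch.
    boolean windowsAdminOwned = Shell.WINDOWS && owner.equals(ADMINISTRATORS_GROUP);
    if (!(ownedBySubmitter || windowsAdminOwned)) {
      throw new IOException("The ownership on the staging directory is not as"
          + " expected. It is owned by " + owner + ".");
    }
  }
}
 {code}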

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5367) Local jobs all use same local working directory

2013-07-02 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated MAPREDUCE-5367:
--

Affects Version/s: (was: 2.0.5-alpha)
   1.2.0

 Local jobs all use same local working directory
 ---

 Key: MAPREDUCE-5367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5367
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 This means that local jobs, even in different JVMs, can't run concurrently 
 because they might delete each other's files during work directory setup.
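 A rough sketch of the direction a fix could take, namely giving each local job its 
 own working directory keyed by job id; the names below are hypothetical.
 {code:java}
import java.io.File;
import java.io.IOException;

/** Hypothetical sketch: per-job local working directories so concurrent jobs don't collide. */
final class LocalWorkDirSketch {
  static File createWorkDir(File localDirRoot, String jobId) throws IOException {
    File workDir = new File(localDirRoot, jobId);   // unique per job
    if (!workDir.mkdirs() && !workDir.isDirectory()) {
      throw new IOException("Could not create per-job work dir " + workDir);
    }
    return workDir;
  }
}
 {code}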

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA reassigned MAPREDUCE-5363:


Assignee: Akira AJISAKA

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie

 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-5355:
-

Hadoop Flags: Reviewed

+1 for the patch.  Thanks for addressing the formatting.  I'll commit this.

 MiniMRYarnCluster with localFs does not work on Windows
 ---

 Key: MAPREDUCE-5355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5355-branch-2.patch, 
 MAPREDUCE-5355-trunk.2.patch, MAPREDUCE-5355-trunk.patch


 When MiniMRYarnCluster is configured to run on localFs instead of remoteFs, i.e. 
 MiniDFSCluster, the job will fail on Windows. The error message looks like 
 the following.
 {noformat}
 java.io.IOException: Job status not available
 {noformat}
 In my testing, the following unit tests hit this exception.
 * TestMRJobsWithHistoryService
 * TestClusterMRNotification
 * TestJobCleanup
 * TestJobCounters
 * TestMiniMRClientCluster
 * TestJobOutputCommitter
 * TestMRAppWithCombiner
 * TestMROldApiJobs
 * TestSpeculativeExecution

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5317) Stale files left behind for failed jobs

2013-07-02 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698366#comment-13698366
 ] 

Jason Lowe commented on MAPREDUCE-5317:
---

Thanks for the update, Ravi.  Are we pushing the JOB_WAIT_TIMEOUT to another 
JIRA?  I didn't see that addressed.  A few more comments:

* Why does FAIL_WAIT ignore the JOB_COMMIT_COMPLETED/JOB_COMMIT_FAILED events?  
I don't see how those events could arrive in this state, as it would require 
the committer to have been invoked sometime before entering this state.  Maybe 
I'm missing a scenario where that does occur?  KILL_WAIT doesn't do this, for 
example, so it seems we should either not need this in FAIL_WAIT or KILL_WAIT 
also needs it.
* In the test case, it's using AsyncDispatcher yet checking, immediately after 
handling an event, that the committer has not been invoked.  This is inherently 
racy due to the asynchronous nature of AsyncDispatcher.  A couple of options to fix it:
** Use InlineDispatcher or DrainDispatcher and call drain() (the latter is 
still technically a bit racy but the window is much smaller)
** Rather than checking the committer directly, spy/mock the event handler and 
verify after the event was handled that we didn't try to dispatch a committer 
event (see the sketch after this list)
* Nit: rather than explicitly waiting a hardcoded duration in the test case, we 
might be able to use verify with a timeout so we don't have to wait the full 
duration under normal test conditions.
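
A rough illustration of the spy/mock-the-handler idea from the list above; the types and names are placeholders rather than the actual test code.

{code:java}
import static org.mockito.Mockito.*;

import org.apache.hadoop.yarn.event.EventHandler;

// Hypothetical sketch: register a mocked committer event handler with the dispatcher,
// drive the job into the state under test (draining a DrainDispatcher), then assert
// that nothing was dispatched to it. For positive expectations,
// verify(mock, timeout(5000)).handle(...) avoids sleeping for a hardcoded duration.
@SuppressWarnings({"unchecked", "rawtypes"})
class NoCommitterDispatchSketch {
  void assertNoCommitterEvent(EventHandler committerHandler) {
    verify(committerHandler, never()).handle(any());
  }
}
{code}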

 Stale files left behind for failed jobs
 ---

 Key: MAPREDUCE-5317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.8
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: MAPREDUCE-5317.branch-0.23.patch, MAPREDUCE-5317.patch, 
 MAPREDUCE-5317.patch, MAPREDUCE-5317.patch, MAPREDUCE-5317.patch


 Courtesy [~amar_kamat]!
 {quote}
 We are seeing _temporary files left behind in the output folder if the job
 fails.
 The jobs failed due to hitting a quota issue.
 I simply ran the randomwriter (from hadoop examples) with the default setting.
 That failed and left behind some stray files.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5359) JobHistory should not use File.separator to match timestamp in path

2013-07-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-5359:
-

Hadoop Flags: Reviewed

+1 for the patch.  I'll commit this.

 JobHistory should not use File.separator to match timestamp in path
 ---

 Key: MAPREDUCE-5359
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5359
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5359-trunk.2.patch, MAPREDUCE-5359-trunk.patch


 In the {{HistoryFileManager.getTimestampPartFromPath()}} method, we use the 
 following regular expression to match the timestamp in a Path object. 
 {code:java}
 "\\d{4}" + "\\" + File.separator + "\\d{2}" + "\\" + File.separator + "\\d{2}"
 {code}
 This is incorrect because Path uses a forward slash even for Windows paths, while 
 File.separator is platform dependent and is a backslash on Windows.
 This leads to a failure to match the timestamp on Windows. One consequence is 
 that {{addDirectoryToSerialNumberIndex()}} also fails. Later, 
 {{getFileInfo()}} will fail if the job info is not in the cache or the 
 intermediate directory.
 The test case {{TestJobHistoryParsing.testScanningOldDirs()}} exercises exactly 
 the above scenario and fails on Windows.
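 For illustration, a minimal sketch of a platform-independent way to build this 
 pattern, using Path's own separator (always a forward slash) rather than 
 File.separator; this is one possible approach, not necessarily the change in the 
 attached patch.
 {code:java}
import java.util.regex.Pattern;
import org.apache.hadoop.fs.Path;

/** Hypothetical sketch: match a yyyy/MM/dd timestamp part in a Path, which always uses '/'. */
final class TimestampDirMatcherSketch {
  // Path.SEPARATOR is "/" on every platform, so no escaping and no File.separator needed.
  private static final Pattern TIMESTAMP_DIR_PATTERN = Pattern.compile(
      "\\d{4}" + Path.SEPARATOR + "\\d{2}" + Path.SEPARATOR + "\\d{2}");

  static boolean containsTimestampPart(Path path) {
    return TIMESTAMP_DIR_PATTERN.matcher(path.toUri().getPath()).find();
  }
}
 {code}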

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows

2013-07-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698378#comment-13698378
 ] 

Hadoop QA commented on MAPREDUCE-5355:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12590530/MAPREDUCE-5355-trunk.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3823//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3823//console

This message is automatically generated.

 MiniMRYarnCluster with localFs does not work on Windows
 ---

 Key: MAPREDUCE-5355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5355-branch-2.patch, 
 MAPREDUCE-5355-trunk.2.patch, MAPREDUCE-5355-trunk.patch


 When MiniMRYarnCluster is configured to run on localFs instead of remoteFs, i.e. 
 MiniDFSCluster, the job will fail on Windows. The error message looks like 
 the following.
 {noformat}
 java.io.IOException: Job status not available
 {noformat}
 In my testing, the following unit tests hit this exception.
 * TestMRJobsWithHistoryService
 * TestClusterMRNotification
 * TestJobCleanup
 * TestJobCounters
 * TestMiniMRClientCluster
 * TestJobOutputCommitter
 * TestMRAppWithCombiner
 * TestMROldApiJobs
 * TestSpeculativeExecution

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-5372) ControlledJob#getMapredJobID capitalization is inconsistent between MR1 and MR2

2013-07-02 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5372:
-

 Summary: ControlledJob#getMapredJobID capitalization is 
inconsistent between MR1 and MR2
 Key: MAPREDUCE-5372
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5372
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza


In MR2, the 'd' in Id is lowercase, but in MR1, it is capitalized.  While 
ControlledJob is marked as Evolving, there is no reason to be inconsistent here.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows

2013-07-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698388#comment-13698388
 ] 

Hudson commented on MAPREDUCE-5355:
---

Integrated in Hadoop-trunk-Commit #4035 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4035/])
MAPREDUCE-5355. MiniMRYarnCluster with localFs does not work on Windows. 
Contributed by Chuan Liu. (Revision 1499148)

 Result = SUCCESS
cnauroth : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1499148
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java


 MiniMRYarnCluster with localFs does not work on Windows
 ---

 Key: MAPREDUCE-5355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5355-branch-2.patch, 
 MAPREDUCE-5355-trunk.2.patch, MAPREDUCE-5355-trunk.patch


 When MiniMRYarnCluster is configured to run on localFs instead of remoteFs, i.e. 
 MiniDFSCluster, the job will fail on Windows. The error message looks like 
 the following.
 {noformat}
 java.io.IOException: Job status not available
 {noformat}
 In my testing, the following unit tests hit this exception.
 * TestMRJobsWithHistoryService
 * TestClusterMRNotification
 * TestJobCleanup
 * TestJobCounters
 * TestMiniMRClientCluster
 * TestJobOutputCommitter
 * TestMRAppWithCombiner
 * TestMROldApiJobs
 * TestSpeculativeExecution

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-07-02 Thread Chuan Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698391#comment-13698391
 ] 

Chuan Liu commented on MAPREDUCE-5357:
--

Hi Chris, the following tests fail on my machine.

* TestClusterMRNotification
* TestJobCleanup
* TestJobCounters
* TestJobOutputCommitter
* TestMROldApiJobs
* TestSpeculativeExecution

You may need to delete the staging directory on your local drive to reproduce the 
failure.
The user running the tests needs to be in the Administrators group.
The error message looks like the following for me.

{noformat}
testSpeculativeExecution(org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution)
  Time elapsed: 22109 sec   ERROR!
java.io.IOException: The ownership on the staging directory 
E:/tmp/hadoop-yarn/staging/chuanliu/.staging is not as expected. It is owned by 
Administrators. The directory must be owned by the submitter chuanliu or by 
chuanliu
{noformat}


 Job staging directory owner checking could fail on Windows
 --

 Key: MAPREDUCE-5357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5357-trunk.patch


 In {{JobSubmissionFiles.getStagingDir()}}, we have the following code that throws 
 an exception if the directory owner is not the current user.
 {code:java}
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
      throw new IOException("The ownership on the staging directory " +
          stagingArea + " is not as expected. " +
          "It is owned by " + owner + ". The directory must " +
          "be owned by the submitter " + currentUser + " or " +
          "by " + realUser);
   }
 {code}
 This check will fail on Windows when the underlying file system is 
 LocalFileSystem, because on Windows the default file or directory owner can be 
 the Administrators group if the user belongs to the Administrators group.
 Quite a few MR unit tests that run an MR mini cluster with localFs as the 
 underlying file system fail because of this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5359) JobHistory should not use File.separator to match timestamp in path

2013-07-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698394#comment-13698394
 ] 

Chris Nauroth commented on MAPREDUCE-5359:
--

{quote}
-1 tests included. The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
{quote}

This patch fixes existing tests on Windows, so no new tests are needed.

 JobHistory should not use File.separator to match timestamp in path
 ---

 Key: MAPREDUCE-5359
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5359
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5359-trunk.2.patch, MAPREDUCE-5359-trunk.patch


 In the {{HistoryFileManager.getTimestampPartFromPath()}} method, we use the 
 following regular expression to match the timestamp in a Path object. 
 {code:java}
 "\\d{4}" + "\\" + File.separator + "\\d{2}" + "\\" + File.separator + "\\d{2}"
 {code}
 This is incorrect because Path uses a forward slash even for Windows paths, while 
 File.separator is platform dependent and is a backslash on Windows.
 This leads to a failure to match the timestamp on Windows. One consequence is 
 that {{addDirectoryToSerialNumberIndex()}} also fails. Later, 
 {{getFileInfo()}} will fail if the job info is not in the cache or the 
 intermediate directory.
 The test case {{TestJobHistoryParsing.testScanningOldDirs()}} exercises exactly 
 the above scenario and fails on Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5355) MiniMRYarnCluster with localFs does not work on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-5355:
-

  Resolution: Fixed
   Fix Version/s: 2.1.0-beta
  3.0.0
Target Version/s: 3.0.0, 2.1.0-beta
  Status: Resolved  (was: Patch Available)

I committed this to trunk, branch-2, and branch-2.1-beta.  Chuan, thank you for 
your contribution.

 MiniMRYarnCluster with localFs does not work on Windows
 ---

 Key: MAPREDUCE-5355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5355
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: MAPREDUCE-5355-branch-2.patch, 
 MAPREDUCE-5355-trunk.2.patch, MAPREDUCE-5355-trunk.patch


 When MiniMRYarnCluster is configured to run on localFs instead of remoteFs, i.e. 
 MiniDFSCluster, the job will fail on Windows. The error message looks like 
 the following.
 {noformat}
 java.io.IOException: Job status not available
 {noformat}
 In my testing, the following unit tests hit this exception.
 * TestMRJobsWithHistoryService
 * TestClusterMRNotification
 * TestJobCleanup
 * TestJobCounters
 * TestMiniMRClientCluster
 * TestJobOutputCommitter
 * TestMRAppWithCombiner
 * TestMROldApiJobs
 * TestSpeculativeExecution

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5359) JobHistory should not use File.separator to match timestamp in path

2013-07-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698422#comment-13698422
 ] 

Hudson commented on MAPREDUCE-5359:
---

Integrated in Hadoop-trunk-Commit #4037 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4037/])
MAPREDUCE-5359. JobHistory should not use File.separator to match timestamp 
in path. Contributed by Chuan Liu. (Revision 1499153)

 Result = SUCCESS
cnauroth : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1499153
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/jobhistory/JobHistoryUtils.java


 JobHistory should not use File.separator to match timestamp in path
 ---

 Key: MAPREDUCE-5359
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5359
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5359-trunk.2.patch, MAPREDUCE-5359-trunk.patch


 In the {{HistoryFileManager.getTimestampPartFromPath()}} method, we use the 
 following regular expression to match the timestamp in a Path object. 
 {code:java}
 "\\d{4}" + "\\" + File.separator + "\\d{2}" + "\\" + File.separator + "\\d{2}"
 {code}
 This is incorrect because Path uses a forward slash even for Windows paths, while 
 File.separator is platform dependent and is a backslash on Windows.
 This leads to a failure to match the timestamp on Windows. One consequence is 
 that {{addDirectoryToSerialNumberIndex()}} also fails. Later, 
 {{getFileInfo()}} will fail if the job info is not in the cache or the 
 intermediate directory.
 The test case {{TestJobHistoryParsing.testScanningOldDirs()}} exercises exactly 
 the above scenario and fails on Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5359) JobHistory should not use File.separator to match timestamp in path

2013-07-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-5359:
-

  Resolution: Fixed
   Fix Version/s: 2.1.0-beta
  3.0.0
Target Version/s: 3.0.0, 2.1.0-beta
  Status: Resolved  (was: Patch Available)

I committed this to trunk, branch-2, and branch-2.1-beta.  Thank you to Chuan 
for contributing this patch.

 JobHistory should not use File.separator to match timestamp in path
 ---

 Key: MAPREDUCE-5359
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5359
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: MAPREDUCE-5359-trunk.2.patch, MAPREDUCE-5359-trunk.patch


 In the {{HistoryFileManager.getTimestampPartFromPath()}} method, we use the 
 following regular expression to match the timestamp in a Path object. 
 {code:java}
 "\\d{4}" + "\\" + File.separator + "\\d{2}" + "\\" + File.separator + "\\d{2}"
 {code}
 This is incorrect because Path uses a forward slash even for Windows paths, while 
 File.separator is platform dependent and is a backslash on Windows.
 This leads to a failure to match the timestamp on Windows. One consequence is 
 that {{addDirectoryToSerialNumberIndex()}} also fails. Later, 
 {{getFileInfo()}} will fail if the job info is not in the cache or the 
 intermediate directory.
 The test case {{TestJobHistoryParsing.testScanningOldDirs()}} exercises exactly 
 the above scenario and fails on Windows.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5363:
-

Status: Patch Available  (was: Open)

Listed all the values the Status enum can take and fixed the spelling.

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie

 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5363:
-

Attachment: MAPREDUCE-5363-1.patch

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5363:
-

Status: Open  (was: Patch Available)

I forgot to add a patch.

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5363:
-

Status: Patch Available  (was: Open)

I added a patch.

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5351) JobTracker memory leak caused by CleanupQueue reopening FileSystem

2013-07-02 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698438#comment-13698438
 ] 

Arun C Murthy commented on MAPREDUCE-5351:
--

Duh, good point.

Can you please add a comment to the addendum patch explaining the rationale for 
the check? Also, a test case specifically covering this bug would be good. 
Thanks!

 JobTracker memory leak caused by CleanupQueue reopening FileSystem
 --

 Key: MAPREDUCE-5351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5351
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Priority: Critical
 Fix For: 1.2.1

 Attachments: MAPREDUCE-5351-1.patch, MAPREDUCE-5351-2.patch, 
 MAPREDUCE-5351-addendum.patch, MAPREDUCE-5351.patch


 When a job is completed, closeAllForUGI is called to close all the cached 
 FileSystems in the FileSystem cache.  However, the CleanupQueue may run after 
 this occurs and call FileSystem.get() to delete the staging directory, adding 
 a FileSystem to the cache that will never be closed.
 People on the user-list have reported this causing their JobTrackers to OOME 
 every two weeks.
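 To make the failure mode concrete, a sketch of one way to delete the staging 
 directory without re-populating the FileSystem cache after closeAllForUGI has run; 
 it uses the non-cached FileSystem.newInstance API and is an illustration, not the 
 committed fix.
 {code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical sketch: avoid FileSystem.get(), which would re-insert a cached entry
// that nothing closes, by using a non-cached instance and closing it explicitly.
final class StagingCleanupSketch {
  static void deleteStagingDir(Path stagingDir, Configuration conf) throws Exception {
    FileSystem fs = FileSystem.newInstance(stagingDir.toUri(), conf);  // bypasses the cache
    try {
      fs.delete(stagingDir, true);
    } finally {
      fs.close();  // nothing is left behind in the FileSystem cache
    }
  }
}
 {code}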

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698445#comment-13698445
 ] 

Hadoop QA commented on MAPREDUCE-5363:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12590549/MAPREDUCE-5363-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3824//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3824//console

This message is automatically generated.

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698448#comment-13698448
 ] 

Sandy Ryza commented on MAPREDUCE-5363:
---

Thanks for taking this up, Akira.  Those changes look good to me.  Sorry I 
didn't mention this before, but I think it would also be clearer to replace 
"task tracker status" with "task completion status".
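
For concreteness, one shape the corrected doc could take with that wording; this is a sketch, not the attached patch.

{code:java}
/** Hypothetical sketch of the suggested javadoc wording; not the attached patch. */
class TaskCompletionEventDocSketch {
  enum Status { FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED }

  private Status status = Status.SUCCEEDED;

  /**
   * Returns the enum Status: FAILED, KILLED, SUCCEEDED, OBSOLETE or TIPFAILED.
   * @return task completion status
   */
  public Status getStatus() {
    return status;
  }
}
{code}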

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated MAPREDUCE-5363:
-

Attachment: MAPREDUCE-5363-2.patch

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698467#comment-13698467
 ] 

Akira AJISAKA commented on MAPREDUCE-5363:
--

I agree with your proposal. I attached a patch.

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (MAPREDUCE-4192) the TaskMemoryManager thread is not interrupt when the TaskTracker is oedered to reinit by JobTracker

2013-07-02 Thread Hua xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua xu reassigned MAPREDUCE-4192:
-

Assignee: Hua xu

 the TaskMemoryManager thread is not interrupt when the TaskTracker is oedered 
 to reinit by JobTracker
 -

 Key: MAPREDUCE-4192
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4192
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.2
Reporter: Hua xu
Assignee: Hua xu

 When the TaskTracker is ordered to reinit by the JobTracker, it interrupts some 
 threads and then reinitializes them, but it does not interrupt the existing 
 TaskMemoryManager thread before creating a new TaskMemoryManager thread. I used 
 jstack to confirm this (I reinitialized the TaskTracker 3 times by having the 
 JobTracker send TaskTrackerAction.ActionType.REINIT_TRACKER).
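 A rough sketch of the missing step the report describes, interrupting the old 
 monitor thread during reinit before creating a replacement; the wiring below is 
 hypothetical.
 {code:java}
/** Hypothetical sketch: interrupt and join the old monitor thread before re-creating it. */
final class MonitorReinitSketch {
  private Thread monitorThread;   // stands in for the TaskMemoryManager thread

  synchronized void reinitMonitor(Runnable newMonitor) throws InterruptedException {
    if (monitorThread != null && monitorThread.isAlive()) {
      monitorThread.interrupt();  // the step the report says is missing
      monitorThread.join(10000);  // bounded wait; the timeout value is an assumption
    }
    monitorThread = new Thread(newMonitor, "TaskMemoryManager");
    monitorThread.start();
  }
}
 {code}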

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698474#comment-13698474
 ] 

Hadoop QA commented on MAPREDUCE-5363:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12590554/MAPREDUCE-5363-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3825//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3825//console

This message is automatically generated.

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-5357:
-

Target Version/s: 3.0.0, 2.1.0-beta
Hadoop Flags: Reviewed

 Job staging directory owner checking could fail on Windows
 --

 Key: MAPREDUCE-5357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5357-trunk.patch


 In {{JobSubmissionFiles.getStagingDir()}}, we have the following code that throws 
 an exception if the directory owner is not the current user.
 {code:java}
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
      throw new IOException("The ownership on the staging directory " +
          stagingArea + " is not as expected. " +
          "It is owned by " + owner + ". The directory must " +
          "be owned by the submitter " + currentUser + " or " +
          "by " + realUser);
   }
 {code}
 This check will fail on Windows when the underlying file system is 
 LocalFileSystem, because on Windows the default file or directory owner can be 
 the Administrators group if the user belongs to the Administrators group.
 Quite a few MR unit tests that run an MR mini cluster with localFs as the 
 underlying file system fail because of this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698610#comment-13698610
 ] 

Chris Nauroth commented on MAPREDUCE-5357:
--

{quote}
The user running the tests needs to be in the Administrators group.
{quote}

This explains why I wasn't seeing the problem earlier.  I've been running as a 
non-admin user.  I've verified that the tests still pass when running as a 
non-admin user.

+1 for the patch.  I'll commit this.


 Job staging directory owner checking could fail on Windows
 --

 Key: MAPREDUCE-5357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5357-trunk.patch


 In {{JobSubmissionFiles.getStagingDir()}}, we have the following code that throws 
 an exception if the directory owner is not the current user.
 {code:java}
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
      throw new IOException("The ownership on the staging directory " +
          stagingArea + " is not as expected. " +
          "It is owned by " + owner + ". The directory must " +
          "be owned by the submitter " + currentUser + " or " +
          "by " + realUser);
   }
 {code}
 This check will fail on Windows when the underlying file system is 
 LocalFileSystem, because on Windows the default file or directory owner can be 
 the Administrators group if the user belongs to the Administrators group.
 Quite a few MR unit tests that run an MR mini cluster with localFs as the 
 underlying file system fail because of this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698615#comment-13698615
 ] 

Chris Nauroth commented on MAPREDUCE-5357:
--

{quote}
-1 tests included. The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
{quote}

The patch fixes multiple existing tests, so no new tests are required.

 Job staging directory owner checking could fail on Windows
 --

 Key: MAPREDUCE-5357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5357-trunk.patch


 In {{JobSubmissionFiles.getStagingDir()}}, we have the following code that throws 
 an exception if the directory owner is not the current user.
 {code:java}
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
      throw new IOException("The ownership on the staging directory " +
          stagingArea + " is not as expected. " +
          "It is owned by " + owner + ". The directory must " +
          "be owned by the submitter " + currentUser + " or " +
          "by " + realUser);
   }
 {code}
 This check will fail on Windows when the underlying file system is 
 LocalFileSystem, because on Windows the default file or directory owner can be 
 the Administrators group if the user belongs to the Administrators group.
 Quite a few MR unit tests that run an MR mini cluster with localFs as the 
 underlying file system fail because of this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus

2013-07-02 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698617#comment-13698617
 ] 

Zhijie Shen commented on MAPREDUCE-5363:


The doc fix looks good, but would it be more concise to refer to the Status enum as 
follows:

{code}
   * Returns {@link Status}
{code}

instead of listing all enum values?

 Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
 

 Key: MAPREDUCE-5363
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1, mrv2
Affects Versions: 1.1.2, 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Akira AJISAKA
  Labels: newbie
 Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch


 The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is
 {code}
 Returns enum Status.SUCESS or Status.FAILURE.
 @return task tracker status
 {code}
 The actual values that the Status enum can take are
 FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-07-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13698622#comment-13698622
 ] 

Hudson commented on MAPREDUCE-5357:
---

Integrated in Hadoop-trunk-Commit #4038 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4038/])
MAPREDUCE-5357. Job staging directory owner checking could fail on Windows. 
(Revision 1499210)

 Result = SUCCESS
cnauroth : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1499210
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmissionFiles.java


 Job staging directory owner checking could fail on Windows
 --

 Key: MAPREDUCE-5357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5357-trunk.patch


 In {{JobSubmissionFiles.getStagingDir()}}, we have following code that will 
 throw exception if the directory owner is not the current user.
 {code:java}
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
  throw new IOException(The ownership on the staging directory  +
   stagingArea +  is not as expected.  +
   It is owned by  + owner + . The directory must  +
   be owned by the submitter  + currentUser +  or  +
   by  + realUser);
   }
 {code}
 This check will fail on Windows when the underlying file system is 
 LocalFileSystem, because on Windows the default file or directory owner 
 could be the Administrators group if the user belongs to the Administrators 
 group. Quite a few MR unit tests that run an MR mini cluster with localFs 
 as the underlying file system fail because of this.
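 To make the failure mode concrete, here is a minimal standalone sketch of the 
 check above; the user name "alice" and the literal owner string 
 "Administrators" are example values for illustration, not taken from the 
 committed patch.
 {code:java}
 import java.io.IOException;

 // Standalone illustration of the strict owner-equality check.
 public class StagingDirOwnerCheckSketch {

   static void checkOwner(String owner, String currentUser, String realUser)
       throws IOException {
     // Same shape of check as in JobSubmissionFiles.getStagingDir().
     if (!(owner.equals(currentUser) || owner.equals(realUser))) {
       throw new IOException("The ownership on the staging directory is not as expected. " +
           "It is owned by " + owner + ". The directory must be owned by the submitter " +
           currentUser + " or by " + realUser);
     }
   }

   public static void main(String[] args) throws IOException {
     checkOwner("alice", "alice", "alice");          // passes: owner matches the submitter
     checkOwner("Administrators", "alice", "alice"); // throws: on Windows, LocalFileSystem may
                                                     // report the Administrators group as owner
   }
 }
 {code}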

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-5357:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I committed this to trunk, branch-2, and branch-2.1-beta.  Thank you for 
contributing the patch, Chuan.

 Job staging directory owner checking could fail on Windows
 --

 Key: MAPREDUCE-5357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: MAPREDUCE-5357-trunk.patch


 In {{JobSubmissionFiles.getStagingDir()}}, we have the following code, which 
 throws an exception if the directory owner is not the current user.
 {code:java}
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
      throw new IOException("The ownership on the staging directory " +
          stagingArea + " is not as expected. " +
          "It is owned by " + owner + ". The directory must " +
          "be owned by the submitter " + currentUser + " or " +
          "by " + realUser);
   }
 {code}
 This check will fail on Windows when the underlying file system is 
 LocalFileSystem, because on Windows the default file or directory owner 
 could be the Administrators group if the user belongs to the Administrators 
 group. Quite a few MR unit tests that run an MR mini cluster with localFs 
 as the underlying file system fail because of this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5357) Job staging directory owner checking could fail on Windows

2013-07-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated MAPREDUCE-5357:
-

Fix Version/s: 2.1.0-beta
   3.0.0

 Job staging directory owner checking could fail on Windows
 --

 Key: MAPREDUCE-5357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: MAPREDUCE-5357-trunk.patch


 In {{JobSubmissionFiles.getStagingDir()}}, we have the following code, which 
 throws an exception if the directory owner is not the current user.
 {code:java}
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
      throw new IOException("The ownership on the staging directory " +
          stagingArea + " is not as expected. " +
          "It is owned by " + owner + ". The directory must " +
          "be owned by the submitter " + currentUser + " or " +
          "by " + realUser);
   }
 {code}
 This check will fail on Windows when the underlying file system is 
 LocalFileSystem, because on Windows the default file or directory owner 
 could be the Administrators group if the user belongs to the Administrators 
 group. Quite a few MR unit tests that run an MR mini cluster with localFs 
 as the underlying file system fail because of this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira