[jira] [Commented] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-07-29 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646575#comment-14646575
 ] 

Ivan Mitic commented on MAPREDUCE-6357:
---

Thanks [~cotedm], please feel free to take it up.

 MultipleOutputs.write() API should document that output committing is not 
 utilized when input path is absolute
 --

 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic

 After spending the afternoon debugging a user job where reduce tasks were 
 failing on retry with the below exception, I think it would be worthwhile to 
 add a note in the MultipleOutputs.write() documentation, saying that absolute 
 paths may cause improper execution of tasks on retry or when MR speculative 
 execution is enabled. 
 {code}
 2015-04-28 23:13:10,452 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: File already exists:wasb://full20150...@bgtstoragefull.blob.core.windows.net/user/hadoop/some/path/block-r-00299.bz2
     at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1354)
     at org.apache.hadoop.fs.azure.NativeAzureFileSystem.create(NativeAzureFileSystem.java:1195)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
     at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:135)
     at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:475)
     at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:433)
     at com.ancestry.bigtree.hadoop.LevelReducer.processValue(LevelReducer.java:91)
     at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:69)
     at com.ancestry.bigtree.hadoop.LevelReducer.reduce(LevelReducer.java:14)
     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
 {code}
 As discussed in MAPREDUCE-3772, when the baseOutputPath passed to 
 MultipleOutputs.write() is an absolute path (or more precisely a path that 
 resolves outside of the job output-dir), the concept of output committing is 
 not utilized. 
 In this case, the user had read through the MultipleOutputs docs and assumed 
 that everything would work, because there are blog posts saying that 
 MultipleOutputs handles output commit. 
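
To make the "resolves outside of the job output-dir" condition concrete, here is a minimal sketch that uses only java.nio.file. It is not Hadoop code: the class OutputPathCheck, the method resolvesInsideOutputDir, and the sample directories are hypothetical, and it assumes Unix-style paths. It only shows how a relative baseOutputPath stays under the output directory while an absolute or parent-relative one does not.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative helper only; not part of Hadoop or MultipleOutputs. */
final class OutputPathCheck {

    /**
     * Returns true when baseOutputPath, resolved against jobOutputDir,
     * still lies under jobOutputDir.
     */
    static boolean resolvesInsideOutputDir(String jobOutputDir, String baseOutputPath) {
        Path outputDir = Paths.get(jobOutputDir).normalize();
        // Path.resolve returns its argument unchanged when the argument is absolute.
        Path resolved = outputDir.resolve(baseOutputPath).normalize();
        return resolved.startsWith(outputDir);
    }

    public static void main(String[] args) {
        String out = "/user/hadoop/out"; // hypothetical job output directory
        // Relative name: stays under the output directory.
        System.out.println(resolvesInsideOutputDir(out, "levels/block"));
        // Absolute path: resolves outside the output directory.
        System.out.println(resolvesInsideOutputDir(out, "/user/hadoop/some/path/block"));
        // Relative path that climbs out of the output directory.
        System.out.println(resolvesInsideOutputDir(out, "../elsewhere/block"));
    }
}
```

A job could run a check like this on its own base paths before calling MultipleOutputs.write(), so that paths which bypass output committing are at least flagged.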



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6357) MultipleOutputs.write() API should document that output committing is not utilized when input path is absolute

2015-05-05 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-6357:
-

 Summary: MultipleOutputs.write() API should document that output 
committing is not utilized when input path is absolute
 Key: MAPREDUCE-6357
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6357
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.6.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic




[jira] [Commented] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-19 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176456#comment-14176456
 ] 

Ivan Mitic commented on MAPREDUCE-5911:
---

Hi Bruno, it should be OK not to include a test case with this change; it's a 
minor fix to the examples.

Will commit the patch shortly. 

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Attachments: HADOOP-5911.patch


 The requirement that the output directory must not yet exist is enforced in 
 {{FileOutputFormat#checkOutputSpecs}}, which throws 
 {{FileAlreadyExistsException}}.  However, terasort uses a specialized output 
 format, {{TeraOutputFormat}}, which is a subclass of {{FileOutputFormat}}.  
 The subclass overrides {{checkOutputSpecs}} but does not re-implement the 
 existence check, so it does not throw {{FileAlreadyExistsException}} when the 
 directory already exists.
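
The override pattern described above can be sketched with plain JDK classes. BaseOutputFormat and SpecializedOutputFormat are hypothetical stand-ins, not the Hadoop classes, and the sketch is not the attached patch: it only shows how an override can keep the inherited existence check by delegating to the parent. Other comments in this thread report that the committed change did not work with the default partitioner and was reverted in favor of MAPREDUCE-4879, so treat this purely as an illustration of the general idea.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a base output format with an existence check. */
class BaseOutputFormat {
    void checkOutputSpecs(Path outputDir) throws IOException {
        if (Files.exists(outputDir)) {
            throw new FileAlreadyExistsException(
                "Output directory " + outputDir + " already exists");
        }
    }
}

/** Hypothetical subclass that overrides the check without losing it. */
class SpecializedOutputFormat extends BaseOutputFormat {
    @Override
    void checkOutputSpecs(Path outputDir) throws IOException {
        // Keep the inherited existence check by delegating to the parent.
        super.checkOutputSpecs(outputDir);
        // Format-specific validation would follow here.
    }
}
```

An override that omits the super call (or an equivalent check) silently drops the guard, which is the situation the description reports.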





[jira] [Updated] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-19 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5911:
--
   Resolution: Fixed
Fix Version/s: 2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2 and branch-2.6.

Thank you Bruno for the contribution!

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Fix For: 2.6.0

 Attachments: HADOOP-5911.patch




[jira] [Commented] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-19 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176570#comment-14176570
 ] 

Ivan Mitic commented on MAPREDUCE-5911:
---

Thank you [~jira.shegalov] for bringing this up. You are right, this won't work 
with the default partitioner. Sorry I wasn't aware of MAPREDUCE-4879. Let me 
take another look and see whether to revert the change that went in or go with 
your patch as an addendum. 

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Fix For: 2.6.0

 Attachments: HADOOP-5911.patch




[jira] [Reopened] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-19 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic reopened MAPREDUCE-5911:
---

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Attachments: HADOOP-5911.patch




[jira] [Updated] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-19 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5911:
--
Fix Version/s: (was: 2.6.0)

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Attachments: HADOOP-5911.patch




[jira] [Commented] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-19 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176581#comment-14176581
 ] 

Ivan Mitic commented on MAPREDUCE-5911:
---

OK, I am going to revert the change given that it does not work and resolve 
this Jira as a duplicate of MAPREDUCE-4879. Let's iterate further on the other 
Jira. Thanks again Gera for catching this.

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Attachments: HADOOP-5911.patch




[jira] [Resolved] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-19 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic resolved MAPREDUCE-5911.
---
Resolution: Duplicate

I reverted the patch from trunk, branch-2 and branch-2.6. Resolving this Jira 
as a dupe of MAPREDUCE-4879; let's iterate on the right fix there.

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Attachments: HADOOP-5911.patch




[jira] [Updated] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-18 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5911:
--
Assignee: Bruno P. Kinoshita  (was: Ivan Mitic)

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Attachments: HADOOP-5911.patch




[jira] [Updated] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-18 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5911:
--
Status: Patch Available  (was: Open)

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Attachments: HADOOP-5911.patch




[jira] [Commented] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-10-18 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176210#comment-14176210
 ] 

Ivan Mitic commented on MAPREDUCE-5911:
---

Hi Bruno, thanks for contributing the patch! Looks good, +1.

Will commit when it comes back with +1 from Jenkins.

 Terasort TeraOutputFormat does not check for output directory existance
 ---

 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Bruno P. Kinoshita
Priority: Minor
 Attachments: HADOOP-5911.patch




[jira] [Created] (MAPREDUCE-5911) Terasort TeraOutputFormat does not check for output directory existance

2014-05-30 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-5911:
-

 Summary: Terasort TeraOutputFormat does not check for output 
directory existance
 Key: MAPREDUCE-5911
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5911
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Reporter: Ivan Mitic
Assignee: Ivan Mitic
Priority: Minor








[jira] [Commented] (MAPREDUCE-5512) TaskTracker hung after failed reconnect to the JobTracker

2013-10-10 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792356#comment-13792356
 ] 

Ivan Mitic commented on MAPREDUCE-5512:
---

Thanks Chris for the review, will commit the patch shortly.

 TaskTracker hung after failed reconnect to the JobTracker
 -

 Key: MAPREDUCE-5512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: hadoop-tasktracker-RD00155DD09100.log, 
 MAPREDUCE-5512.branch-1.patch, tt_Hung.txt


 TaskTracker hung after failed reconnect to the JobTracker. 
 This is the problematic piece of code:
 {code}
 this.distributedCacheManager = new TrackerDistributedCacheManager(
     this.fConf, taskController);
 this.distributedCacheManager.startCleanupThread();

 this.jobClient = (InterTrackerProtocol)
     UserGroupInformation.getLoginUser().doAs(
         new PrivilegedExceptionAction<Object>() {
           public Object run() throws IOException {
             return RPC.waitForProxy(InterTrackerProtocol.class,
                 InterTrackerProtocol.versionID,
                 jobTrackAddr, fConf);
           }
         });
 {code}
 If RPC.waitForProxy() throws, the TrackerDistributedCacheManager cleanup 
 thread is never stopped, and because it is a non-daemon thread it keeps the 
 TaskTracker (TT) process alive indefinitely.
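
The hazard described above can be reproduced in miniature with plain JDK threads. The sketch below is not TaskTracker code: TrackerInitSketch, startCleanupThread, and connect are hypothetical names. It shows two guards side by side: marking the background thread as a daemon, and stopping it when the step that follows its start throws. The second guard is an assumption added for illustration and is not something the issue itself proposes.

```java
import java.io.IOException;

/** Hypothetical miniature of "start a cleanup thread, then connect". */
final class TrackerInitSketch {
    Thread cleanupThread;

    /** Stand-in for a periodic cleanup thread. */
    static Thread startCleanupThread(boolean daemon) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(1000); // pretend to clean up periodically
                }
            } catch (InterruptedException e) {
                // Interrupted: leave the loop and let the thread end.
            }
        }, "cleanup");
        t.setDaemon(daemon); // must be called before start()
        t.start();
        return t;
    }

    /** Stand-in for a connection attempt that fails. */
    static void connect() throws IOException {
        throw new IOException("connect failed");
    }

    void initialize() throws IOException {
        // Guard 1: a daemon thread does not keep the JVM alive on its own.
        cleanupThread = startCleanupThread(true);
        try {
            connect();
        } catch (IOException e) {
            // Guard 2: stop the thread that was started before the failure.
            cleanupThread.interrupt();
            throw e;
        }
    }
}
```

With neither guard in place, a failed connect() would leave a live non-daemon thread behind, and the JVM would not exit when the main thread finished.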





[jira] [Resolved] (MAPREDUCE-5512) TaskTracker hung after failed reconnect to the JobTracker

2013-10-10 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic resolved MAPREDUCE-5512.
---

   Resolution: Fixed
Fix Version/s: 1.3.0
   1-win

Fix committed to branch-1 and branch-1-win. 

 TaskTracker hung after failed reconnect to the JobTracker
 -

 Key: MAPREDUCE-5512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 1-win, 1.3.0

 Attachments: hadoop-tasktracker-RD00155DD09100.log, 
 MAPREDUCE-5512.branch-1.patch, tt_Hung.txt




[jira] [Updated] (MAPREDUCE-5512) TaskTracker hung after failed reconnect to the JobTracker

2013-09-30 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5512:
--

Attachment: MAPREDUCE-5512.branch-1.patch

Attaching the patch.

My proposal for the fix is to make the dist cache cleanup thread a daemon. 
Based on a scan through the code, I think it should be safe to make this change. 

For the unit test, I added a test that validates the list of non-daemon threads. 
This is a more general test case, but I think it will serve well to protect the 
codebase against regressions in this area. I was not able to come up with a 
nice way to simulate the condition from this bug without adding a test hook in 
the production code, so I moved away from that approach (we would have to start 
JT, stop JT, start JT again, which would tell TT to reinit, and then stop JT, 
but the last JT stop must have the right timing and run before TT#initialize() 
executes).
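
A check on non-daemon threads of the kind described above could look roughly like the following. This is a sketch under assumptions, not the test from the attached patch: NonDaemonThreads and its methods are hypothetical names, and the allowed set stands in for whatever whitelist a real test would maintain.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for asserting on the JVM's non-daemon threads. */
final class NonDaemonThreads {

    /** Names of live, non-daemon threads in this JVM. */
    static Set<String> liveNonDaemonThreadNames() {
        Set<String> names = new TreeSet<>();
        // Thread.getAllStackTraces() returns a map keyed by all live threads.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    /** Names of live non-daemon threads that are not in the allowed set. */
    static Set<String> unexpected(Set<String> allowed) {
        Set<String> names = liveNonDaemonThreadNames();
        names.removeAll(allowed);
        return names;
    }
}
```

A test would start the component, call unexpected(...) with its whitelist, and fail if the result is non-empty, which catches any newly introduced non-daemon thread, not only the one from this bug.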

Slightly orthogonally, looking at the list of threads I had to whitelist, there 
might be some other candidate threads that could be made daemons, but I'd 
prefer not to make this change in the context of this Jira.

 TaskTracker hung after failed reconnect to the JobTracker
 -

 Key: MAPREDUCE-5512
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5512
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: hadoop-tasktracker-RD00155DD09100.log, 
 MAPREDUCE-5512.branch-1.patch, tt_Hung.txt




[jira] [Created] (MAPREDUCE-5387) Implement Signal.TERM on Windows

2013-07-14 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-5387:
-

 Summary: Implement Signal.TERM on Windows
 Key: MAPREDUCE-5387
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5387
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 1-win, 2.1.0-beta
Reporter: Ivan Mitic
Assignee: Ivan Mitic


Signal.TERM is currently not supported by Hadoop on the Windows platform. 
Tracking Jira for the problem. 

A couple of things to keep in mind:
 - Support for process groups (JobObjects on Windows)
 - Solution should work for both java and other streaming Hadoop apps

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5387) Implement Signal.TERM on Windows

2013-07-14 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708079#comment-13708079
 ] 

Ivan Mitic commented on MAPREDUCE-5387:
---

Copy-pasting [~cnauroth]'s comment from MAPREDUCE-5330:

{quote}
I came across similar issues while working on the YARN nodemanager changes for 
Windows. Bikas, I agree that this logic doesn't exactly match the meaning of 
SIGTERM. To match SIGTERM, we really need a way for one process to signal 
another process with some graceful shutdown message, and a way for the other 
process to trigger custom code when it receives that message. Unfortunately, 
I'm not aware of anything in the Windows API that provides an exact match. 
Therefore, the logic in this patch seems to be the closest approximation that's 
feasible right now.

To elaborate on this, TerminateProcess immediately kills the target process, 
and there is no way for that process to trap the call and run custom clean-up 
code.

http://msdn.microsoft.com/en-us/library/windows/desktop/ms686714(v=vs.85).aspx

This is much different from Unix signals, which allow the target process to 
install signal handlers to respond gracefully to things like SIGTERM.

There also seems to be some support for programmatically sending Ctrl-C to a 
process and installing a custom handler to respond to it. This would be 
SetConsoleCtrlHandler and GenerateConsoleCtrlEvent. I've heard anecdotally that 
this can be used to create a rough approximation of Unix signals, but I haven't 
tried it myself.

http://msdn.microsoft.com/en-us/library/windows/desktop/ms686016(v=vs.85).aspx

http://msdn.microsoft.com/en-us/library/windows/desktop/ms683155(v=vs.85).aspx

Aside from that, the only other option seems to be for Windows applications to 
roll their own custom IPC protocol (i.e. one process sends another a custom 
graceful shutdown message over a named pipe).

It might be worth pursuing one of these solutions in the long term for absolute 
correctness, but these approaches will require a lot more coding and testing.
{quote}
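
The "custom graceful shutdown message" idea from the quoted comment can be illustrated in-process with plain Java. This is a sketch under assumptions: a BlockingQueue stands in for the named pipe, a thread stands in for the target process, and GracefulShutdownSketch and the SHUTDOWN message are hypothetical. It does not use any Windows API; it only contrasts a cooperative stop that lets cleanup run with a forceful stop that does not.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

/** In-process stand-in for a cooperative shutdown protocol. */
final class GracefulShutdownSketch {
    static final String SHUTDOWN = "SHUTDOWN"; // hypothetical message

    final BlockingQueue<String> channel = new LinkedBlockingQueue<>();
    final AtomicBoolean cleanedUp = new AtomicBoolean(false);

    final Thread worker = new Thread(() -> {
        try {
            while (!SHUTDOWN.equals(channel.take())) {
                // Other messages would be handled here.
            }
            cleanedUp.set(true); // graceful path: cleanup runs before exit
        } catch (InterruptedException e) {
            // Forceful path: no cleanup, comparable to an immediate kill.
        }
    }, "worker");

    /**
     * Asks the worker to stop and waits up to graceMillis (must be > 0);
     * interrupts the worker if the grace period expires.
     * Returns true only if the worker stopped after running its cleanup.
     */
    boolean stop(long graceMillis) throws InterruptedException {
        channel.put(SHUTDOWN);
        worker.join(graceMillis);
        if (worker.isAlive()) {
            worker.interrupt();
            worker.join();
            return false;
        }
        return cleanedUp.get();
    }
}
```

The grace period followed by a forced stop mirrors the usual SIGTERM-then-SIGKILL sequence on Unix, which is the behavior the comment says has no exact counterpart in the Windows API.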

 Implement Signal.TERM on Windows
 

 Key: MAPREDUCE-5387
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5387
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0, 1-win, 2.1.0-beta
Reporter: Ivan Mitic
Assignee: Ivan Mitic



[jira] [Commented] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows

2013-07-14 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708081#comment-13708081
 ] 

Ivan Mitic commented on MAPREDUCE-5330:
---

Chris, Bikas, Xi, I filed a new Jira, MAPREDUCE-5387, to investigate possible 
ways to implement Signal.TERM on Windows. I spent some time investigating this 
a while ago and will try to come up with a proposal in the near term. Chris' 
summary from above gives a good overview of some possible options (I copied it 
into the new Jira). 

 JVM manager should not forcefully kill the process on Signal.TERM on Windows
 

 Key: MAPREDUCE-5330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
 Fix For: 1-win

 Attachments: MAPREDUCE-5330.patch


 In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
 if we want to launch other tasks (look in 
 JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
 task process is in the middle of doing some cleanup/finalization after the 
 task is done, it might be interrupted/killed without being given a chance to 
 finish. 
 In Microsoft's Hadoop Service, after a Map/Reduce task is done and while file 
 systems are being closed in a special shutdown hook, we typically upload 
 storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
 this kill happens, these metrics are lost. The impact is that for many MR jobs 
 we don't see accurate metrics reported most of the time.



[jira] [Updated] (MAPREDUCE-5387) Implement Signal.TERM on Windows

2013-07-14 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5387:
--

Issue Type: Improvement  (was: Bug)

 Implement Signal.TERM on Windows
 

 Key: MAPREDUCE-5387
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5387
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 3.0.0, 1-win, 2.1.0-beta
Reporter: Ivan Mitic
Assignee: Ivan Mitic



[jira] [Commented] (MAPREDUCE-2351) mapred.job.tracker.history.completed.location should support an arbitrary filesystem URI

2013-06-21 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690055#comment-13690055
 ] 

Ivan Mitic commented on MAPREDUCE-2351:
---

bq. Could someone please backport the newly attached patch to branch-1-win?
Chelsey, I committed the patch to branch-1 only, as we'll be merging all 
branch-1 changes to branch-1-win in a day or so, and your patch will be picked 
up.

 mapred.job.tracker.history.completed.location should support an arbitrary 
 filesystem URI
 

 Key: MAPREDUCE-2351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2351
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.23.0, 1-win, 1.3.0
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.23.0, 1.3.0

 Attachments: HADOOP-472.branch-1-win.3.patch, 
 MAPREDUCE-2351.branch-1-win.patch, MAPREDUCE-2351.patch


 Currently, mapred.job.tracker.history.completed.location is resolved relative 
 to the default filesystem. If not set it defaults to history/done in the 
 local log directory. There is no way to set it to another local filesystem 
 location (with a file:// URI) or an arbitrary Hadoop filesystem.
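
The behaviour being asked for can be sketched with java.net.URI alone. This is only an illustration under assumptions: the class name HistoryLocationSketch, the method filesystemFor and the host names are hypothetical, and the real change would go through the Hadoop FileSystem API rather than raw URIs. The idea is that a location without a scheme keeps resolving against the default filesystem, while a location that carries its own scheme selects its own filesystem.

```java
import java.net.URI;

/** Illustrative only: choose a filesystem from the configured location. */
final class HistoryLocationSketch {
    /** Returns the root URI of the filesystem that should hold the location. */
    static URI filesystemFor(String location, URI defaultFs) {
        URI uri = URI.create(location);
        if (uri.getScheme() == null) {
            // No scheme given: fall back to the default filesystem.
            return defaultFs;
        }
        // Scheme given (file://, hdfs://, ...): use that filesystem instead.
        String authority = uri.getAuthority() == null ? "" : uri.getAuthority();
        return URI.create(uri.getScheme() + "://" + authority + "/");
    }
}
```

For example, with a default of hdfs://namenode:8020/, the location history/done stays on HDFS, while file:///var/log/hadoop/history/done maps to file:///. In Hadoop code the equivalent step is usually obtaining the filesystem from the path itself (Path.getFileSystem(conf)) instead of from FileSystem.get(conf).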



[jira] [Commented] (MAPREDUCE-2351) mapred.job.tracker.history.completed.location should support an arbitrary filesystem URI

2013-06-20 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689787#comment-13689787
 ] 

Ivan Mitic commented on MAPREDUCE-2351:
---

Thanks Chelsey for doing the backport. I verified that the new test passes on 
both Windows and Linux. +1 on the patch. Will commit shortly.

 mapred.job.tracker.history.completed.location should support an arbitrary 
 filesystem URI
 

 Key: MAPREDUCE-2351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2351
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.23.0

 Attachments: HADOOP-472.branch-1-win.3.patch, MAPREDUCE-2351.patch


 Currently, mapred.job.tracker.history.completed.location is resolved relative 
 to the default filesystem. If not set it defaults to history/done in the 
 local log directory. There is no way to set it to another local filesystem 
 location (with a file:// URI) or an arbitrary Hadoop filesystem.



[jira] [Updated] (MAPREDUCE-2351) mapred.job.tracker.history.completed.location should support an arbitrary filesystem URI

2013-06-20 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-2351:
--

Attachment: MAPREDUCE-2351.branch-1-win.patch

Chelsey, the name of your patch does not seem valid. You should name it based 
on the Apache Jira id. Attaching the same patch with the right name.

 mapred.job.tracker.history.completed.location should support an arbitrary 
 filesystem URI
 

 Key: MAPREDUCE-2351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2351
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.23.0

 Attachments: HADOOP-472.branch-1-win.3.patch, 
 MAPREDUCE-2351.branch-1-win.patch, MAPREDUCE-2351.patch


 Currently, mapred.job.tracker.history.completed.location is resolved relative 
 to the default filesystem. If not set it defaults to history/done in the 
 local log directory. There is no way to set it to another local filesystem 
 location (with a file:// URI) or an arbitrary Hadoop filesystem.



[jira] [Updated] (MAPREDUCE-2351) mapred.job.tracker.history.completed.location should support an arbitrary filesystem URI

2013-06-20 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-2351:
--

Affects Version/s: 1.3.0
   1-win
   0.23.0

 mapred.job.tracker.history.completed.location should support an arbitrary 
 filesystem URI
 

 Key: MAPREDUCE-2351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2351
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.23.0, 1-win, 1.3.0
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.23.0

 Attachments: HADOOP-472.branch-1-win.3.patch, 
 MAPREDUCE-2351.branch-1-win.patch, MAPREDUCE-2351.patch


 Currently, mapred.job.tracker.history.completed.location is resolved relative 
 to the default filesystem. If not set it defaults to history/done in the 
 local log directory. There is no way to set it to another local filesystem 
 location (with a file:// URI) or an arbitrary Hadoop filesystem.



[jira] [Updated] (MAPREDUCE-2351) mapred.job.tracker.history.completed.location should support an arbitrary filesystem URI

2013-06-20 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-2351:
--

Fix Version/s: 1.3.0

I committed the backport patch to branch-1. Thank you, Chelsey, for the contribution!

 mapred.job.tracker.history.completed.location should support an arbitrary 
 filesystem URI
 

 Key: MAPREDUCE-2351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2351
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.23.0, 1-win, 1.3.0
Reporter: Tom White
Assignee: Tom White
 Fix For: 0.23.0, 1.3.0

 Attachments: HADOOP-472.branch-1-win.3.patch, 
 MAPREDUCE-2351.branch-1-win.patch, MAPREDUCE-2351.patch


 Currently, mapred.job.tracker.history.completed.location is resolved relative 
 to the default filesystem. If not set it defaults to history/done in the 
 local log directory. There is no way to set it to another local filesystem 
 location (with a file:// URI) or an arbitrary Hadoop filesystem.



[jira] [Updated] (MAPREDUCE-5330) Killing M/R JVM's leads to metrics not being uploaded

2013-06-18 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5330:
--

Target Version/s: 1-win

 Killing M/R JVM's leads to metrics not being uploaded
 -

 Key: MAPREDUCE-5330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
 Attachments: MAPREDUCE-5330.patch


 In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
 if we want to launch other tasks (look in 
 JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
 task process is in the middle of doing some cleanup/finalization after the 
 task is done, it might be interrupted/killed without being given a chance to finish. 
 In Microsoft's Hadoop Service, after a Map/Reduce task is done and during 
 closing file systems in a special shutdown hook, we're typically uploading 
 storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
 this kill happens these metrics get lost. The impact is that for many MR jobs 
 we don't see accurate metrics reported most of the time.



[jira] [Updated] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows

2013-06-18 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5330:
--

Summary: JVM manager should not forcefully kill the process on Signal.TERM 
on Windows  (was: Killing M/R JVM's leads to metrics not being uploaded)

 JVM manager should not forcefully kill the process on Signal.TERM on Windows
 

 Key: MAPREDUCE-5330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
 Attachments: MAPREDUCE-5330.patch


 In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
 if we want to launch other tasks (look in 
 JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
 task process is in the middle of doing some cleanup/finalization after the 
 task is done, it might be interrupted/killed without being given a chance to finish. 
 In Microsoft's Hadoop Service, after a Map/Reduce task is done and during 
 closing file systems in a special shutdown hook, we're typically uploading 
 storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
 this kill happens these metrics get lost. The impact is that for many MR jobs 
 we don't see accurate metrics reported most of the time.



[jira] [Commented] (MAPREDUCE-5330) JVM manager should not forcefully kill the process on Signal.TERM on Windows

2013-06-18 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13687387#comment-13687387
 ] 

Ivan Mitic commented on MAPREDUCE-5330:
---

Thanks Xi for the patch, looks good to me, +1

 JVM manager should not forcefully kill the process on Signal.TERM on Windows
 

 Key: MAPREDUCE-5330
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5330
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
 Attachments: MAPREDUCE-5330.patch


 In MapReduce, we sometimes kill a task's JVM before it naturally shuts down 
 if we want to launch other tasks (look in 
 JvmManager$JvmManagerForType.reapJvm). This behavior means that if the map 
 task process is in the middle of doing some cleanup/finalization after the 
 task is done, it might be interrupted/killed without being given a chance to finish. 
 In Microsoft's Hadoop Service, after a Map/Reduce task is done and during 
 closing file systems in a special shutdown hook, we're typically uploading 
 storage (ASV in our context) usage metrics to Microsoft Azure Tables. So if 
 this kill happens these metrics get lost. The impact is that for many MR jobs 
 we don't see accurate metrics reported most of the time.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-06-15 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13684565#comment-13684565
 ] 

Ivan Mitic commented on MAPREDUCE-5224:
---

I already +1ed on the latest patch, will commit shortly.

 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
 MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
   if (fs == null) {
     fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
       public FileSystem run() throws IOException {
         return FileSystem.get(conf);
       }});
   }
   ...
   public String getSystemDir() {
     Path sysDir = new Path(conf.get("mapred.system.dir",
         "/tmp/hadoop/mapred/system"));
     return fs.makeQualified(sysDir).toString();
   }
 In a cloud environment like Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory 
 to be in DFS. We should change JobTracker to allow that.
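
Why the snippet above ties the system directory to the default filesystem can be sketched without Hadoop on the classpath. This is a simplified stand-in, not JobTracker or FileSystem code: the class name QualifySketch, the container name and the host name are hypothetical, and the "Wrong FS" rejection is modelled on how FileSystem.makeQualified is understood to behave, which should be treated as an assumption here.

```java
import java.net.URI;

/** Illustrative only: qualifying a directory against one fixed filesystem. */
final class QualifySketch {
    static String qualify(String dir, URI fs) {
        URI uri = URI.create(dir);
        if (uri.getScheme() == null) {
            // No scheme: the directory lands on whatever filesystem `fs` is.
            return fs.resolve(uri).toString();
        }
        if (!uri.getScheme().equalsIgnoreCase(fs.getScheme())) {
            // A directory on another filesystem cannot be expressed at all.
            throw new IllegalArgumentException(
                "Wrong FS: " + dir + ", expected: " + fs);
        }
        return uri.toString();
    }
}
```

With fs fixed to asv://mycontainer/, the default /tmp/hadoop/mapred/system becomes asv://mycontainer/tmp/hadoop/mapred/system, and an hdfs:// system directory is rejected. Allowing the second case means deriving the filesystem from the configured directory rather than from the default.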



[jira] [Resolved] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-06-15 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic resolved MAPREDUCE-5224.
---

  Resolution: Fixed
Target Version/s: 1-win
Hadoop Flags: Reviewed

Fix committed to branch-1-win. Thank you Xi for the contribution!

 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
 MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
   if (fs == null) {
     fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
       public FileSystem run() throws IOException {
         return FileSystem.get(conf);
       }});
   }
   ...
   public String getSystemDir() {
     Path sysDir = new Path(conf.get("mapred.system.dir",
         "/tmp/hadoop/mapred/system"));
     return fs.makeQualified(sysDir).toString();
   }
 In a cloud environment like Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory 
 to be in DFS. We should change JobTracker to allow that.



[jira] [Updated] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-06-15 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5224:
--

Affects Version/s: 1-win

 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1-win
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
 MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
   if (fs == null) {
     fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
       public FileSystem run() throws IOException {
         return FileSystem.get(conf);
       }});
   }
   ...
   public String getSystemDir() {
     Path sysDir = new Path(conf.get("mapred.system.dir",
         "/tmp/hadoop/mapred/system"));
     return fs.makeQualified(sysDir).toString();
   }
 In a cloud environment like Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory 
 to be in DFS. We should change JobTracker to allow that.



[jira] [Commented] (MAPREDUCE-5259) TestTaskLog fails on Windows because of path separators missmatch

2013-06-13 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13682389#comment-13682389
 ] 

Ivan Mitic commented on MAPREDUCE-5259:
---

Thanks Chris for the review and commit!

 TestTaskLog fails on Windows because of path separators missmatch
 -

 Key: MAPREDUCE-5259
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5259
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: MAPREDUCE-5259.patch


 Test failure:
 {noformat}
 Running org.apache.hadoop.mapred.TestTaskLog
 Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.516 sec  
 FAILURE!
 testTaskLog(org.apache.hadoop.mapred.TestTaskLog)  Time elapsed: 409 sec   
 FAILURE!
 junit.framework.AssertionFailedError: null
   at junit.framework.Assert.fail(Assert.java:47)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at junit.framework.Assert.assertTrue(Assert.java:27)
   at org.apache.hadoop.mapred.TestTaskLog.testTaskLog(TestTaskLog.java:54)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
 {noformat}
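
The trace above does not show the failing comparison itself, so the following is only a generic sketch of the class of problem named in the title, not the attached patch. The class name PathCompare and the sample paths are hypothetical. It shows one way to make string comparisons of paths independent of the platform separator.

```java
/** Illustrative only: separator-insensitive comparison of path strings. */
final class PathCompare {
    /** Replaces backslashes with forward slashes. */
    static String normalize(String path) {
        return path.replace('\\', '/');
    }

    /** True if both strings name the same path once separators are unified. */
    static boolean samePath(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

A test that builds its expected value with a hard-coded "/" and compares it to a value produced through java.io.File would pass on Linux and fail on Windows; normalizing both sides, or building both through the same API, avoids that.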



[jira] [Commented] (MAPREDUCE-3540) saveVersion.sh script fails in windows/cygwin (hadoop-yarn-common)

2013-06-08 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678899#comment-13678899
 ] 

Ivan Mitic commented on MAPREDUCE-3540:
---

This Jira seems outdated now that Hadoop can be compiled on Windows without 
Cygwin. Should we resolve this Jira?

 saveVersion.sh script fails in windows/cygwin (hadoop-yarn-common)
 --

 Key: MAPREDUCE-3540
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3540
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, trunk
Reporter: Alejandro Abdelnur
 Fix For: 0.24.0

 Attachments: MAPREDUCE-3540-121001.patch, MAPREDUCE-3540.Nov12.patch, 
 MAPREDUCE-3540.patch


 {code}
 [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec 
 (generate-version) on project hadoop-yarn-common: Command execution failed. 
 Cannot run program scripts\saveVersion.sh (in directory 
 C:\cygwin\home\tucu\src\hadoop\hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-common): 
 CreateProcess error=2, The system cannot find the file specified - [Help 1]
 [ERROR]
 {code}



[jira] [Updated] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-06-08 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5278:
--

Summary: Distributed cache is broken when JT staging dir is not on the 
default FS  (was: Perf: Distributed cache is broken when JT staging dir is not 
on the default FS)

 Distributed cache is broken when JT staging dir is not on the default FS
 

 Key: MAPREDUCE-5278
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
 Fix For: 1-win

 Attachments: MAPREDUCE-5278.patch


 Today, the JobTracker staging dir (mapreduce.jobtracker.staging.root.dir) is 
 set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
 system and Windows ASV file system) are the default file systems.
 For ASV, this config was chosen for a few reasons:
 1. To prevent leaking the storage account credentials to the user's storage 
 account; 
 2. It uses HDFS for the transient job files, which is good for two reasons: a) 
 it does not flood the user's storage account with irrelevant data/files b) it 
 leverages HDFS locality for small files
 However, this approach conflicts with how distributed cache caching works, 
 completely negating the feature's functionality.
 When files are added to the distributed cache (through the files/archives/libjars 
 Hadoop generic options), they are copied to the job tracker staging dir only 
 if they reside on a file system different from the jobtracker's. Later on, 
 this path is used as a key to cache the files locally on the tasktracker's 
 machine, and avoid localization (download/unzip) of the distributed cache 
 files if they are already localized.
 In this configuration the caching is completely disabled and we always end up 
 copying dist cache files to the job tracker's staging dir first and 
 localizing them on the task tracker machine second.
 This is especially bad for Oozie scenarios, as Oozie uses dist cache to 
 populate Hive/Pig jars throughout the cluster.
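
The caching problem can be illustrated with a small model. This is a deliberately simplified sketch, not TaskTracker code: the class name LocalCacheSketch, the local path layout and the example URIs are hypothetical, and the real cache key includes more than the path. It shows that a cache keyed by source path only helps when the same path is seen across jobs.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a local cache keyed by the source path of each file. */
final class LocalCacheSketch {
    private final Map<String, String> localized = new HashMap<>();
    private int downloads = 0;

    /** Returns the local path for a file, downloading only on a cache miss. */
    String localize(String sourceUri) {
        return localized.computeIfAbsent(sourceUri, uri -> {
            downloads++; // stands in for the download/unzip step
            return "/local/cache/" + downloads;
        });
    }

    /** Number of times a download was actually needed. */
    int downloads() {
        return downloads;
    }
}
```

If two jobs reference the same stable location, the second call is a hit and downloads() stays at 1. If each job first copies the file into its own staging directory (for example .../staging/job_1/lib.jar and .../staging/job_2/lib.jar), every key is new and each job pays for both the copy and the localization.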



[jira] [Commented] (MAPREDUCE-5278) Distributed cache is broken when JT staging dir is not on the default FS

2013-06-08 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678901#comment-13678901
 ] 

Ivan Mitic commented on MAPREDUCE-5278:
---

Thanks Xi for posting the patch!

+1 on the proposal, I have largely reviewed this already and tested it out E2E.

A couple of additional comments below:
1.  You’ll also have to provide a trunk compatible patch for the new 
functionality
2.  
TestMRWithDistributedCache#DistributedCacheCheckerJTStagingOnNondefaultFS: I 
would add the validation that localized dist cache entries are properly added 
to the classpath (below check).
{code}
  // Check the class loaders
  LOG.info("Java Classpath: " + System.getProperty("java.class.path"));
  ClassLoader cl = Thread.currentThread().getContextClassLoader();
  // Both the file and the archive were added to classpath, so both
  // should be reachable via the class loader.
  TestCase.assertNotNull(cl.getResource("distributed.jar.inside2"));
  TestCase.assertNotNull(cl.getResource("distributed.jar.inside3"));
  TestCase.assertNull(cl.getResource("distributed.jar.inside4"));
{code}

It would be really good to get feedback on the approach from some more senior 
MR folks. 


 Distributed cache is broken when JT staging dir is not on the default FS
 

 Key: MAPREDUCE-5278
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5278
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Affects Versions: 1-win
 Environment: Windows
Reporter: Xi Fang
Assignee: Xi Fang
 Fix For: 1-win

 Attachments: MAPREDUCE-5278.patch


 Today, the JobTracker staging dir (mapreduce.jobtracker.staging.root.dir) is 
 set to point to HDFS, even though other file systems (e.g. Amazon S3 file 
 system and Windows ASV file system) are the default file systems.
 For ASV, this config was chosen for a few reasons:
 1. To prevent leaking the storage account credentials to the user's storage 
 account; 
 2. It uses HDFS for the transient job files, which is good for two reasons: a) 
 it does not flood the user's storage account with irrelevant data/files b) it 
 leverages HDFS locality for small files
 However, this approach conflicts with how distributed cache caching works, 
 completely negating the feature's functionality.
 When files are added to the distributed cache (through the files/archives/libjars 
 Hadoop generic options), they are copied to the job tracker staging dir only 
 if they reside on a file system different from the jobtracker's. Later on, 
 this path is used as a key to cache the files locally on the tasktracker's 
 machine, and avoid localization (download/unzip) of the distributed cache 
 files if they are already localized.
 In this configuration the caching is completely disabled and we always end up 
 copying dist cache files to the job tracker's staging dir first and 
 localizing them on the task tracker machine second.
 This is especially bad for Oozie scenarios, as Oozie uses dist cache to 
 populate Hive/Pig jars throughout the cluster.



[jira] [Resolved] (MAPREDUCE-5277) Job history completed location cannot be on a file system other than default

2013-06-03 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic resolved MAPREDUCE-5277.
---

Resolution: Duplicate

Just realized that this is a dupe of MAPREDUCE-2351. Will reopen the original 
Jira and attach the branch-1 compatible backport patch.

 Job history completed location cannot be on a file system other than default
 

 Key: MAPREDUCE-5277
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5277
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 1-win
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5277.branch-1-win.patch


 mapred.job.tracker.history.completed.location should be configurable to a 
 location on any available file system. This can come in handy for cases where 
 HDFS is not the only file system in use. 



[jira] [Commented] (MAPREDUCE-3540) saveVersion.sh script fails in windows/cygwin (hadoop-yarn-common)

2013-05-28 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13668457#comment-13668457
 ] 

Ivan Mitic commented on MAPREDUCE-3540:
---

Hi Anoop, you no longer need to run from a Cygwin shell to be able to compile and 
run Hadoop on Windows (this is after the HADOOP-8562 merge). Check BUILDING.txt for 
instructions on how to compile natively on Windows. 

 saveVersion.sh script fails in windows/cygwin (hadoop-yarn-common)
 --

 Key: MAPREDUCE-3540
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3540
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0, trunk
Reporter: Alejandro Abdelnur
 Fix For: 0.24.0

 Attachments: MAPREDUCE-3540-121001.patch, MAPREDUCE-3540.Nov12.patch, 
 MAPREDUCE-3540.patch


 {code}
 [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec 
 (generate-version) on project hadoop-yarn-common: Command execution failed. 
 Cannot run program scripts\saveVersion.sh (in directory 
 C:\cygwin\home\tucu\src\hadoop\hadoop-mapreduce-project\hadoop-yarn\hadoop-yarn-common): 
 CreateProcess error=2, The system cannot find the file specified - [Help 1]
 [ERROR]
 {code}



[jira] [Created] (MAPREDUCE-5277) Job history completed location cannot be on a file system other than default

2013-05-27 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-5277:
-

 Summary: Job history completed location cannot be on a file system 
other than default
 Key: MAPREDUCE-5277
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5277
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 1-win
Reporter: Ivan Mitic
Assignee: Ivan Mitic


mapred.job.tracker.history.completed.location should be configurable to a 
location on any available file system. This can come in handy for cases where HDFS 
is not the only file system in use. 



[jira] [Updated] (MAPREDUCE-5277) Job history completed location cannot be on a file system other than default

2013-05-27 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5277:
--

Attachment: MAPREDUCE-5277.branch-1-win.patch

Attaching the patch.

 Job history completed location cannot be on a file system other than default
 

 Key: MAPREDUCE-5277
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5277
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 1-win
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5277.branch-1-win.patch


 mapred.job.tracker.history.completed.location should be configurable to a 
 location on any available file system. This can come in handy for cases where 
 HDFS is not the only file system in use. 



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-27 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13667999#comment-13667999
 ] 

Ivan Mitic commented on MAPREDUCE-5224:
---

Thanks Xi for taking time to address all comments! Latest patch looks good to 
me, +1

bq. There is no need to use the default file system for the jobhistory. There 
is another (orthogonal) bug here. Job history completed location also assumes 
the default FS, which is not correct. This should be a separate Jira.
I filed a Jira on this: MAPREDUCE-5277


 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
 MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
 if (fs == null) {
   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
     public FileSystem run() throws IOException {
       return FileSystem.get(conf);
   }});
 }
 ...
 public String getSystemDir() {
   Path sysDir = new Path(conf.get("mapred.system.dir", 
       "/tmp/hadoop/mapred/system"));  
   return fs.makeQualified(sysDir).toString();
 }
 In a cloud environment like Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory to 
 be in DFS. We should change JobTracker to allow that.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-27 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13668000#comment-13668000
 ] 

Ivan Mitic commented on MAPREDUCE-5224:
---

PS. I verified that the new test passes on Linux and on Windows.

 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
 MAPREDUCE-5224.4.patch, MAPREDUCE-5224.5.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
 if (fs == null) {
   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
     public FileSystem run() throws IOException {
       return FileSystem.get(conf);
   }});
 }
 ...
 public String getSystemDir() {
   Path sysDir = new Path(conf.get("mapred.system.dir", 
       "/tmp/hadoop/mapred/system"));  
   return fs.makeQualified(sysDir).toString();
 }
 In a cloud environment like Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory to 
 be in DFS. We should change JobTracker to allow that.



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-23 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13665677#comment-13665677
 ] 

Ivan Mitic commented on MAPREDUCE-5224:
---

Thanks Xi, you're almost there! A few additional comments below. Once you 
address those, +1 from me

1. You'll have to impersonate the MR owner when you're querying for the 
systemDirFs (same routine as with defaultFs). Sorry, I missed this when I 
initially reviewed the patch.
After:
{code}
if (defaultFs == null) {
  defaultFs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
    public FileSystem run() throws IOException {
      return FileSystem.get(conf);
  }});
}
{code}
add the following:
{code}
if (systemDirFs == null) {
  systemDirFs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
    public FileSystem run() throws IOException {
      Path sysDir = new Path(conf.get("mapred.system.dir",
          "/tmp/hadoop/mapred/system"));
      return FileSystem.get(sysDir.toUri(), conf);
  }});
}
{code}

Once you implement the above, you should be able to simplify getSystemDir() by 
assuming that systemDirFs is non-null. This will allow you to get rid of the 
IOException (#2 comment from my initial review).

2. Nit: JobTracker.java: It seems that a tab slipped in:
{code}
  if (systemDirFs.exists(restartFile)) {
systemDirFs.delete(tmpRestartFile, false); // delete the tmp file
  } else if (systemDirFs.exists(tmpRestartFile)) {
{code}
I also see some invalid indentation:
{code}
// disable recovery if this is a restart
  shouldRecover = false;
{code}
Please correct.

3. Please remove try/catch from TestJobTrackerWithNonDefaultFS#tearDown since 
it can possibly mask a problem. Instead you can add IOException to the throws 
clause of the method:
{code}
public void tearDown() throws IOException {
{code}

4. TestJobTrackerWithNonDefaultFS#testSystemDir: No need for the try/catch 
block in the test, please remove. The test will fail if any of its asserts fail.

5. Can you also please change TestJobTrackerWithNonDefaultFS#MAPRED_SYS_DIR to 
the following:
{code}
  private final String MAPRED_SYS_DIR =
      System.getProperty("test.build.data", "/tmp") + "/mapred/system";
{code}
The guideline is for all local test files to go under the test.build.data folder.

6. Nit: TestJobTrackerWithNonDefaultFS: You can use assertTrue instead:
{code}
assertEquals("Check if the system dir exists", 
    FileSystem.get(sysDirPathURL, conf).exists(sysDirPath), true);
{code}
Btw, when you’re using assertEquals, you should place the expected value as the 
first arg, and the test value as the second arg. For example, 
assertEquals(true, fs.get().exists()).
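
The argument-order point in comment 6 matters mostly for the failure message. The sketch below is a hypothetical stand-in, not JUnit itself: it only mimics the "expected:<...> but was:<...>" wording that JUnit's assertEquals uses, to show how swapped arguments produce a misleading report.

```java
final class AssertOrderSketch {
    // Stand-in that mimics the message layout of JUnit's assertEquals.
    static String failureMessage(String message, Object expected, Object actual) {
        return message + " expected:<" + expected + "> but was:<" + actual + ">";
    }

    public static void main(String[] args) {
        boolean exists = false; // pretend the system dir was not created

        // Expected value first: the report describes the real problem.
        System.out.println(failureMessage("system dir exists", true, exists));

        // Arguments swapped: the report claims that 'false' was expected.
        System.out.println(failureMessage("system dir exists", exists, true));
    }
}
```

With assertTrue the question does not arise at all, which is one more reason to prefer it for boolean checks.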


 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.3.patch, 
 MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
 if (fs == null) {
   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
     public FileSystem run() throws IOException {
       return FileSystem.get(conf);
   }});
 }
 ...
 public String getSystemDir() {
   Path sysDir = new Path(conf.get("mapred.system.dir", 
       "/tmp/hadoop/mapred/system"));  
   return fs.makeQualified(sysDir).toString();
 }
 In a cloud environment like Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory to 
 be in DFS. We should change JobTracker to allow that.



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-20 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13662407#comment-13662407
 ] 

Ivan Mitic commented on MAPREDUCE-5191:
---

Thanks Hitesh!

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-20 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13662675#comment-13662675
 ] 

Ivan Mitic commented on MAPREDUCE-5224:
---

Thanks Xi for addressing the comments!

bq. For getFilesystemName(), what does fs stand for in this context, default fs 
or systemDir's file system. I guess it denotes the latter one. Right?
Right, I also see it as the systemDir's file system.

 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
 if (fs == null) {
   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
     public FileSystem run() throws IOException {
       return FileSystem.get(conf);
   }});
 }
 ...
 public String getSystemDir() {
   Path sysDir = new Path(conf.get("mapred.system.dir", 
       "/tmp/hadoop/mapred/system"));  
   return fs.makeQualified(sysDir).toString();
 }
 In a cloud environment like Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory to 
 be in DFS. We should change JobTracker to allow that.



[jira] [Created] (MAPREDUCE-5259) TestTaskLog fails on Windows because of path separators missmatch

2013-05-19 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-5259:
-

 Summary: TestTaskLog fails on Windows because of path separators 
missmatch
 Key: MAPREDUCE-5259
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5259
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic


Test failure:
{noformat}
Running org.apache.hadoop.mapred.TestTaskLog
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.516 sec  
FAILURE!
testTaskLog(org.apache.hadoop.mapred.TestTaskLog)  Time elapsed: 409 sec   
FAILURE!
junit.framework.AssertionFailedError: null
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.assertTrue(Assert.java:20)
at junit.framework.Assert.assertTrue(Assert.java:27)
at org.apache.hadoop.mapred.TestTaskLog.testTaskLog(TestTaskLog.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
{noformat}



[jira] [Updated] (MAPREDUCE-5259) TestTaskLog fails on Windows because of path separators missmatch

2013-05-19 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5259:
--

Attachment: MAPREDUCE-5259.patch

Attaching the patch.

The fix is to use File.separatorChar instead of the hardcoded Unix file 
separator ('/').
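
A minimal illustration of the idea, using hypothetical names rather than the actual TestTaskLog code: a path assembled with File.separatorChar matches what java.io.File reports on both Unix and Windows, whereas a hardcoded '/' only matches where '/' is the native separator.

```java
import java.io.File;

final class SeparatorSketch {
    // Builds the path the same way java.io.File does on the current platform.
    static String joinPortable(String parent, String child) {
        return parent + File.separatorChar + child;
    }

    // Hardcoded Unix separator; equals File#getPath() only where '/' is native.
    static String joinUnixOnly(String parent, String child) {
        return parent + '/' + child;
    }

    public static void main(String[] args) {
        String actual = new File("userlogs", "attempt_0").getPath();
        System.out.println(joinPortable("userlogs", "attempt_0").equals(actual));
        System.out.println(joinUnixOnly("userlogs", "attempt_0").equals(actual));
    }
}
```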

 TestTaskLog fails on Windows because of path separators missmatch
 -

 Key: MAPREDUCE-5259
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5259
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5259.patch


 Test failure:
 {noformat}
 Running org.apache.hadoop.mapred.TestTaskLog
 Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.516 sec  
 FAILURE!
 testTaskLog(org.apache.hadoop.mapred.TestTaskLog)  Time elapsed: 409 sec   
 FAILURE!
 junit.framework.AssertionFailedError: null
   at junit.framework.Assert.fail(Assert.java:47)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at junit.framework.Assert.assertTrue(Assert.java:27)
   at org.apache.hadoop.mapred.TestTaskLog.testTaskLog(TestTaskLog.java:54)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
 {noformat}



[jira] [Updated] (MAPREDUCE-5259) TestTaskLog fails on Windows because of path separators missmatch

2013-05-19 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5259:
--

Status: Patch Available  (was: Open)

 TestTaskLog fails on Windows because of path separators missmatch
 -

 Key: MAPREDUCE-5259
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5259
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5259.patch


 Test failure:
 {noformat}
 Running org.apache.hadoop.mapred.TestTaskLog
 Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.516 sec  
 FAILURE!
 testTaskLog(org.apache.hadoop.mapred.TestTaskLog)  Time elapsed: 409 sec   
 FAILURE!
 junit.framework.AssertionFailedError: null
   at junit.framework.Assert.fail(Assert.java:47)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at junit.framework.Assert.assertTrue(Assert.java:27)
   at org.apache.hadoop.mapred.TestTaskLog.testTaskLog(TestTaskLog.java:54)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
 {noformat}



[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-18 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661425#comment-13661425
 ] 

Ivan Mitic commented on MAPREDUCE-5191:
---

bq. How about just creating a file under target/ with the name of the test as 
filename?
Thanks Hitesh, that totally makes sense. I will attach the updated patch.
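
One way to read the suggestion is sketched below, under stated assumptions: the directory argument, the ".xml" suffix, and the helper names are hypothetical and not taken from the patch. The point is that a fixed file name derived from the test name avoids File.createTempFile, which in the stack trace for this issue blocks while SecureRandom generates a seed.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TestFileSketch {
    // Writes contents to <dir>/<testName>.xml, creating the directory if needed.
    static File writeFile(File dir, String testName, String contents) {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new UncheckedIOException(new IOException("Cannot create " + dir));
        }
        File file = new File(dir, testName + ".xml");
        try (FileWriter writer = new FileWriter(file)) {
            writer.write(contents);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file;
    }

    public static void main(String[] args) {
        File file = writeFile(new File("target"), "testQueue", "<queues/>");
        System.out.println(file.getPath());
    }
}
```

A deterministic name also means a rerun overwrites the previous file instead of accumulating temp files in the build directory.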

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-18 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5191:
--

Attachment: MAPREDUCE-5191.3.patch

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-05-18 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5191:
--

Status: Patch Available  (was: Open)

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.3.patch, 
 MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Commented] (MAPREDUCE-5224) JobTracker should allow the system directory to be in non-default FS

2013-05-17 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661272#comment-13661272
 ] 

Ivan Mitic commented on MAPREDUCE-5224:
---

Thanks Xi for the patch! I think this is close; I have some comments below 
that should be easy to address. 

1. JobTracker.java: Lines should not exceed 80 chars per Hadoop coding 
guidelines.
{code}
FSDataOutputStream out = FileSystem.create(systemDirFs, tmpRestartFile, 
filePerm);
{code}
Same comment for other changes in the patch.

2. I really don't think it is necessary to introduce IOException to so many 
methods and interfaces for the reasons in this Jira. I would change 
JobTracker#getSystemDir() to fall back to the default value and log a warning 
in case {{FileSystem.get()}} throws.

3. Should JobTracker#RecoveryManager#checkAndAddJob() use systemDirFs?

4. I would rename the {{JobTracker#fs}} local member to {{defaultFs}} to 
signify its meaning and avoid possible confusion in the future. I actually 
don’t think you need to keep both defaultFs and systemDirFs as members. The 
only other place where you need defaultFs is {{JobHistory#initDone}} and you 
should be able to query for it locally. 

5. Let's rename the test to TestJobTrackerWithNonDefaultFS

6. What is the expected behavior for TestSysDirOnNonDefaultFS when your code 
changes are not applied? Looks like the setUp step is failing. I would prefer 
if we could have the test case fail instead. 

7. TestSysDirOnNonDefaultFS.java: Please add a more verbose comment on the 
intent of the test.
{code}
/**
 * Class to test jobtracker's system dir
 */
{code}

8. TestSysDirOnNonDefaultFS.java: Why not let setUp throw the IOException in 
case of an error?

9. TestSysDirOnNonDefaultFS.java: Please use JUnit assertEquals method to 
validate that the expected and the retrieved values are equal.

10. TestSysDirOnNonDefaultFS.java: Can we also add validation that the mapred 
system dir is created in the right place by checking for its existence? 

11. It would be good to understand whether any changes are needed to get the 
equivalent functionality in YARN. I would be fine with addressing this via a 
separate Jira.
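
Comment 2 above suggests falling back to the default value with a warning instead of propagating IOException. A minimal sketch of that pattern follows; the Resolver interface and method names are hypothetical, and java.util.logging stands in for whatever logger JobTracker actually uses.

```java
import java.io.IOException;
import java.util.logging.Logger;

final class SystemDirFallbackSketch {
    private static final Logger LOG =
        Logger.getLogger(SystemDirFallbackSketch.class.getName());

    // Hypothetical stand-in for a lookup that may fail, such as FileSystem.get().
    interface Resolver {
        String resolve() throws IOException;
    }

    // Returns the resolved value, or the fallback (with a warning) on failure,
    // so that callers do not need an IOException in their signatures.
    static String resolveOrDefault(Resolver resolver, String fallback) {
        try {
            return resolver.resolve();
        } catch (IOException e) {
            LOG.warning("Falling back to " + fallback + ": " + e.getMessage());
            return fallback;
        }
    }

    public static void main(String[] args) {
        String dir = resolveOrDefault(
            () -> { throw new IOException("file system unavailable"); },
            "/tmp/hadoop/mapred/system");
        System.out.println(dir);
    }
}
```

The trade-off is that a misconfigured file system is reported only in the log, so the warning text should name both the failure and the value being used instead.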



 JobTracker should allow the system directory to be in non-default FS
 

 Key: MAPREDUCE-5224
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5224
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Xi Fang
Assignee: Xi Fang
Priority: Minor
 Fix For: 1-win

 Attachments: MAPREDUCE-5224.2.patch, MAPREDUCE-5224.patch


 JobTracker today expects the system directory to be in the default file 
 system:
 if (fs == null) {
   fs = mrOwner.doAs(new PrivilegedExceptionAction<FileSystem>() {
     public FileSystem run() throws IOException {
       return FileSystem.get(conf);
   }});
 }
 ...
 public String getSystemDir() {
   Path sysDir = new Path(conf.get("mapred.system.dir", 
       "/tmp/hadoop/mapred/system"));  
   return fs.makeQualified(sysDir).toString();
 }
 In a cloud environment like Azure, the default file system is set to ASV 
 (Windows Azure Blob Storage), but we would still like the system directory to 
 be in DFS. We should change JobTracker to allow that.



[jira] [Commented] (MAPREDUCE-5210) Job submission has strict permission validation

2013-05-07 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13651574#comment-13651574
 ] 

Ivan Mitic commented on MAPREDUCE-5210:
---

bq. FileSystem should have an API to check ownership.
I like the idea of exposing a FileSystem API for checking the ownership, 
something like {{FileSystem#isOwnedByUser(String username…)}}. We had a problem 
with this check on Windows with many tests that use the local file system. 
Check out HADOOP-8457 to see what we did in branch-1-win.

Just for completeness :), another, 3rd option is to have S3 implement 
setPermissions/setOwner FileSystem APIs. We ended up doing this with our Azure 
FileSystem implementation to be able to run MR on top of it.
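
To make the first option concrete, here is a hedged sketch of what an ownership-check hook could look like. None of these types exist in Hadoop: OwnershipAwareStore, ownerOf, and isOwnedByUser are hypothetical names, used only to show how a store without an ownership concept could answer the check itself instead of failing a string comparison in the job client.

```java
final class OwnershipSketch {
    // Hypothetical abstraction; not a Hadoop interface.
    interface OwnershipAwareStore {
        String ownerOf(String path);

        // Default behaviour: compare the recorded owner with the user.
        default boolean isOwnedByUser(String path, String user) {
            return user.equals(ownerOf(path));
        }
    }

    // A store that records owners, in the style of HDFS.
    static final class OwningStore implements OwnershipAwareStore {
        private final String owner;
        OwningStore(String owner) { this.owner = owner; }
        @Override public String ownerOf(String path) { return owner; }
    }

    // A store with no ownership concept, in the style of S3: every user passes.
    static final class PermissionlessStore implements OwnershipAwareStore {
        @Override public String ownerOf(String path) { return ""; }
        @Override public boolean isOwnedByUser(String path, String user) { return true; }
    }

    public static void main(String[] args) {
        System.out.println(new OwningStore("alice").isOwnedByUser("/staging", "bob"));
        System.out.println(new PermissionlessStore().isOwnedByUser("/staging", "bob"));
    }
}
```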


 Job submission has strict permission validation
 ---

 Key: MAPREDUCE-5210
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5210
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Amareshwari Sriramadasu
Assignee: samar

 The following code in JobSubmissionFiles.java mandates strict permission on 
 job submission:
 {noformat}
 if (fs.exists(stagingArea)) {
   FileStatus fsStatus = fs.getFileStatus(stagingArea);
   String owner = fsStatus.getOwner();
   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
      throw new IOException("The ownership on the staging directory " +
                   stagingArea + " is not as expected. " + 
                   "It is owned by " + owner + ". The directory must " +
                   "be owned by the submitter " + currentUser + " or " +
                   "by " + realUser);
   }
 {noformat}
 For file systems such as S3, which do not have a permission concept, a user 
 can never submit a job with the staging area in S3. 



[jira] [Commented] (MAPREDUCE-50) NPE in heartbeat when the configured topology script doesn't exist

2013-05-01 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646670#comment-13646670
 ] 

Ivan Mitic commented on MAPREDUCE-50:
-

Hi Steve, Vinod,

I've run into a problem similar to this one. In my case, JobTracker started 
failing jobs because the network topology resolution started failing for a 
single node in the cluster:
{code}
2013-04-27 08:33:08,204 ERROR org.apache.hadoop.mapred.JobTracker: Job 
initialization failed:
java.lang.NullPointerException
at 
org.apache.hadoop.mapred.JobTracker.resolveAndAddToTopology(JobTracker.java:3205)
at 
org.apache.hadoop.mapred.JobInProgress.createCache(JobInProgress.java:550)
at 
org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:734)
at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4214)
at 
org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
{code}

What happens is that some input split blocks are located on the datanode with 
the same IP/hostname as the TT. As a side effect, this causes many of the 
customer jobs to fail during initialization.

The NN, on the other hand, has fallback logic that defaults to /default-rack, 
and this inconsistency actually makes the problem more severe :)
{code}
2013-04-27 04:36:47,185 ERROR 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: The resolve call returned 
null! Using /default-rack for host [100.64.34.3]
2013-04-27 04:36:47,185 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/100.64.34.3:50010  
{code}

In terms of the fix, my proposal would be to add the same fallback logic to the 
JobTracker. In our case, we actually had a network topology script that worked 
fine for a year or so, and now started failing for a single node for a reason 
we cannot explain yet.
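
The proposed fallback could look roughly like the sketch below. It is JDK-only and hypothetical: the resolveRack helper and the Function-based resolver are stand-ins for the JobTracker's topology resolution, and only the /default-rack value is taken from the NameNode log above.

```java
import java.util.function.Function;

final class RackFallbackSketch {
    static final String DEFAULT_RACK = "/default-rack";

    // Returns the resolved rack, or DEFAULT_RACK when resolution yields null,
    // instead of letting the null propagate into a NullPointerException.
    static String resolveRack(Function<String, String> resolver, String host) {
        String rack = resolver.apply(host);
        return rack != null ? rack : DEFAULT_RACK;
    }

    public static void main(String[] args) {
        // A resolver that fails for exactly one host, as in the report above.
        Function<String, String> script =
            host -> host.equals("100.64.34.3") ? null : "/rack1";
        System.out.println(resolveRack(script, "100.64.34.3"));
        System.out.println(resolveRack(script, "100.64.34.4"));
    }
}
```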

Let me know what you think. I'll take up this Jira if you don't mind.

 NPE in heartbeat when the configured topology script doesn't exist
 --

 Key: MAPREDUCE-50
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-50
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.0.3
Reporter: Vinod Kumar Vavilapalli



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (MAPREDUCE-50) NPE in heartbeat when the configured topology script doesn't exist

2013-05-01 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic reassigned MAPREDUCE-50:
---

Assignee: Ivan Mitic

 NPE in heartbeat when the configured topology script doesn't exist
 --

 Key: MAPREDUCE-50
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-50
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.0.3
Reporter: Vinod Kumar Vavilapalli
Assignee: Ivan Mitic





[jira] [Commented] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-04-29 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644662#comment-13644662
 ] 

Ivan Mitic commented on MAPREDUCE-5191:
---

bq. Is the increased timeout meant to go on testQueue instead of test2Queue?
That's absolutely right... I overlooked that, thanks Chris!

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-04-29 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5191:
--

Attachment: MAPREDUCE-5191.2.patch

Attaching the updated patch. 

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.2.patch, MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5177) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute

2013-04-29 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5177:
--

Status: Patch Available  (was: Open)

 Move to common utils FileUtil#setReadable/Writable/Executable and 
 FileUtil#canRead/Write/Execute
 

 Key: MAPREDUCE-5177
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5177
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5177.commonutils.2.patch, 
 MAPREDUCE-5177.commonutils.patch


 Move to using common utils described in HADOOP-9413 that work well 
 cross-platform.



[jira] [Updated] (MAPREDUCE-5177) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute

2013-04-28 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5177:
--

Attachment: MAPREDUCE-5177.commonutils.2.patch

Attaching the updated patch. Should be good now :)

 Move to common utils FileUtil#setReadable/Writable/Executable and 
 FileUtil#canRead/Write/Execute
 

 Key: MAPREDUCE-5177
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5177
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5177.commonutils.2.patch, 
 MAPREDUCE-5177.commonutils.patch


 Move to using common utils described in HADOOP-9413 that work well 
 cross-platform.



[jira] [Created] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-04-28 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-5191:
-

 Summary: TestQueue#testQueue fails with timeout on Windows
 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic


Test times out on my machine after 5 seconds always on the below stack:

{code}
testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
ERROR!
java.lang.Exception: test timed out after 5000 milliseconds
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at 
sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
at 
sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
at 
sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
at 
sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
at 
sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
at java.security.SecureRandom.next(SecureRandom.java:455)
at java.util.Random.nextLong(Random.java:284)
at java.io.File.generateFile(File.java:1682)
at java.io.File.createTempFile(File.java:1791)
at java.io.File.createTempFile(File.java:1828)
at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
{code} 




[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-04-28 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5191:
--

Attachment: MAPREDUCE-5191.patch

Attaching the patch. The fix is to increase the timeout from 5 to 10 seconds. 

First, I timed the call to {{File.createTempFile}} and it was ~5 seconds 
consistently on my box. After that, I looked this up online, and it turned out 
to be a 
[known issue|http://stackoverflow.com/questions/2608763/why-does-first-call-to-java-io-file-createtempfilestring-string-file-take-5-se].
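
A minimal way to reproduce the measurement is sketched below. The class and method names are hypothetical and this is not the code in the patch; it only times {{File.createTempFile}} with the standard library. On an affected machine the first call would be expected to be the slow one, since the stack in the description shows the time being spent in SecureRandom seeding, but the actual numbers depend on the JDK and the box.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical timing helper for illustration only. */
final class TempFileTiming {
    /** Returns the wall-clock time, in milliseconds, of one createTempFile call. */
    static long timeCreateTempFileMillis() {
        long start = System.nanoTime();
        File f;
        try {
            f = File.createTempFile("timing", ".tmp");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        f.delete(); // best-effort cleanup; the result is not important here
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Compare the first call with a later one in the same JVM.
        System.out.println("first call:  " + timeCreateTempFileMillis() + " ms");
        System.out.println("second call: " + timeCreateTempFileMillis() + " ms");
    }
}
```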
 


 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5191) TestQueue#testQueue fails with timeout on Windows

2013-04-28 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5191:
--

Status: Patch Available  (was: Open)

 TestQueue#testQueue fails with timeout on Windows
 -

 Key: MAPREDUCE-5191
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5191
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5191.patch


 Test times out on my machine after 5 seconds always on the below stack:
 {code}
 testQueue(org.apache.hadoop.mapred.TestQueue)  Time elapsed: 5009 sec   
 ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:485)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedByte(SeedGenerator.java:330)
   at 
 sun.security.provider.SeedGenerator$ThreadedSeedGenerator.getSeedBytes(SeedGenerator.java:319)
   at 
 sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:117)
   at 
 sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
   at 
 sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
   at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
   at java.security.SecureRandom.next(SecureRandom.java:455)
   at java.util.Random.nextLong(Random.java:284)
   at java.io.File.generateFile(File.java:1682)
   at java.io.File.createTempFile(File.java:1791)
   at java.io.File.createTempFile(File.java:1828)
   at org.apache.hadoop.mapred.TestQueue.writeFile(TestQueue.java:221)
   at org.apache.hadoop.mapred.TestQueue.testQueue(TestQueue.java:53)
 {code} 



[jira] [Updated] (MAPREDUCE-5177) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute

2013-04-24 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5177:
--

Attachment: MAPREDUCE-5177.commonutils.patch

Attaching the patch.

 Move to common utils FileUtil#setReadable/Writable/Executable and 
 FileUtil#canRead/Write/Execute
 

 Key: MAPREDUCE-5177
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5177
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5177.commonutils.patch


 Move to using common utils described in HADOOP-9413 that work well 
 cross-platform.



[jira] [Created] (MAPREDUCE-5177) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute

2013-04-23 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-5177:
-

 Summary: Move to common utils 
FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute
 Key: MAPREDUCE-5177
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5177
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic


Move to using common utils described in HADOOP-9413 that work well 
cross-platform.



[jira] [Commented] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-04-14 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631430#comment-13631430
 ] 

Ivan Mitic commented on MAPREDUCE-5066:
---

Thanks for the review Arun! Sounds good, let me prepare the updated patch.

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.2.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.4.patch, MAPREDUCE-5066.branch-1-win.patch, 
 MAPREDUCE-5066.patch


 In current code, timeout is not specified when JobTracker (JobEndNotifier) 
 calls into the notification URL. When the given URL points to a server that 
 will not respond for a long time, job notifications are completely stuck 
 (given that we have only a single thread processing all notifications). We've 
 seen this cause noticeable delays in job execution in components that rely on 
 job end notifications (like Oozie workflows). 
 I propose we introduce a configurable timeout option and set a default to a 
 reasonably small value.
 If we want, we can also introduce a configurable number of workers processing 
 the notification queue (not sure if this is needed though at this point).
 I will prepare a patch soon. Please comment back.
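
 As a rough sketch of the idea, and not the contents of any attached patch, a notification call with a bounded wait could look like the following. The class and method names are hypothetical, and the timeout value would come from the proposed configurable option; only the standard {{java.net}} calls are real.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical sketch for illustration only; not the JobEndNotifier code. */
final class JobEndNotificationSketch {
    /** Opens (without connecting yet) an HTTP connection with both timeouts set. */
    static HttpURLConnection openWithTimeout(String url, int timeoutMs) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMs); // bound the TCP connect
            conn.setReadTimeout(timeoutMs);    // bound the wait for a response
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends one notification; returns true only on an HTTP 200 response. */
    static boolean notifyUrl(String url, int timeoutMs) {
        HttpURLConnection conn = openWithTimeout(url, timeoutMs);
        try {
            return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
        } catch (IOException e) {
            // Covers SocketTimeoutException too: give up on this attempt
            // instead of blocking the single notification thread.
            return false;
        } finally {
            conn.disconnect();
        }
    }
}
```

 With both timeouts set, an unresponsive server costs at most a bounded wait per attempt, so the remaining notifications in the queue are not held up indefinitely.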



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-04-14 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.branch-1-win.5.patch

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.2.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.4.patch, MAPREDUCE-5066.branch-1-win.5.patch, 
 MAPREDUCE-5066.branch-1-win.patch, MAPREDUCE-5066.patch


 In current code, timeout is not specified when JobTracker (JobEndNotifier) 
 calls into the notification URL. When the given URL points to a server that 
 will not respond for a long time, job notifications are completely stuck 
 (given that we have only a single thread processing all notifications). We've 
 seen this cause noticeable delays in job execution in components that rely on 
 job end notifications (like Oozie workflows). 
 I propose we introduce a configurable timeout option and set a default to a 
 reasonably small value.
 If we want, we can also introduce a configurable number of workers processing 
 the notification queue (not sure if this is needed though at this point).
 I will prepare a patch soon. Please comment back.



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-04-14 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.3.patch

Attaching updated patches. Arun, let me know if this looks good.

Thanks

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.2.patch, MAPREDUCE-5066.3.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.4.patch, MAPREDUCE-5066.branch-1-win.5.patch, 
 MAPREDUCE-5066.branch-1-win.patch, MAPREDUCE-5066.patch


 In current code, timeout is not specified when JobTracker (JobEndNotifier) 
 calls into the notification URL. When the given URL points to a server that 
 will not respond for a long time, job notifications are completely stuck 
 (given that we have only a single thread processing all notifications). We've 
 seen this cause noticeable delays in job execution in components that rely on 
 job end notifications (like Oozie workflows). 
 I propose we introduce a configurable timeout option and set a default to a 
 reasonably small value.
 If we want, we can also introduce a configurable number of workers processing 
 the notification queue (not sure if this is needed though at this point).
 I will prepare a patch soon. Please comment back.



[jira] [Updated] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-04-09 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5056:
--

Resolution: Not A Problem
Status: Resolved  (was: Patch Available)

ProcfsBasedProcessTree was recently removed from the mapreduce project via 
MAPREDUCE-5077. Resolving this Jira as not a problem. 

Yarn's TestProcfsBasedProcessTree already passes on Windows.

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.2.patch, 
 MAPREDUCE-5056.trunk.3.patch, MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  
 FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)
   Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a 
 proper header
   at junit.framework.Assert.fail(Assert.java:47)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at 
 org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree.testProcessTreeDump(TestProcfsBasedProcessTree.java:564)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at junit.framework.TestCase.runTest(TestCase.java:168)
   at junit.framework.TestCase.runBare(TestCase.java:134)
   at junit.framework.TestResult$1.protect(TestResult.java:110)
   at junit.framework.TestResult.runProtected(TestResult.java:128)
   at junit.framework.TestResult.run(TestResult.java:113)
   at junit.framework.TestCase.run(TestCase.java:124)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
   at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
   at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)



[jira] [Commented] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-04-04 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13622851#comment-13622851
 ] 

Ivan Mitic commented on MAPREDUCE-5066:
---

Arun, did you get a chance to take a look at my latest branch-1/branch-2 
patches? Please check my comment above for some context.

Thx!


 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.2.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.4.patch, MAPREDUCE-5066.branch-1-win.patch, 
 MAPREDUCE-5066.patch


 In current code, timeout is not specified when JobTracker (JobEndNotifier) 
 calls into the notification URL. When the given URL points to a server that 
 will not respond for a long time, job notifications are completely stuck 
 (given that we have only a single thread processing all notifications). We've 
 seen this cause noticeable delays in job execution in components that rely on 
 job end notifications (like Oozie workflows). 
 I propose we introduce a configurable timeout option and set a default to a 
 reasonably small value.
 If we want, we can also introduce a configurable number of workers processing 
 the notification queue (not sure if this is needed though at this point).
 I will prepare a patch soon. Please comment back.



[jira] [Commented] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-04-04 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13622852#comment-13622852
 ] 

Ivan Mitic commented on MAPREDUCE-5056:
---

Bikas, can you please take a look at the latest patch? Thx!

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.2.patch, 
 MAPREDUCE-5056.trunk.3.patch, MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  
 FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)
   Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a 
 proper header
   at junit.framework.Assert.fail(Assert.java:47)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at 
 org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree.testProcessTreeDump(TestProcfsBasedProcessTree.java:564)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at junit.framework.TestCase.runTest(TestCase.java:168)
   at junit.framework.TestCase.runBare(TestCase.java:134)
   at junit.framework.TestResult$1.protect(TestResult.java:110)
   at junit.framework.TestResult.runProtected(TestResult.java:128)
   at junit.framework.TestResult.run(TestResult.java:113)
   at junit.framework.TestCase.run(TestCase.java:124)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
   at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
   at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)



[jira] [Commented] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-04-04 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13622978#comment-13622978
 ] 

Ivan Mitic commented on MAPREDUCE-5056:
---

Thanks Chris for the comment. Bikas, this also makes the test consistent with 
the implementation {{ProcfsBasedProcessTree#getProcessTreeDump}}.

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.2.patch, 
 MAPREDUCE-5056.trunk.3.patch, MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)  Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a proper header
   at junit.framework.Assert.fail(Assert.java:47)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree.testProcessTreeDump(TestProcfsBasedProcessTree.java:564)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at junit.framework.TestCase.runTest(TestCase.java:168)
   at junit.framework.TestCase.runBare(TestCase.java:134)
   at junit.framework.TestResult$1.protect(TestResult.java:110)
   at junit.framework.TestResult.runProtected(TestResult.java:128)
   at junit.framework.TestResult.run(TestResult.java:113)
   at junit.framework.TestCase.run(TestCase.java:124)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
   at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
   at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
   at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
   at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
   at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
   at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
   at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-31 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.patch
MAPREDUCE-5066.branch-1.patch
MAPREDUCE-5066.branch-1-win.3.patch

Hi Arun,

I am attaching branch-1-win/branch-1 and branch-2 compatible patches.

A few notes on the patches:
 - Fixed a test verification issue in branch-1-win.3.patch
 - branch-1 and branch-1-win patches are fully compatible (and equivalent)
 - The branch-2 codebase has changed significantly, and I made a best effort to 
find the appropriate forward patch. There are two implementations of the 
JobEndNotifier: mapred#JobEndNotifier (based on the one from branch-1 but 
simplified) and mapreduce.v2.app#JobEndNotifier (new implementation). The 
former is used in the LocalJobRunner and the latter in the MR AppMaster. In my 
patch I did the following:
*a.* Applied the bugfixes to the current state of mapred#JobEndNotifier and 
included the corresponding unit tests.
*b.* Given that mapreduce.v2.app#JobEndNotifier already sets the timeout to 
5 seconds, I did the same in mapred#JobEndNotifier. In other words, I did not 
introduce a config knob that would make the timeout configurable. My reasoning 
was that in branch-1, people might see a 5 second timeout as a regression and 
might want to change it to a different value. In trunk, given that the timeout 
is already set to 5 seconds, this should be fine until proven otherwise. Please 
advise if you think a config knob is needed.
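
For illustration only, here is a minimal Java sketch of bounding a notification call with connect and read timeouts, assuming a plain HttpURLConnection is used for the request. The class name NotificationTimeoutSketch, the open helper and the example URL are hypothetical and are not taken from the attached patches; the 5 second value simply mirrors the figure mentioned above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical sketch; this is not the JobEndNotifier code from the patches.
class NotificationTimeoutSketch {
    // Assumed value, mirroring the 5 seconds mentioned in the comment above.
    static final int TIMEOUT_MS = 5000;

    // Prepares (but does not send) a request with both timeouts set, so that
    // neither connecting nor waiting for a response can block indefinitely.
    static HttpURLConnection open(String url, int timeoutMs) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMs); // limit on establishing the connection
            conn.setReadTimeout(timeoutMs);    // limit on waiting for response data
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder URL; openConnection() does not contact the server yet.
        HttpURLConnection conn = open("http://localhost:8080/notify", TIMEOUT_MS);
        System.out.println(conn.getConnectTimeout() + " " + conn.getReadTimeout());
    }
}
```

Setting both values matters: a connect timeout alone would still let a server that accepts the connection but never answers hold the notification thread.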

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.patch, MAPREDUCE-5066.patch


 In current code, timeout is not specified when JobTracker (JobEndNotifier) 
 calls into the notification URL. When the given URL points to a server that 
 will not respond for a long time, job notifications are completely stuck 
 (given that we have only a single thread processing all notifications). We've 
 seen this cause noticeable delays in job execution in components that rely on 
 job end notifications (like Oozie workflows). 
 I propose we introduce a configurable timeout option and set a default to a 
 reasonably small value.
 If we want, we can also introduce a configurable number of workers processing 
 the notification queue (not sure if this is needed though at this point).
 I will prepare a patch soon. Please comment back.
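
As a rough illustration of the optional "configurable number of workers" idea in the description, the sketch below drains a list of notifications on a fixed-size thread pool. The class NotificationWorkersSketch, the processAll method and the 30 second wait are assumptions made for this example only; nothing here is taken from Hadoop's JobEndNotifier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of processing notifications with several workers.
class NotificationWorkersSketch {
    // Runs each notification on a pool of `workers` threads and returns
    // how many of them finished before the (assumed) 30 second wait ended.
    static int processAll(List<Runnable> notifications, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger completed = new AtomicInteger();
        for (Runnable notification : notifications) {
            pool.execute(() -> {
                notification.run();
                completed.incrementAndGet();
            });
        }
        pool.shutdown(); // accept no new work, let queued work finish
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        List<Runnable> notifications = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            notifications.add(() -> { /* stand-in for one HTTP notification */ });
        }
        System.out.println(processAll(notifications, 2));
    }
}
```

With more than one worker, a single unresponsive URL occupies one thread while the remaining workers keep draining the queue; a per-call timeout is still needed so that the stuck worker is eventually released.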



[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-31 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.patch

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.patch, MAPREDUCE-5066.patch





[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-31 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: (was: MAPREDUCE-5066.patch)

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.patch, MAPREDUCE-5066.patch





[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-31 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Status: Patch Available  (was: Open)

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 1-win, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.patch, MAPREDUCE-5066.patch





[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-31 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: (was: MAPREDUCE-5066.branch-1.patch)

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1-win.2.patch, 
 MAPREDUCE-5066.branch-1-win.3.patch, MAPREDUCE-5066.branch-1-win.patch, 
 MAPREDUCE-5066.patch





[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-31 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.2.patch
MAPREDUCE-5066.branch-1-win.4.patch

Audit warning fix: the new test file was missing the Apache license header.

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.2.patch, 
 MAPREDUCE-5066.branch-1-win.2.patch, MAPREDUCE-5066.branch-1-win.3.patch, 
 MAPREDUCE-5066.branch-1-win.4.patch, MAPREDUCE-5066.branch-1-win.patch, 
 MAPREDUCE-5066.patch





[jira] [Commented] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks

2013-03-19 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13606136#comment-13606136
 ] 

Ivan Mitic commented on MAPREDUCE-4987:
---

bq. These tests were running pretty close to the timeouts in my environment, 
even on Mac. Here is a new patch that increases the timeouts.
Thanks, I verified that the test now passes, +1 on the patch 


 TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior 
 of symlinks
 ---

 Key: MAPREDUCE-4987
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, nodemanager
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4987.1.patch, MAPREDUCE-4987.2.patch


 On Windows, {{TestMRJobs#testDistributedCache}} fails on an assertion while 
 checking the length of a symlink.  It expects to see the length of the target 
 of the symlink, but Java 6 on Windows always reports that a symlink has 
 length 0.



[jira] [Commented] (MAPREDUCE-5078) TestMRAppMaster fails on Windows due to mismatched path separators

2013-03-18 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605523#comment-13605523
 ] 

Ivan Mitic commented on MAPREDUCE-5078:
---

Thanks Chris, patch looks good, +1

 TestMRAppMaster fails on Windows due to mismatched path separators
 --

 Key: MAPREDUCE-5078
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5078
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-5078.1.patch


 The failing test is {{TestMRAppMaster#testMRAppMasterForDifferentUser}}.  
 There is an assertion about the AM staging directory, but the expected value 
 is constructed with a mix of forward and back slashes.
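
A minimal sketch of why an expected value built with mixed separators fails a string comparison, and one way to compare paths regardless of separator. The class PathSeparatorSketch, its methods and the sample paths are hypothetical; this is not necessarily how the attached patch fixes the test.

```java
// Hypothetical sketch; names and sample paths are assumptions for illustration.
class PathSeparatorSketch {
    // Converts every backslash to a forward slash.
    static String normalize(String path) {
        return path.replace('\\', '/');
    }

    // Compares two paths after normalizing their separators.
    static boolean samePath(String expected, String actual) {
        return normalize(expected).equals(normalize(actual));
    }

    public static void main(String[] args) {
        String expected = "C:\\tmp/staging\\user/.staging"; // mixed separators
        String actual = "C:/tmp/staging/user/.staging";     // forward slashes only
        System.out.println(expected.equals(actual));        // plain string comparison
        System.out.println(samePath(expected, actual));     // separator-agnostic comparison
    }
}
```

An alternative is to build both the expected and the actual value through the same path API, so that they share one separator convention from the start.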



[jira] [Commented] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks

2013-03-18 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605755#comment-13605755
 ] 

Ivan Mitic commented on MAPREDUCE-4987:
---

Thanks Chris, patch looks good overall, +1

I noticed that TestMRJobs fails with a timeout on my box. Does it consistently 
succeed for you? I see that the timeouts are set quite high (5 minutes). This 
is non-blocking; I'll take a look when I get a chance, just thought I'd ask.

TestFileUtil passes fine.

 TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior 
 of symlinks
 ---

 Key: MAPREDUCE-4987
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, nodemanager
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4987.1.patch





[jira] [Commented] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks

2013-03-18 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605803#comment-13605803
 ] 

Ivan Mitic commented on MAPREDUCE-4987:
---

bq. Is there a particular test within the suite that is timing out consistently 
for you?
I tried to remove all timeouts from the test and it is passing now. Let me find 
the exact test case that is causing problems.

 TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior 
 of symlinks
 ---

 Key: MAPREDUCE-4987
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, nodemanager
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4987.1.patch





[jira] [Commented] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks

2013-03-18 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605921#comment-13605921
 ] 

Ivan Mitic commented on MAPREDUCE-4987:
---

Hi Chris, I played around with the test a bit, and the following tests fail 
because of the timeout on my box: testRandomWriter, testFailingMapper, 
testSleepJobWithSecurityOn.

 TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior 
 of symlinks
 ---

 Key: MAPREDUCE-4987
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, nodemanager
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4987.1.patch





[jira] [Commented] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks

2013-03-18 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605988#comment-13605988
 ] 

Ivan Mitic commented on MAPREDUCE-4987:
---

bq. I played around with the test a bit, and the following tests fail because 
of the timeout on my box: testRandomWriter, testFailingMapper, 
testSleepJobWithSecurityOn.
Increasing the test timeouts by a factor of 2 helped make the tests pass on 
my box.

 TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior 
 of symlinks
 ---

 Key: MAPREDUCE-4987
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, nodemanager
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4987.1.patch





[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.branch-1-win.patch

Attaching the branch-1 compatible patch. 

A few notes:
 - Introduced missing unittests for the JobEndNotifier that cover most of its 
functionality
 - Added a test case that targets the problem from the Jira
 - Fixed a bug in how the retry count is computed (we had an extra retry attempt 
previously)
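
One common way an extra attempt creeps in is an inclusive loop bound. The sketch below illustrates that off-by-one under the assumption that the configured value is the total number of attempts; the class RetryCountSketch and its methods are hypothetical and do not reproduce the actual JobEndNotifier code or the actual bug.

```java
// Hypothetical illustration of an off-by-one in a retry loop.
class RetryCountSketch {
    // Inclusive bound: performs configuredAttempts + 1 calls (one too many).
    static int callsWithInclusiveBound(int configuredAttempts) {
        int calls = 0;
        for (int i = 0; i <= configuredAttempts; i++) {
            calls++; // stands in for one failed notification attempt
        }
        return calls;
    }

    // Exclusive bound: performs exactly configuredAttempts calls.
    static int callsWithExclusiveBound(int configuredAttempts) {
        int calls = 0;
        for (int i = 0; i < configuredAttempts; i++) {
            calls++; // stands in for one failed notification attempt
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWithInclusiveBound(3));
        System.out.println(callsWithExclusiveBound(3));
    }
}
```

A unit test that counts calls against a server which always fails, as the new tests mentioned above appear intended to do, is the simplest way to pin the attempt count down.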

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1-win.patch





[jira] [Commented] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604764#comment-13604764
 ] 

Ivan Mitic commented on MAPREDUCE-5066:
---

bq. Job notification also exists in 2.x which may face the same set of issues.
Thanks Hitesh, it should be straightforward to rebase the patch for the 2.x 
branch. Will do so once the current patch is reviewed.

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1-win.patch





[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Status: Patch Available  (was: Open)

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 1-win, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1-win.patch





[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Status: Open  (was: Patch Available)

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 1-win, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1-win.patch





[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-17 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5066:
--

Attachment: MAPREDUCE-5066.branch-1-win.2.patch

Minor patch update, factoring common unittest code into utility methods.

 JobTracker should set a timeout when calling into job.end.notification.url
 --

 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5066.branch-1-win.2.patch, 
 MAPREDUCE-5066.branch-1-win.patch





[jira] [Commented] (MAPREDUCE-4885) streaming tests have multiple failures on Windows

2013-03-14 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602101#comment-13602101
 ] 

Ivan Mitic commented on MAPREDUCE-4885:
---

Thanks for addressing the comments, Chris. +1, the patch looks good to me.

 streaming tests have multiple failures on Windows
 -

 Key: MAPREDUCE-4885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4885.1.patch, MAPREDUCE-4885.2.patch


 There are multiple test failures due to Queue configuration missing child 
 queue names for root.



[jira] [Created] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url

2013-03-14 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-5066:
-

 Summary: JobTracker should set a timeout when calling into 
job.end.notification.url
 Key: MAPREDUCE-5066
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1-win, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic


In the current code, no timeout is specified when the JobTracker (JobEndNotifier)
calls into the notification URL. When the given URL points to a server that
does not respond for a long time, job notifications get completely stuck (given
that only a single thread processes all notifications). We've seen
this cause noticeable delays in job execution in components that rely on job
end notifications (like Oozie workflows).

I propose we introduce a configurable timeout option and default it to a
reasonably small value.

If we want, we can also introduce a configurable number of workers processing 
the notification queue (not sure if this is needed though at this point).

I will prepare a patch soon. Please comment back.
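
The optional idea of several workers processing the notification queue could look roughly like the sketch below. It is an illustration under assumptions, not part of any patch: the class name, the method, and the use of a fixed thread pool are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a notification queue drained by a configurable number of workers. */
final class NotificationWorkersSketch {

    /** Runs every notification on a fixed pool and returns how many completed normally. */
    static int drain(List<Runnable> notifications, int workers, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger completed = new AtomicInteger();
        for (Runnable notification : notifications) {
            pool.execute(() -> {
                try {
                    notification.run();
                    completed.incrementAndGet();
                } catch (RuntimeException e) {
                    // A failed notification is simply not counted in this sketch.
                }
            });
        }
        pool.shutdown(); // accept no new work, let queued notifications finish
        try {
            if (!pool.awaitTermination(waitMillis, TimeUnit.MILLISECONDS)) {
                pool.shutdownNow(); // give up on notifications that are still hanging
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

With more than one worker, a single slow endpoint delays only the notification it belongs to, which is the effect the proposal is after.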



[jira] [Commented] (MAPREDUCE-4885) streaming tests have multiple failures on Windows

2013-03-13 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600911#comment-13600911
 ] 

Ivan Mitic commented on MAPREDUCE-4885:
---

The patch looks really good, Chris.

Just one minor comment: the newly added cmd scripts are missing Apache license
headers. Otherwise, +1 from me.

 streaming tests have multiple failures on Windows
 -

 Key: MAPREDUCE-4885
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming, test
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: MAPREDUCE-4885.1.patch


 There are multiple test failures due to Queue configuration missing child 
 queue names for root.



[jira] [Created] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-03-11 Thread Ivan Mitic (JIRA)
Ivan Mitic created MAPREDUCE-5056:
-

 Summary: TestProcfsBasedProcessTree fails on Windows with 
Process-tree dump doesn't start with a proper header
 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic


Test fails on the below assertion:

Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  FAILURE!
testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)  Time elapsed: 0 sec   FAILURE!
junit.framework.AssertionFailedError: Process-tree dump doesn't start with a proper header
    at junit.framework.Assert.fail(Assert.java:47)
    at junit.framework.Assert.assertTrue(Assert.java:20)
    at org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree.testProcessTreeDump(TestProcfsBasedProcessTree.java:564)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:243)
    at junit.framework.TestSuite.run(TestSuite.java:238)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)



[jira] [Updated] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-03-11 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5056:
--

Attachment: MAPREDUCE-5056.trunk.patch

Attaching the patch.

An easy one: the test fails to match the process dump header because of the
line ending (the test expects \n, but the dump contains \r\n).
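
The mismatch described above can be reproduced with a small sketch. The header text and all names below are placeholders chosen for illustration (the real header is defined in ProcfsBasedProcessTree), and normalizing the line ending is just one possible remedy, not necessarily what the attached patch does.

```java
/** Hypothetical sketch of a header check that tolerates Windows line endings. */
final class DumpHeaderSketch {

    /** Placeholder header; the real one lives in ProcfsBasedProcessTree. */
    static final String HEADER = "|- PID PPID CMD";

    /** Strict check: fails on Windows-style dumps because it expects a bare "\n". */
    static boolean startsWithHeaderStrict(String dump) {
        return dump.startsWith(HEADER + "\n");
    }

    /** Portable check: normalizes "\r\n" to "\n" before comparing. */
    static boolean startsWithHeaderPortable(String dump) {
        String normalized = dump.replace("\r\n", "\n");
        return normalized.startsWith(HEADER + "\n");
    }
}
```

A dump that begins with the header followed by \r\n fails the strict check and passes the portable one, which matches the failure mode reported on Windows.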

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  
 FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)
   Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a 
 proper header



[jira] [Updated] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-03-11 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5056:
--

Status: Patch Available  (was: Open)

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  
 FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)
   Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a 
 proper header



[jira] [Commented] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-03-11 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599045#comment-13599045
 ] 

Ivan Mitic commented on MAPREDUCE-5056:
---

Just to add: I am not sure how valuable it is to run TestProcfsBasedProcessTree
on Windows, given that we actually use WindowsBasedProcessTree there. An
alternative is to skip the test entirely if ProcfsBasedProcessTree is not available.

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  
 FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)
   Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a 
 proper header



[jira] [Updated] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-03-11 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5056:
--

Attachment: MAPREDUCE-5056.trunk.2.patch

Attaching a slightly better patch.

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.2.patch, MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  
 FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)
   Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a 
 proper header



[jira] [Commented] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-03-11 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599090#comment-13599090
 ] 

Ivan Mitic commented on MAPREDUCE-5056:
---

bq. I am in favor of disabling the test with if(Shell.Linux).
Thanks for the quick response, Bikas. I agree and will attach the new patch shortly.
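
A platform guard of the kind agreed on above might be sketched as follows. This is an assumption-laden illustration: it reads the os.name system property directly rather than using Hadoop's Shell class, and the class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical sketch of a "run only on Linux" guard for procfs-based tests. */
final class PlatformGuardSketch {

    /** Returns true when the given os.name value denotes Linux. */
    static boolean isLinux(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("linux");
    }

    /** Decides whether the procfs-based tests make sense on the current JVM. */
    static boolean shouldRunProcfsTests() {
        return isLinux(System.getProperty("os.name"));
    }
}
```

In a JUnit 3 style test such as the one in the stack trace, the test method could return early when shouldRunProcfsTests() is false; in JUnit 4, Assume.assumeTrue serves the same purpose.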

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.2.patch, MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  
 FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)
   Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a 
 proper header



[jira] [Updated] (MAPREDUCE-5056) TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't start with a proper header

2013-03-11 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-5056:
--

Attachment: MAPREDUCE-5056.trunk.3.patch

Attaching the updated patch.

A few notes:
 - I kept the previous newline fix to stay symmetric with the
implementation in ProcfsBasedProcessTree#getProcessTreeDump
 - I changed ProcfsBasedProcessTree#isAvailable to use the existing API for
checking whether we are running on Linux (to avoid duplicating the code)

 TestProcfsBasedProcessTree fails on Windows with Process-tree dump doesn't 
 start with a proper header
 -

 Key: MAPREDUCE-5056
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5056
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-5056.trunk.2.patch, 
 MAPREDUCE-5056.trunk.3.patch, MAPREDUCE-5056.trunk.patch


 Test fails on the below assertion:
 Running org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.266 sec  
 FAILURE!
 testProcessTreeDump(org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree)
   Time elapsed: 0 sec   FAILURE!
 junit.framework.AssertionFailedError: Process-tree dump doesn't start with a 
 proper header



[jira] [Commented] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks

2013-02-07 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574097#comment-13574097
 ] 

Ivan Mitic commented on MAPREDUCE-4987:
---

Thanks for reporting this, Chris.

bq. But symlinks CAN be used for functional purposes i.e linking to libraries 
etc. ?
Hi Vinod. I believe we'll have to port the branch-1-win semantics to trunk to
properly support symlinks on both Java 6 and Java 7 on Windows. Yes, symlinks can
be created to point to folders and files; however, Java 6 does not interpret
them correctly. We've seen many issues with symlinks on Java 6, and the only
option that worked reliably (and was signed off on) is to do a file copy on
Java 6. HADOOP-9061 discusses some of these problems.

bq. If so we can just do the platform check in the test-case.
We also initially thought this would be fine (you can check the branch-1-win
history :)). However, the real problem comes when someone tries to access the
symlink through Java APIs. For example, File#length on a symlink
returns zero, which means that RLFS does not work on top of symlinks.
Additionally, File#renameTo on a symlink renames the target file instead of the
symlink (really strange, I know :)).

Hope this helps.
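
The "symlink where it works, file copy otherwise" approach described above can be sketched with the java.nio.file API that arrived in Java 7. This is not the branch-1-win implementation; the class and method names are hypothetical, and the sketch only illustrates the fallback idea.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: prefer a symlink, fall back to a plain file copy. */
final class LinkOrCopySketch {

    /** Returns true if a symlink was created, false if the target was copied instead. */
    static boolean linkOrCopy(Path target, Path link) {
        try {
            try {
                Files.createSymbolicLink(link, target);
                return true;
            } catch (UnsupportedOperationException | IOException | SecurityException e) {
                // Symlinks are unsupported or not permitted here: copy the file instead.
                Files.copy(target, link);
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Length as seen through the path; Files.size follows symlinks to the target. */
    static long visibleLength(Path path) {
        try {
            return Files.size(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Either way the caller ends up with a path whose reported length equals that of the target, which is the property the failing length assertion depends on.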

 TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior 
 of symlinks
 ---

 Key: MAPREDUCE-4987
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache, nodemanager
Affects Versions: trunk-win
Reporter: Chris Nauroth

 On Windows, {{TestMRJobs#testDistributedCache}} fails on an assertion while 
 checking the length of a symlink.  It expects to see the length of the target 
 of the symlink, but Java 6 on Windows always reports that a symlink has 
 length 0.



[jira] [Commented] (MAPREDUCE-4396) Make LocalJobRunner work with private distributed cache

2012-12-10 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13528577#comment-13528577
 ] 

Ivan Mitic commented on MAPREDUCE-4396:
---

This was fixed by HADOOP-8734 in branch-1-win. Maybe just integrate the same
patch into branch-1?

 Make LocalJobRunner work with private distributed cache
 ---

 Key: MAPREDUCE-4396
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4396
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 1.0.3
Reporter: Luke Lu
Assignee: Yu Gao
Priority: Minor
 Attachments: mapreduce-4396-branch-1.patch, test-afterpatch.result, 
 test-beforepatch.result, test-patch.result


 Some LocalJobRunner-related unit tests fail if the user directory permission
 and/or umask is too restrictive.



[jira] [Updated] (MAPREDUCE-4768) yarn cmd line scripts for windows

2012-11-02 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated MAPREDUCE-4768:
--

Status: Patch Available  (was: Open)

 yarn cmd line scripts for windows
 -

 Key: MAPREDUCE-4768
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4768
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: trunk-win
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Attachments: MAPREDUCE-4768.branch-trunk-win.scripts.patch


 Jira tracking addition of windows equivalents for yarn, yarn-config and 
 yarn-env shell scripts.


