[jira] [Commented] (MAPREDUCE-1639) Grouping using hashing instead of sorting

2012-06-25 Thread alex gemini (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400327#comment-13400327
 ] 

alex gemini commented on MAPREDUCE-1639:


I guess the main point is that we need a per-chunk comparison instead of a 
per-record comparison, whether it is based on a hash (as this JIRA suggests) or 
on a minor range (like Google Tenzing's block shuffle).

 Grouping using hashing instead of sorting
 -

 Key: MAPREDUCE-1639
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1639
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Joydeep Sen Sarma

 Most applications of map-reduce care about grouping and not sorting. Sorting 
 is a (relatively expensive) way to achieve grouping. To achieve just 
 grouping, one can:
 - replace the sort on the Mappers with a HashTable, and maintain lists of 
 key-values against each hash bucket.
 - sort the key-value tuples inside each hash bucket before spilling or 
 sending to the Reducer. Any time this is done, the Combiner can be invoked.
 - serialize the HashTable by hash-bucket id, so that merges (of either spills 
 or Map Outputs) work similarly to today (at least there is no change in the 
 overall compute complexity of the merge).
 Of course this hashtable has nothing to do with partitioning. It is just a 
 replacement for the map-side sort.
 --
 This is (pretty much) straight from the MARS project paper: 
 http://www.cse.ust.hk/catalac/papers/mars_pact08.pdf. They report a 45% 
 speedup in inverted index calculation using hashing instead of sorting 
 (the reference implementation is NOT against Hadoop, though).
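
 As a rough illustration of the scheme in the description (not code from the 
 issue, from Hadoop, or from the MARS paper), the sketch below hashes records 
 into buckets, sorts only within each bucket, and writes the buckets out in 
 bucket-id order. The class name HashGroupingSketch and the methods 
 groupByBucket and spill are hypothetical names chosen for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the names here are hypothetical, not Hadoop APIs.
class HashGroupingSketch {

    // Places each record in a bucket chosen by key hash, then sorts inside each bucket.
    static List<List<Map.Entry<String, Integer>>> groupByBucket(
            List<Map.Entry<String, Integer>> records, int numBuckets) {
        List<List<Map.Entry<String, Integer>>> buckets = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Map.Entry<String, Integer> kv : records) {
            // floorMod keeps the bucket id non-negative even for negative hash codes.
            int bucketId = Math.floorMod(kv.getKey().hashCode(), numBuckets);
            buckets.get(bucketId).add(kv);
        }
        for (List<Map.Entry<String, Integer>> bucket : buckets) {
            // Records are only ever compared with others in the same bucket.
            bucket.sort((a, b) -> a.getKey().compareTo(b.getKey()));
        }
        return buckets;
    }

    // Writes buckets out in bucket-id order, so records with equal keys are adjacent.
    static List<Map.Entry<String, Integer>> spill(
            List<List<Map.Entry<String, Integer>>> buckets) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> bucket : buckets) {
            out.addAll(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> records = new ArrayList<>();
        records.add(new SimpleEntry<>("b", 1));
        records.add(new SimpleEntry<>("a", 2));
        records.add(new SimpleEntry<>("b", 3));
        records.add(new SimpleEntry<>("c", 4));
        for (Map.Entry<String, Integer> kv : spill(groupByBucket(records, 4))) {
            System.out.println(kv.getKey() + "\t" + kv.getValue());
        }
    }
}
```

 The output is grouped but not globally sorted: keys are ordered by bucket id 
 first and by key only inside a bucket, which is all a grouping-only job needs.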

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (MAPREDUCE-4365) Shipping Profiler Libraries by DistributedCache

2012-06-25 Thread Jie Li (JIRA)
Jie Li created MAPREDUCE-4365:
-

 Summary: Shipping Profiler Libraries by DistributedCache
 Key: MAPREDUCE-4365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4365
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 1.0.3
Reporter: Jie Li


Hadoop profiling is great for performance tuning and debugging, but currently 
we can only use Java built-in profilers such as HProf, and for other profilers 
we need to install them on all slave nodes first, which is inconvenient for 
large clusters and sometimes impossible for production clusters. 

Shipping profiler libraries through the DistributedCache would solve this 
problem. For example, in mapred.task.profile.params, we could specify a 
profiler library from the DistributedCache using special placeholders such as 
foo.jar, and Hadoop could look at the DistributedCache and replace foo.jar 
with the localized path before launching the child JVM.
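
A minimal sketch of the substitution step proposed above, under stated 
assumptions: the issue does not fix a placeholder syntax, so the @name@ form, 
the class ProfilerParamsSketch, the resolve method and the local path below 
are all made up for illustration. None of this is existing Hadoop behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; the @name@ placeholder syntax is an assumption.
class ProfilerParamsSketch {

    // Replaces each @name@ placeholder with the localized path of the cache file "name".
    static String resolve(String params, Map<String, String> localizedPaths) {
        String resolved = params;
        for (Map.Entry<String, String> e : localizedPaths.entrySet()) {
            resolved = resolved.replace("@" + e.getKey() + "@", e.getValue());
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> localized = new LinkedHashMap<>();
        // Hypothetical localized path; a real one would come from the DistributedCache.
        localized.put("foo.jar", "/local/cache/1234/foo.jar");
        // The %s is left untouched for the profile output file.
        String params = "-javaagent:@foo.jar@=output=%s";
        System.out.println(resolve(params, localized));
    }
}
```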







[jira] [Updated] (MAPREDUCE-4256) Improve resource scheduling

2012-06-25 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4256:
---

Attachment: (was: hadoop-jdk.tools.txt)

 Improve resource scheduling
 ---

 Key: MAPREDUCE-4256
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Radim Kolar

 Currently the resource manager supports only the Memory resource during 
 container allocation.
 I propose the following improvements:
 1. Add support for CPU utilization. Node CPU usage information can be obtained 
 from ResourceCalculatorPlugin.
 2. Add support for custom resources. The node configuration would contain 
 something like:
 name=node.resource.GPU, value=1 (node has 1 GPU).
 If a job needs a GPU for computation, it adds a GPU=1 requirement 
 to its job config, and the Resource Manager allocates a container on a node 
 with a GPU available.
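
 The custom-resource matching proposed in item 2 could look roughly like the 
 following. This is only a sketch of the idea, assuming resources are plain 
 name/count pairs; CustomResourceSketch and canAllocate are hypothetical names 
 and do not correspond to any Resource Manager API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the YARN scheduler's data model.
class CustomResourceSketch {

    // True if the node has at least the requested amount of every requested resource.
    static boolean canAllocate(Map<String, Integer> nodeAvailable,
                               Map<String, Integer> request) {
        for (Map.Entry<String, Integer> need : request.entrySet()) {
            // A resource the node does not declare counts as zero available.
            int available = nodeAvailable.getOrDefault(need.getKey(), 0);
            if (available < need.getValue()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> gpuNode = new HashMap<>();
        gpuNode.put("GPU", 1);      // node.resource.GPU = 1
        Map<String, Integer> plainNode = new HashMap<>();

        Map<String, Integer> request = new HashMap<>();
        request.put("GPU", 1);      // job asks for GPU=1

        System.out.println("GPU node:   " + canAllocate(gpuNode, request));
        System.out.println("plain node: " + canAllocate(plainNode, request));
    }
}
```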





[jira] [Updated] (MAPREDUCE-4256) Improve resource scheduling

2012-06-25 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar updated MAPREDUCE-4256:
---

Attachment: hadoop-jdk.tools.txt

 Improve resource scheduling
 ---

 Key: MAPREDUCE-4256
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4256
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Radim Kolar

 Currently the resource manager supports only the Memory resource during 
 container allocation.
 I propose the following improvements:
 1. Add support for CPU utilization. Node CPU usage information can be obtained 
 from ResourceCalculatorPlugin.
 2. Add support for custom resources. The node configuration would contain 
 something like:
 name=node.resource.GPU, value=1 (node has 1 GPU).
 If a job needs a GPU for computation, it adds a GPU=1 requirement 
 to its job config, and the Resource Manager allocates a container on a node 
 with a GPU available.





[jira] [Commented] (MAPREDUCE-4365) Shipping Profiler Libraries by DistributedCache

2012-06-25 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400488#comment-13400488
 ] 

Robert Joseph Evans commented on MAPREDUCE-4365:


Why can't you just specify your distributed cache entry with something like 
hdfs://path/to/profiler.lib#profiler-link.lib?  I know it is a little ugly, but 
it will add a symbolic link named profiler-link.lib in the current working 
directory of your task, pointing to wherever the file is in the distributed 
cache.

You can do this with a tgz or zip too.

hdfs://path/to/archive.tgz#profiler

Now if you want to access lib/profiler.so from inside of archive.tgz you would 
use a path of profiler/lib/profiler.so.
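
To make the naming explicit, here is a small sketch (plain java.net.URI, no 
Hadoop classes) of how the part after the # maps to the link name and to paths 
inside an archive. CacheLinkSketch, linkName and pathInTask are hypothetical 
helper names for this example; the actual symlink is created by the framework, 
not by this code.

```java
import java.net.URI;

// Illustrative sketch only; it just parses the URI, it does not touch any cache.
class CacheLinkSketch {

    // Returns the link name given after '#', or null if the URI has no fragment.
    static String linkName(String cacheUri) {
        return URI.create(cacheUri).getFragment();
    }

    // Path, relative to the task's working directory, of an entry inside a linked archive.
    static String pathInTask(String cacheUri, String entryInArchive) {
        return linkName(cacheUri) + "/" + entryInArchive;
    }

    public static void main(String[] args) {
        System.out.println(linkName("hdfs://path/to/profiler.lib#profiler-link.lib"));
        System.out.println(pathInTask("hdfs://path/to/archive.tgz#profiler", "lib/profiler.so"));
    }
}
```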

 Shipping Profiler Libraries by DistributedCache
 ---

 Key: MAPREDUCE-4365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4365
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 1.0.3
Reporter: Jie Li

 Hadoop profiling is great for performance tuning and debugging, but currently 
 we can only use Java built-in profilers such as HProf, and for other 
 profilers we need to install them on all slave nodes first, which is 
 inconvenient for large clusters and sometimes impossible for production 
 clusters. 
 Shipping profiler libraries through the DistributedCache would solve this 
 problem. For example, in mapred.task.profile.params, we could specify a 
 profiler library from the DistributedCache using special placeholders such as 
 foo.jar, and Hadoop could look at the DistributedCache and replace foo.jar 
 with the localized path before launching the child JVM.





[jira] [Commented] (MAPREDUCE-4361) Fix detailed metrics for protobuf-based RPC on 0.23

2012-06-25 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400507#comment-13400507
 ] 

Thomas Graves commented on MAPREDUCE-4361:
--

+1 lgtm.  Thanks Jason!

 Fix detailed metrics for protobuf-based RPC on 0.23
 ---

 Key: MAPREDUCE-4361
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4361
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: MAPREDUCE-4361.patch


 RPC detailed metrics for any protobuf-based RPC ports are always zero.  
 ProtoOverHadoopRpcEngine needs the same detailed metric logic as in 
 WritableRpcEngine.  This is effectively the same change as in HADOOP-8085 
 except tailored for branch-0.23 which didn't take the full protobuf branch 
 changes that went into branch-2 and trunk.





[jira] [Updated] (MAPREDUCE-4361) Fix detailed metrics for protobuf-based RPC on 0.23

2012-06-25 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-4361:
-

   Resolution: Fixed
Fix Version/s: 0.23.3
   Status: Resolved  (was: Patch Available)

I committed this to branch-0.23. The change is not applicable to trunk or 
branch-2, since it already works there.

 Fix detailed metrics for protobuf-based RPC on 0.23
 ---

 Key: MAPREDUCE-4361
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4361
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Fix For: 0.23.3

 Attachments: MAPREDUCE-4361.patch


 RPC detailed metrics for any protobuf-based RPC ports are always zero.  
 ProtoOverHadoopRpcEngine needs the same detailed metric logic as in 
 WritableRpcEngine.  This is effectively the same change as in HADOOP-8085 
 except tailored for branch-0.23 which didn't take the full protobuf branch 
 changes that went into branch-2 and trunk.





[jira] [Created] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2012-06-25 Thread Thomas Graves (JIRA)
Thomas Graves created MAPREDUCE-4366:


 Summary: mapred metrics shows negative count of waiting maps and 
reduces
 Key: MAPREDUCE-4366
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.0.2
Reporter: Thomas Graves


Negative waiting_maps and waiting_reduces counts are observed in the mapred 
metrics.  MAPREDUCE-1238 partially fixed this, but it appears there are still 
issues: we are still seeing it, though it is not as bad as before.






[jira] [Assigned] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2012-06-25 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves reassigned MAPREDUCE-4366:


Assignee: Thomas Graves

 mapred metrics shows negative count of waiting maps and reduces
 ---

 Key: MAPREDUCE-4366
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.0.2
Reporter: Thomas Graves
Assignee: Thomas Graves

 Negative waiting_maps and waiting_reduces counts are observed in the mapred 
 metrics.  MAPREDUCE-1238 partially fixed this, but it appears there are still 
 issues: we are still seeing it, though it is not as bad as before.





[jira] [Commented] (MAPREDUCE-4366) mapred metrics shows negative count of waiting maps and reduces

2012-06-25 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400542#comment-13400542
 ] 

Thomas Graves commented on MAPREDUCE-4366:
--

I was able to reproduce this using speculative execution and killing the job at 
the right point when it has speculative tasks running.  

 mapred metrics shows negative count of waiting maps and reduces
 ---

 Key: MAPREDUCE-4366
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 1.0.2
Reporter: Thomas Graves
Assignee: Thomas Graves

 Negative waiting_maps and waiting_reduces counts are observed in the mapred 
 metrics.  MAPREDUCE-1238 partially fixed this, but it appears there are still 
 issues: we are still seeing it, though it is not as bad as before.





[jira] [Updated] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler

2012-06-25 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4336:
--

   Resolution: Fixed
Fix Version/s: 2.0.1-alpha
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Ahmed. Committed to trunk and branch-2

 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Ahmed Radwan
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4336.patch, MAPREDUCE-4336_rev2.patch


 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.





[jira] [Commented] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400574#comment-13400574
 ] 

Hudson commented on MAPREDUCE-4336:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2452 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2452/])
MAPREDUCE-4336. Distributed Shell fails when used with the 
CapacityScheduler (ahmed via tucu) (Revision 1353625)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353625
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java


 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Ahmed Radwan
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4336.patch, MAPREDUCE-4336_rev2.patch


 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.





[jira] [Commented] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400575#comment-13400575
 ] 

Hudson commented on MAPREDUCE-4336:
---

Integrated in Hadoop-Common-trunk-Commit #2382 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2382/])
MAPREDUCE-4336. Distributed Shell fails when used with the 
CapacityScheduler (ahmed via tucu) (Revision 1353625)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353625
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java


 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Ahmed Radwan
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4336.patch, MAPREDUCE-4336_rev2.patch


 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.





[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400587#comment-13400587
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4346:
---

The JobStatus.setRetired(..) method should be package private, as it does not 
make sense for a client to set that value.

Please add a comment in the new method (where you are catching the exception) 
to state that this is done to ensure client API compatibility within this 
release.

What is the intended logic between status & retired? I would have assumed it is 
an AND, but it seems that retired TRUE trumps status (both in the JT and in the 
client filtering logic).
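
To spell out the two readings being contrasted, here is a sketch under my own 
assumptions about the filter's inputs (a set of wanted states plus an 
include-retired flag). It is not taken from the patch; JobFilterSketch, 
matchesAnd and matchesRetiredTrumps are hypothetical names, and the patch's 
actual semantics may differ from both.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; not the code in MAPREDUCE-4346.patch.
class JobFilterSketch {

    // AND reading: the state must match, and retired jobs pass only if requested.
    static boolean matchesAnd(String state, boolean retired,
                              Set<String> wantedStates, boolean includeRetired) {
        return wantedStates.contains(state) && (includeRetired || !retired);
    }

    // "Retired trumps status" reading: a retired job is judged by the flag alone.
    static boolean matchesRetiredTrumps(String state, boolean retired,
                                        Set<String> wantedStates, boolean includeRetired) {
        if (retired) {
            return includeRetired;
        }
        return wantedStates.contains(state);
    }

    public static void main(String[] args) {
        Set<String> wanted = new HashSet<>(Arrays.asList("RUNNING"));
        // A retired FAILED job, with retired jobs requested:
        System.out.println("AND:    " + matchesAnd("FAILED", true, wanted, true));
        System.out.println("trumps: " + matchesRetiredTrumps("FAILED", true, wanted, true));
    }
}
```

The two readings only disagree for retired jobs whose state is not in the 
wanted set, which is exactly the case the question above is about.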



 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch


 The current implementation for JobTracker.getAllJobs() returns all submitted 
 jobs in any state, in addition to retired jobs. This list can be long and 
 represents an unneeded overhead especially in the case of clients only 
 interested in jobs in specific state(s). 
 It would be beneficial to add a refined version that returns only jobs with 
 specific statuses and makes the inclusion of retired jobs optional. 
 I'll be uploading an initial patch momentarily.





[jira] [Created] (MAPREDUCE-4367) mapred job -kill tries to connect to history server

2012-06-25 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-4367:
-

 Summary: mapred job -kill tries to connect to history server
 Key: MAPREDUCE-4367
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 0.23.3
Reporter: Jason Lowe
Priority: Minor


The {{mapred job -kill}} command attempts to connect to the history server, 
even though it is unrelated to the process of killing a job.





[jira] [Commented] (MAPREDUCE-4367) mapred job -kill tries to connect to history server

2012-06-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400604#comment-13400604
 ] 

Jason Lowe commented on MAPREDUCE-4367:
---

If the history server isn't running or there are issues connecting to the 
history server, the kill command produces many retry messages.  For example:

{noformat}
$ mapred job -kill job_1340642510012_0003
2012-06-25 16:42:26,626 INFO  mapred.ClientServiceDelegate 
(ClientServiceDelegate.java:getProxy(254)) - Application state is completed. 
FinalApplicationStatus=FAILED. Redirecting to job history server
2012-06-25 16:42:27,629 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 0 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:28,630 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 1 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:29,631 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 2 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:30,632 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 3 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:31,633 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 4 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:32,633 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 5 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:33,634 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 6 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:34,635 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 7 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:35,636 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 8 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:36,637 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 9 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:36,642 INFO  mapred.ClientServiceDelegate 
(ClientServiceDelegate.java:getProxy(254)) - Application state is completed. 
FinalApplicationStatus=FAILED. Redirecting to job history server
2012-06-25 16:42:37,643 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 0 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:38,644 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 1 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:39,644 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 2 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:40,645 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 3 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:41,646 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 4 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:42,647 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 5 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:43,648 INFO  ipc.Client 
(Client.java:handleConnectionFailure(714)) - Retrying connect to server: 
xx:10020. Already tried 6 time(s); retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2012-06-25 16:42:44,649 INFO  ipc.Client 

[jira] [Commented] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-25 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400718#comment-13400718
 ] 

Siddharth Seth commented on MAPREDUCE-4290:
---

+1. Looks good. The javadoc warnings are not related.

 JobStatus.getState() API is giving ambiguous values
 ---

 Key: MAPREDUCE-4290
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
 Attachments: MAPREDUCE-4290-1.patch, MAPREDUCE-4290-1.patch, 
 MAPREDUCE-4290.patch


 For a failed job, the getState() API reports the status as SUCCEEDED if we use 
 JobClient.getAllJobs() to retrieve all job info from the RM.
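
 A sketch of the kind of conversion involved, assuming the job state has to be 
 derived from both the application's state and its final status. The enums 
 below are local stand-ins defined for this example, not Hadoop's types, and 
 the mapping is only my reading of the symptom described above, not the 
 contents of the patch.

```java
// Illustrative sketch only; these enums are stand-ins, not YARN or MR classes.
class StateConversionSketch {

    enum AppState { NEW, SUBMITTED, RUNNING, FINISHED, FAILED, KILLED }
    enum FinalStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }
    enum JobState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    // A FINISHED application is only a SUCCEEDED job if its final status says so.
    static JobState toJobState(AppState app, FinalStatus finalStatus) {
        switch (app) {
            case NEW:
            case SUBMITTED:
                return JobState.PREP;
            case RUNNING:
                return JobState.RUNNING;
            case FINISHED:
                if (finalStatus == FinalStatus.SUCCEEDED) {
                    return JobState.SUCCEEDED;
                }
                if (finalStatus == FinalStatus.KILLED) {
                    return JobState.KILLED;
                }
                // FAILED or UNDEFINED: do not report success.
                return JobState.FAILED;
            case FAILED:
                return JobState.FAILED;
            case KILLED:
                return JobState.KILLED;
            default:
                throw new IllegalArgumentException("Unknown state: " + app);
        }
    }

    public static void main(String[] args) {
        System.out.println(toJobState(AppState.FINISHED, FinalStatus.FAILED));
        System.out.println(toJobState(AppState.FINISHED, FinalStatus.SUCCEEDED));
    }
}
```

 Mapping FINISHED straight to SUCCEEDED, without looking at the final status, 
 would produce the ambiguity reported here.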





[jira] [Updated] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-25 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4290:
--

   Resolution: Fixed
Fix Version/s: 2.0.1-alpha
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks Devaraj!

 JobStatus.getState() API is giving ambiguous values
 ---

 Key: MAPREDUCE-4290
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4290-1.patch, MAPREDUCE-4290-1.patch, 
 MAPREDUCE-4290.patch


 For a failed job, the getState() API reports the status as SUCCEEDED if we use 
 JobClient.getAllJobs() to retrieve all job info from the RM.





[jira] [Commented] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400744#comment-13400744
 ] 

Hudson commented on MAPREDUCE-4290:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2454 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2454/])
MAPREDUCE-4290. Fix Yarn Applicaiton Status to MR JobState conversion. 
(Contributed by Devaraj K) (Revision 1353684)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353684
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/TypeConverter.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/TestTypeConverter.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestResourceMgrDelegate.java


 JobStatus.getState() API is giving ambiguous values
 ---

 Key: MAPREDUCE-4290
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4290-1.patch, MAPREDUCE-4290-1.patch, 
 MAPREDUCE-4290.patch


 For a failed job, the getState() API reports the status as SUCCEEDED if we use 
 JobClient.getAllJobs() to retrieve all job info from the RM.





[jira] [Commented] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400754#comment-13400754
 ] 

Hudson commented on MAPREDUCE-4290:
---

Integrated in Hadoop-Common-trunk-Commit #2384 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2384/])
MAPREDUCE-4290. Fix Yarn Applicaiton Status to MR JobState conversion. 
(Contributed by Devaraj K) (Revision 1353684)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353684
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/TypeConverter.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/TestTypeConverter.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestResourceMgrDelegate.java


 JobStatus.getState() API is giving ambiguous values
 ---

 Key: MAPREDUCE-4290
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4290-1.patch, MAPREDUCE-4290-1.patch, 
 MAPREDUCE-4290.patch


 For a failed job, the getState() API returns SUCCEEDED when 
 JobClient.getAllJobs() is used to retrieve information about all jobs from the RM.





[jira] [Commented] (MAPREDUCE-4365) Shipping Profiler Libraries by DistributedCache

2012-06-25 Thread Jie Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400766#comment-13400766
 ] 

Jie Li commented on MAPREDUCE-4365:
---

Hi Robert,

I don't quite understand your approach, because we need to provide the
path of the profiler libraries to the TaskTracker instead of the
tasks. So if the libraries appear in the task's working directory, how
can the TaskTracker find them when launching the task? And currently the TT
doesn't look into the profile parameters to see if there is any
distributed cache entry.

 Shipping Profiler Libraries by DistributedCache
 ---

 Key: MAPREDUCE-4365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4365
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 1.0.3
Reporter: Jie Li

 Hadoop profiling is great for performance tuning and debugging, but currently 
 we can only use Java built-in profilers such as HProf, and for other 
 profilers we need to install them on all slave nodes first, which is 
 inconvenient for large clusters and sometimes impossible for production 
 clusters. 
 Supporting shipping profiler libraries using DistributedCache will solve this 
 problem. For example, in mapred.task.profile.params, we specify a profiler 
 library from the DistributedCache using special place holders such as 
 foo.jar, and Hadoop can look at the DistributedCache to replace foo.jar 
 with the localized path before launching the child jvm.
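
The substitution step described above can be sketched as a plain string rewrite. This is a minimal illustration under stated assumptions: the helper name `resolve`, the map of cache-entry names to localized paths, and the example path `/local/cache/foo.jar` are all hypothetical, and the sketch matches the bare entry name because no delimiter syntax is specified here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ProfilerParamsSketch {
    // Hypothetical helper: replaces each cache-entry name that appears in the
    // profiler parameter string with the path it was localized to.
    static String resolve(String profileParams, Map<String, String> localizedPaths) {
        String resolved = profileParams;
        for (Map.Entry<String, String> entry : localizedPaths.entrySet()) {
            resolved = resolved.replace(entry.getKey(), entry.getValue());
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Assumed mapping; in practice it would come from the cache localization step.
        Map<String, String> localized = new LinkedHashMap<>();
        localized.put("foo.jar", "/local/cache/foo.jar");

        // prints -javaagent:/local/cache/foo.jar=output=%s
        System.out.println(resolve("-javaagent:foo.jar=output=%s", localized));
    }
}
```

A real implementation would presumably want a delimited placeholder so that an entry name cannot collide with other text in the parameter string.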





[jira] [Commented] (MAPREDUCE-4365) Shipping Profiler Libraries by DistributedCache

2012-06-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400780#comment-13400780
 ] 

Arun C Murthy commented on MAPREDUCE-4365:
--

Jie - You can just add the profiler params to 
mapred.(map,reduce).child.java.opts and the TT will add it to the tasks' jvm 
launch cmd.

 Shipping Profiler Libraries by DistributedCache
 ---

 Key: MAPREDUCE-4365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4365
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 1.0.3
Reporter: Jie Li

 Hadoop profiling is great for performance tuning and debugging, but currently 
 we can only use Java built-in profilers such as HProf, and for other 
 profilers we need to install them on all slave nodes first, which is 
 inconvenient for large clusters and sometimes impossible for production 
 clusters. 
 Supporting shipping profiler libraries using DistributedCache will solve this 
 problem. For example, in mapred.task.profile.params, we specify a profiler 
 library from the DistributedCache using special place holders such as 
 foo.jar, and Hadoop can look at the DistributedCache to replace foo.jar 
 with the localized path before launching the child jvm.





[jira] [Commented] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400799#comment-13400799
 ] 

Hudson commented on MAPREDUCE-4290:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2402 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2402/])
MAPREDUCE-4290. Fix Yarn Application Status to MR JobState conversion. 
(Contributed by Devaraj K) (Revision 1353684)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353684
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/TypeConverter.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/TestTypeConverter.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestResourceMgrDelegate.java


 JobStatus.getState() API is giving ambiguous values
 ---

 Key: MAPREDUCE-4290
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4290-1.patch, MAPREDUCE-4290-1.patch, 
 MAPREDUCE-4290.patch


 For a failed job, the getState() API returns SUCCEEDED when 
 JobClient.getAllJobs() is used to retrieve information about all jobs from the RM.





[jira] [Commented] (MAPREDUCE-4365) Shipping Profiler Libraries by DistributedCache

2012-06-25 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400806#comment-13400806
 ] 

Robert Joseph Evans commented on MAPREDUCE-4365:


Jie,

I am confused too.  Do you want to profile the task or the task tracker?  If 
you want to profile the task you can do a combination of what I said and what 
Arun is saying.

{noformat}
hadoop jar ... -Dmapred.map.child.java.opts="... -agentlib:yjpagent" \
  -Dmapred.reduce.child.java.opts="... -agentlib:yjpagent" \
  -Dmapred.child.env="... LD_LIBRARY_PATH=yourkit/bin/linux-x86-32" \
  -archives '/path/to/yourkit.tgz#yourkit' ...
{noformat}

This is just thrown together from memory and from 
http://www.yourkit.com/docs/80/help/agent.jsp so some of the parameter options 
may be wrong, but it should point you down the correct path.

 Shipping Profiler Libraries by DistributedCache
 ---

 Key: MAPREDUCE-4365
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4365
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 1.0.3
Reporter: Jie Li

 Hadoop profiling is great for performance tuning and debugging, but currently 
 we can only use Java built-in profilers such as HProf, and for other 
 profilers we need to install them on all slave nodes first, which is 
 inconvenient for large clusters and sometimes impossible for production 
 clusters. 
 Supporting shipping profiler libraries using DistributedCache will solve this 
 problem. For example, in mapred.task.profile.params, we specify a profiler 
 library from the DistributedCache using special place holders such as 
 foo.jar, and Hadoop can look at the DistributedCache to replace foo.jar 
 with the localized path before launching the child jvm.





[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-06-25 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400808#comment-13400808
 ] 

Mayank Bansal commented on MAPREDUCE-3837:
--

Hi Tom,

I just took the latest 1.1 code base and ran the two test cases you 
mentioned above without my patch, and they are still failing.

Thanks,
Mayank

 Hadoop 22 Job tracker is not able to recover job in case of crash and after 
 that no user can submit job.
 

 Key: MAPREDUCE-3837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Fix For: 0.24.0, 0.22.1, 0.23.2

 Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
 PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837.patch, 
 PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch


 If the job tracker crashes while jobs are running and its property 
 mapreduce.jobtracker.restart.recover is true, it should recover those jobs 
 on restart.
 However, the current behavior is as follows:
 the jobtracker tries to restore the jobs but cannot. After that, the jobtracker 
 closes its handle to HDFS and nobody else can submit a job. 
 Thanks,
 Mayank





[jira] [Updated] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-06-25 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-2454:


Status: Open  (was: Patch Available)

 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, 
 MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, 
 MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, 
 mr-2454-on-mr-279-build82.patch.gz


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.





[jira] [Updated] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-06-25 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-2454:


Attachment: (was: mapreduce-2454.patch)

 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, 
 MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, 
 MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, 
 mr-2454-on-mr-279-build82.patch.gz


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.





[jira] [Updated] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-06-25 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-2454:


Attachment: mapreduce-2454.patch

 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, 
 MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, 
 MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, 
 mr-2454-on-mr-279-build82.patch.gz


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.





[jira] [Updated] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-06-25 Thread Mariappan Asokan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariappan Asokan updated MAPREDUCE-2454:


Status: Patch Available  (was: Open)

 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, 
 MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, 
 MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, 
 mr-2454-on-mr-279-build82.patch.gz


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.





[jira] [Commented] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-06-25 Thread Mariappan Asokan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400823#comment-13400823
 ] 

Mariappan Asokan commented on MAPREDUCE-2454:
-

Trying one more time...


 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, 
 MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, 
 MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, 
 mr-2454-on-mr-279-build82.patch.gz


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.





[jira] [Updated] (MAPREDUCE-4300) OOM in AM can turn it into a zombie.

2012-06-25 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated MAPREDUCE-4300:
---

Attachment: MR-4300.txt

Someone else added the same method I wanted, in ShutdownHookManager :).  
I removed my copy, so now it should compile.
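
For context, the failure mode in this issue (worker threads dying with OutOfMemoryError while the process keeps running) is commonly addressed with an uncaught-exception handler that ends the process when a thread dies with an Error. The sketch below is a generic illustration, not the contents of MR-4300.txt: the class name `HaltOnErrorHandler` and the injectable exit action are assumptions made so the example can run without actually terminating the JVM.

```java
import java.util.function.IntConsumer;

class HaltOnErrorHandler implements Thread.UncaughtExceptionHandler {
    // Hypothetical injection point; a real deployment might pass
    // Runtime.getRuntime()::halt here.
    private final IntConsumer exitAction;

    HaltOnErrorHandler(IntConsumer exitAction) {
        this.exitAction = exitAction;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        if (e instanceof Error) {
            // After an Error such as OutOfMemoryError the process state is
            // suspect, so request an exit without allocating anything first.
            exitAction.accept(-1);
        } else {
            // Ordinary exceptions are only reported; the process keeps running.
            System.err.println("Thread " + t.getName() + " threw " + e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Record the requested exit code instead of halting, for demonstration.
        int[] observed = {0};
        Thread worker = new Thread(() -> {
            throw new OutOfMemoryError("simulated");
        }, "worker");
        worker.setUncaughtExceptionHandler(new HaltOnErrorHandler(code -> observed[0] = code));
        worker.start();
        worker.join();
        // prints exit code requested: -1
        System.out.println("exit code requested: " + observed[0]);
    }
}
```

Installing such a handler with `Thread.setDefaultUncaughtExceptionHandler` would cover threads that do not set their own handler.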

 OOM in AM can turn it into a zombie.
 

 Key: MAPREDUCE-4300
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4300
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 0.23.3
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4300.txt, MR-4300.txt, MR-4300.txt, StackDump.txt


 It looks like 4 threads in the AM died with OOM but not the one pinging the 
 RM.
 stderr for this AM
 {noformat}
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
 org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
 May 30, 2012 4:49:55 AM 
 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider
  get
 WARNING: You are attempting to use a deprecated API (specifically, attempting 
 to @Inject ServletContext inside an eagerly created singleton. While we allow 
 this for backwards compatibility, be warned that this MAY have unexpected 
 behavior if you have more than one injector (with ServletModule) running in 
 the same JVM. Please consult the Guice documentation at 
 http://code.google.com/p/google-guice/wiki/Servlets for more information.
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering 
 org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider 
 class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a 
 provider class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as 
 a root resource class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
 INFO: Initiating Jersey application, version 'Jersey: 1.8 06/24/2011 12:17 PM'
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver 
 to GuiceManagedComponentProvider with the scope Singleton
 May 30, 2012 4:49:56 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to 
 GuiceManagedComponentProvider with the scope Singleton
 May 30, 2012 4:49:56 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to 
 GuiceManagedComponentProvider with the scope PerRequest
 Exception in thread "ResponseProcessor for block BP-1114822160-IP-1322528669066:blk_-6528896407411719649_34227308" java.lang.OutOfMemoryError: Java heap space
   at com.google.protobuf.CodedInputStream.<init>(CodedInputStream.java:538)
   at com.google.protobuf.CodedInputStream.newInstance(CodedInputStream.java:55)
   at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:201)
   at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:738)
   at org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos$PipelineAckProto.parseFrom(DataTransferProtos.java:7287)
   at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:656)
 Exception in thread "DefaultSpeculator background processing" java.lang.OutOfMemoryError: Java heap space
   at java.util.HashMap.resize(HashMap.java:462)
   at java.util.HashMap.addEntry(HashMap.java:755)
   at java.util.HashMap.put(HashMap.java:385)
   at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.getTasks(JobImpl.java:632)
   at org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator.maybeScheduleASpeculation(DefaultSpeculator.java:465)
   at org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator.maybeScheduleAMapSpeculation(DefaultSpeculator.java:433)
   at org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator.computeSpeculations(DefaultSpeculator.java:509)
   at org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator.access$100(DefaultSpeculator.java:56)
   at 
 

[jira] [Assigned] (MAPREDUCE-2289) Permissions race can make getStagingDir fail on local filesystem

2012-06-25 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan reassigned MAPREDUCE-2289:
---

Assignee: Ahmed Radwan  (was: Todd Lipcon)

 Permissions race can make getStagingDir fail on local filesystem
 

 Key: MAPREDUCE-2289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2289
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Fix For: 0.22.1

 Attachments: MAPREDUCE-2289_branch-1.0.patch, 
 MAPREDUCE-2289_trunk.patch, mapreduce-2289.txt


 I've observed the following race condition in TestFairSchedulerSystem which 
 uses a MiniMRCluster on top of RawLocalFileSystem:
 - two threads call getStagingDir at the same time
 - Thread A checks fs.exists(stagingArea) and sees false
 -- Calls mkdirs(stagingArea, JOB_DIR_PERMISSIONS)
 --- mkdirs calls the Java mkdir API which makes the file with umask-based 
 permissions
 - Thread B runs, checks fs.exists(stagingArea) and sees true
 -- checks permissions, sees the default permissions, and throws IOE
 - Thread A resumes and sets correct permissions
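
The window in the sequence above exists because the directory is created first and given its intended permissions afterwards. A minimal sketch of one way to close that window is to pass the permissions to the creation call itself, so no other thread can observe the directory with default permissions. This uses `java.nio.file` rather than Hadoop's FileSystem API, assumes a POSIX file system, and assumes owner-only (`rwx------`) permissions for the staging directory; it illustrates the idea and is not the fix in the attached patches.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

class StagingDirSketch {
    // Assumed permissions for the staging directory (owner-only access).
    static final Set<PosixFilePermission> JOB_DIR_PERMS =
        PosixFilePermissions.fromString("rwx------");

    // Creates the directory with its permissions in a single call, or checks
    // the permissions if another thread or process created it first.
    static Path ensureStagingDir(Path dir) throws IOException {
        FileAttribute<Set<PosixFilePermission>> attr =
            PosixFilePermissions.asFileAttribute(JOB_DIR_PERMS);
        try {
            Files.createDirectory(dir, attr);
        } catch (FileAlreadyExistsException e) {
            // Lost the creation race: the winner created it with the same call,
            // so the permissions should already be correct. Verify anyway.
            Set<PosixFilePermission> actual = Files.getPosixFilePermissions(dir);
            if (!actual.equals(JOB_DIR_PERMS)) {
                throw new IOException("Unexpected permissions on " + dir + ": "
                    + PosixFilePermissions.toString(actual));
            }
        }
        return dir;
    }

    public static void main(String[] args) throws IOException {
        Path parent = Files.createTempDirectory("staging-demo");
        Path staging = ensureStagingDir(parent.resolve("staging"));
        ensureStagingDir(staging); // second call takes the "already exists" path
        System.out.println(
            PosixFilePermissions.toString(Files.getPosixFilePermissions(staging)));
    }
}
```

The process umask can still clear bits at creation time, but it cannot add any, so owner-only permissions are not widened by this approach.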





[jira] [Commented] (MAPREDUCE-4300) OOM in AM can turn it into a zombie.

2012-06-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400857#comment-13400857
 ] 

Hadoop QA commented on MAPREDUCE-4300:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533365/MR-4300.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2506//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2506//console

This message is automatically generated.

 OOM in AM can turn it into a zombie.
 

 Key: MAPREDUCE-4300
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4300
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 0.23.3
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4300.txt, MR-4300.txt, MR-4300.txt, StackDump.txt


 It looks like 4 threads in the AM died with OOM but not the one pinging the 
 RM.
 stderr for this AM
 {noformat}
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
 org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
 May 30, 2012 4:49:55 AM 
 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider
  get
 WARNING: You are attempting to use a deprecated API (specifically, attempting 
 to @Inject ServletContext inside an eagerly created singleton. While we allow 
 this for backwards compatibility, be warned that this MAY have unexpected 
 behavior if you have more than one injector (with ServletModule) running in 
 the same JVM. Please consult the Guice documentation at 
 http://code.google.com/p/google-guice/wiki/Servlets for more information.
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering 
 org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider 
 class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a 
 provider class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as 
 a root resource class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
 INFO: Initiating Jersey application, version 'Jersey: 1.8 06/24/2011 12:17 PM'
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver 
 to GuiceManagedComponentProvider with the scope Singleton
 May 30, 2012 4:49:56 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to 
 GuiceManagedComponentProvider with the scope Singleton
 May 30, 2012 4:49:56 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to 
 GuiceManagedComponentProvider with the scope PerRequest
 Exception in thread ResponseProcessor for block 
 BP-1114822160-IP-1322528669066:blk_-6528896407411719649_34227308 
 

[jira] [Commented] (MAPREDUCE-4300) OOM in AM can turn it into a zombie.

2012-06-25 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400871#comment-13400871
 ] 

Robert Joseph Evans commented on MAPREDUCE-4300:


Like I said, the -1 for not adding any tests is expected.  I don't want to 
add tests that only verify that Java does what it says it is going to do, 
or tests that call System.exit.

 OOM in AM can turn it into a zombie.
 

 Key: MAPREDUCE-4300
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4300
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Affects Versions: 0.23.3
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
 Attachments: MR-4300.txt, MR-4300.txt, MR-4300.txt, StackDump.txt


 It looks like 4 threads in the AM died with OOM but not the one pinging the 
 RM.
 stderr for this AM
 {noformat}
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
 org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
 May 30, 2012 4:49:55 AM 
 com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider
  get
 WARNING: You are attempting to use a deprecated API (specifically, attempting 
 to @Inject ServletContext inside an eagerly created singleton. While we allow 
 this for backwards compatibility, be warned that this MAY have unexpected 
 behavior if you have more than one injector (with ServletModule) running in 
 the same JVM. Please consult the Guice documentation at 
 http://code.google.com/p/google-guice/wiki/Servlets for more information.
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering 
 org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider 
 class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a 
 provider class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
 INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as 
 a root resource class
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
 INFO: Initiating Jersey application, version 'Jersey: 1.8 06/24/2011 12:17 PM'
 May 30, 2012 4:49:55 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver 
 to GuiceManagedComponentProvider with the scope Singleton
 May 30, 2012 4:49:56 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to 
 GuiceManagedComponentProvider with the scope Singleton
 May 30, 2012 4:49:56 AM 
 com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
 getComponentProvider
 INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to 
 GuiceManagedComponentProvider with the scope PerRequest
 Exception in thread ResponseProcessor for block 
 BP-1114822160-IP-1322528669066:blk_-6528896407411719649_34227308 
 java.lang.OutOfMemoryError: Java heap space
   at com.google.protobuf.CodedInputStream.<init>(CodedInputStream.java:538)
   at 
 com.google.protobuf.CodedInputStream.newInstance(CodedInputStream.java:55)
   at 
 com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:201)
   at 
 com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:738)
   at 
 org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos$PipelineAckProto.parseFrom(DataTransferProtos.java:7287)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:656)
 Exception in thread DefaultSpeculator background processing 
 java.lang.OutOfMemoryError: Java heap space
   at java.util.HashMap.resize(HashMap.java:462)
   at java.util.HashMap.addEntry(HashMap.java:755)
   at java.util.HashMap.put(HashMap.java:385)
   at 
 org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.getTasks(JobImpl.java:632)
   at 
 org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator.maybeScheduleASpeculation(DefaultSpeculator.java:465)
   at 
 org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator.maybeScheduleAMapSpeculation(DefaultSpeculator.java:433)
   at 
 org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator.computeSpeculations(DefaultSpeculator.java:509)
   at 
 

[jira] [Updated] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4317:


Attachment: MR-4317.patch

Cleaned the patch up to remove unused imports. Ready.

 Job view ACL checks are too permissive
 --

 Key: MAPREDUCE-4317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
 Attachments: MR-4317.patch


 The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has 
 the following internal member:
 {code}private boolean isViewAllowed = true;{code}
 Note that it defaults to true.
 Now, the method that sets the proper view-allowed rights has:
 {code}
 if (user != null && job != null && jt.areACLsEnabled()) {
   final UserGroupInformation ugi =
     UserGroupInformation.createRemoteUser(user);
   try {
     ugi.doAs(new PrivilegedExceptionAction<Void>() {
       public Void run() throws IOException, ServletException {
         // checks job view permission
         jt.getACLsManager().checkAccess(job, ugi,
             Operation.VIEW_JOB_DETAILS);
         return null;
       }
     });
   } catch (AccessControlException e) {
     String errMsg = "User " + ugi.getShortUserName() +
         " failed to view " + jobid + "!<br><br>" + e.getMessage() +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   } catch (InterruptedException e) {
     String errMsg = " Interrupted while trying to access " + jobid +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   }
 }
 return myJob;
 {code}
 In the above snippet, note that if user == null, which can happen when the 
 user is not HTTP-authenticated (the value comes from request.getRemoteUser()), 
 the view remains visible: the default is true and we never toggle it to 
 false for the user == null case.
 Ideally the default of the view job ACL should be false, or we need an else 
 clause that sets the view rights to false when the user ID cannot be 
 found.
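
A minimal sketch of the "default to false" option suggested above. It is not Hadoop code: the class ViewAccessSketch, its check method, and the allowedViewers set (standing in for the job's view ACL) are hypothetical names used only for illustration.

```java
import java.util.Set;

/** Hypothetical stand-in for JSPUtil.JobWithViewAccessCheck; not Hadoop code. */
final class ViewAccessSketch {
    // Deny by default, so any path that skips the check stays closed.
    private boolean viewAllowed = false;

    boolean isViewAllowed() {
        return viewAllowed;
    }

    /**
     * Grants view access only when a positive decision can be made.
     * 'allowedViewers' is an assumed, simplified model of the view ACL.
     */
    static ViewAccessSketch check(String user, boolean aclsEnabled,
                                  Set<String> allowedViewers) {
        ViewAccessSketch result = new ViewAccessSketch();
        if (!aclsEnabled) {
            // ACLs are switched off: everyone may view, as before.
            result.viewAllowed = true;
        } else if (user != null && allowedViewers.contains(user)) {
            // Authenticated user who appears in the ACL.
            result.viewAllowed = true;
        }
        // user == null with ACLs enabled falls through and stays denied.
        return result;
    }
}
```

With this shape, an unauthenticated request (user == null) on an ACL-enabled cluster is denied without needing an explicit else clause.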

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4317:


Attachment: (was: MR-4317.patch)

 Job view ACL checks are too permissive
 --

 Key: MAPREDUCE-4317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
 Attachments: MR-4317.patch


 The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has 
 the following internal member:
 {code}private boolean isViewAllowed = true;{code}
 Note that it defaults to true.
 Now, the method that sets the proper view-allowed rights has:
 {code}
 if (user != null && job != null && jt.areACLsEnabled()) {
   final UserGroupInformation ugi =
     UserGroupInformation.createRemoteUser(user);
   try {
     ugi.doAs(new PrivilegedExceptionAction<Void>() {
       public Void run() throws IOException, ServletException {
         // checks job view permission
         jt.getACLsManager().checkAccess(job, ugi,
             Operation.VIEW_JOB_DETAILS);
         return null;
       }
     });
   } catch (AccessControlException e) {
     String errMsg = "User " + ugi.getShortUserName() +
         " failed to view " + jobid + "!<br><br>" + e.getMessage() +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   } catch (InterruptedException e) {
     String errMsg = " Interrupted while trying to access " + jobid +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   }
 }
 return myJob;
 {code}
 In the above snippet, note that if user == null, which can happen when the 
 user is not HTTP-authenticated (the value comes from request.getRemoteUser()), 
 the view remains visible: the default is true and we never toggle it to 
 false for the user == null case.
 Ideally the default of the view job ACL should be false, or we need an else 
 clause that sets the view rights to false when the user ID cannot be 
 found.





[jira] [Updated] (MAPREDUCE-4332) Add a yarn-client module

2012-06-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-4332:
---

Attachment: MAPREDUCE-4332-20120625.txt

 Add a yarn-client module
 

 Key: MAPREDUCE-4332
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4332
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.0.0-alpha
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4332-20120621-with-common-changes.txt, 
 MAPREDUCE-4332-20120621.txt, MAPREDUCE-4332-20120622.txt, 
 MAPREDUCE-4332-20120625.txt


 I see that we are duplicating (some) code for talking to RM via client API. 
 In this light, a yarn-client module will be useful so that clients of all 
 frameworks can use/extend it.
 And that same module can be the destination for all the YARN's command line 
 tools.





[jira] [Updated] (MAPREDUCE-4332) Add a yarn-client module

2012-06-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-4332:
---

Status: Patch Available  (was: Open)

 Add a yarn-client module
 

 Key: MAPREDUCE-4332
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4332
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.0.0-alpha
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4332-20120621-with-common-changes.txt, 
 MAPREDUCE-4332-20120621.txt, MAPREDUCE-4332-20120622.txt, 
 MAPREDUCE-4332-20120625.txt


 I see that we are duplicating (some) code for talking to RM via client API. 
 In this light, a yarn-client module will be useful so that clients of all 
 frameworks can use/extend it.
 And that same module can be the destination for all the YARN's command line 
 tools.





[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-06-25 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400894#comment-13400894
 ] 

Tom White commented on MAPREDUCE-3837:
--

Mayank - thanks for pointing that out. I just tried and they fail for me on the 
latest branch-1 code too. We do need tests for job tracker recovery though, so 
they should be fixed to ensure that the code in this patch is tested and 
doesn't regress, don't you think?

 Hadoop 22 Job tracker is not able to recover job in case of crash and after 
 that no user can submit job.
 

 Key: MAPREDUCE-3837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Fix For: 0.24.0, 0.22.1, 0.23.2

 Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
 PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837.patch, 
 PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch


 If the job tracker crashes while jobs are running, and the job tracker's 
 property mapreduce.jobtracker.restart.recover is true, then it should 
 recover those jobs on restart.
 However, the current behavior is as follows:
 the jobtracker tries to restore the jobs but cannot. After that, the 
 jobtracker closes its handle to HDFS and nobody else can submit a job. 
 Thanks,
 Mayank





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400898#comment-13400898
 ] 

Hadoop QA commented on MAPREDUCE-4317:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533376/MR-4317.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2509//console

This message is automatically generated.

 Job view ACL checks are too permissive
 --

 Key: MAPREDUCE-4317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
 Attachments: MR-4317.patch


 The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has 
 the following internal member:
 {code}private boolean isViewAllowed = true;{code}
 Note that it defaults to true.
 Now, the method that sets the proper view-allowed rights has:
 {code}
 if (user != null && job != null && jt.areACLsEnabled()) {
   final UserGroupInformation ugi =
     UserGroupInformation.createRemoteUser(user);
   try {
     ugi.doAs(new PrivilegedExceptionAction<Void>() {
       public Void run() throws IOException, ServletException {
         // checks job view permission
         jt.getACLsManager().checkAccess(job, ugi,
             Operation.VIEW_JOB_DETAILS);
         return null;
       }
     });
   } catch (AccessControlException e) {
     String errMsg = "User " + ugi.getShortUserName() +
         " failed to view " + jobid + "!<br><br>" + e.getMessage() +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   } catch (InterruptedException e) {
     String errMsg = " Interrupted while trying to access " + jobid +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   }
 }
 return myJob;
 {code}
 In the above snippet, note that if user == null, which can happen when the 
 user is not HTTP-authenticated (the value comes from request.getRemoteUser()), 
 the view remains visible: the default is true and we never toggle it to 
 false for the user == null case.
 Ideally the default of the view job ACL should be false, or we need an else 
 clause that sets the view rights to false when the user ID cannot be 
 found.





[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-06-25 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400901#comment-13400901
 ] 

Mayank Bansal commented on MAPREDUCE-3837:
--

Agreed. I'm working on it and will update soon.

Thanks,
Mayank

 Hadoop 22 Job tracker is not able to recover job in case of crash and after 
 that no user can submit job.
 

 Key: MAPREDUCE-3837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Fix For: 0.24.0, 0.22.1, 0.23.2

 Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
 PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837.patch, 
 PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch


 If the job tracker crashes while jobs are running, and the job tracker's 
 property mapreduce.jobtracker.restart.recover is true, then it should 
 recover those jobs on restart.
 However, the current behavior is as follows:
 the jobtracker tries to restore the jobs but cannot. After that, the 
 jobtracker closes its handle to HDFS and nobody else can submit a job. 
 Thanks,
 Mayank





[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

2012-06-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400918#comment-13400918
 ] 

Arun C Murthy commented on MAPREDUCE-3837:
--

Mayank, as we briefly discussed you'll need to fix the re-submit to read 
jobtokens from HDFS and pass them along (i.e. Credentials object) to the 
submitJob api. Sorry, I've been traveling a lot and missed commenting here, my 
bad.

Other nits:

# You've removed the call to JobClient.isJobDirValid which is dangerous. Since 
the contents have changed in hadoop-1 post security, please add a private 
isJobDirValid method to the JT and use it. This method should check for jobInfo 
file on HDFS (JobTracker.JOB_INFO_FILE) and the jobTokens file 
(TokenCache.JOB_TOKEN_HDFS_FILE).
# Also, since we only care about jobIds now for JT recovery, it's better to add 
a Set<JobId> jobIdsToRecover rather than rely on Set<JobInfo> jobsToRecover. 
This way we can avoid all the unnecessary translations b/w o.a.h.mapred.JobId 
and o.a.h.mapreduce.JobId.
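
The validity check requested in the first nit could look roughly like the sketch below. It uses java.nio.file on a local directory as a stand-in for HDFS, and the class name JobDirCheckSketch and the two file-name values are assumptions for illustration; the real method would use the JobTracker.JOB_INFO_FILE and TokenCache.JOB_TOKEN_HDFS_FILE constants with Hadoop's FileSystem API.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Local-filesystem illustration only; not the JobTracker implementation. */
final class JobDirCheckSketch {
    // Assumed values standing in for JobTracker.JOB_INFO_FILE and
    // TokenCache.JOB_TOKEN_HDFS_FILE.
    static final String JOB_INFO_FILE = "job-info";
    static final String JOB_TOKEN_FILE = "jobToken";

    /** A job directory is recoverable only if both files are present. */
    static boolean isJobDirValid(Path jobDir) {
        return Files.isDirectory(jobDir)
            && Files.isRegularFile(jobDir.resolve(JOB_INFO_FILE))
            && Files.isRegularFile(jobDir.resolve(JOB_TOKEN_FILE));
    }
}
```

Requiring both files means a job whose tokens were never written is skipped during recovery instead of being resubmitted without credentials.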

 Hadoop 22 Job tracker is not able to recover job in case of crash and after 
 that no user can submit job.
 

 Key: MAPREDUCE-3837
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Fix For: 0.24.0, 0.22.1, 0.23.2

 Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
 PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837.patch, 
 PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch


 If the job tracker crashes while jobs are running, and the job tracker's 
 property mapreduce.jobtracker.restart.recover is true, then it should 
 recover those jobs on restart.
 However, the current behavior is as follows:
 the jobtracker tries to restore the jobs but cannot. After that, the 
 jobtracker closes its handle to HDFS and nobody else can submit a job. 
 Thanks,
 Mayank





[jira] [Commented] (MAPREDUCE-2289) Permissions race can make getStagingDir fail on local filesystem

2012-06-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400926#comment-13400926
 ] 

Alejandro Abdelnur commented on MAPREDUCE-2289:
---

+1

 Permissions race can make getStagingDir fail on local filesystem
 

 Key: MAPREDUCE-2289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2289
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Fix For: 0.22.1

 Attachments: MAPREDUCE-2289_branch-1.0.patch, 
 MAPREDUCE-2289_trunk.patch, mapreduce-2289.txt


 I've observed the following race condition in TestFairSchedulerSystem which 
 uses a MiniMRCluster on top of RawLocalFileSystem:
 - two threads call getStagingDir at the same time
 - Thread A checks fs.exists(stagingArea) and sees false
 -- Calls mkdirs(stagingArea, JOB_DIR_PERMISSIONS)
 --- mkdirs calls the Java mkdir API which makes the file with umask-based 
 permissions
 - Thread B runs, checks fs.exists(stagingArea) and sees true
 -- checks permissions, sees the default permissions, and throws IOE
 - Thread A resumes and sets correct permissions
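
One way to sidestep the check-then-act window in the sequence above is to make the operation idempotent: create the directory if needed, then always (re)apply the intended permissions instead of failing when default permissions are observed. The sketch below assumes a local POSIX filesystem and uses java.nio.file rather than Hadoop's FileSystem API; StagingDirSketch and ensureStagingDir are hypothetical names, the permission string is an assumed value, and this is not necessarily what the committed patch does. It also omits the ownership check a real implementation would want.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Illustration of an idempotent create-then-repair step; not Hadoop code. */
final class StagingDirSketch {
    // Assumed value for illustration: owner-only access.
    static final Set<PosixFilePermission> JOB_DIR_PERMISSIONS =
        PosixFilePermissions.fromString("rwx------");

    static Path ensureStagingDir(Path stagingArea) throws IOException {
        // Does nothing if the directory already exists, so two threads
        // racing here do not fail each other.
        Files.createDirectories(stagingArea);
        // Always apply the intended permissions. A thread that finds the
        // directory with umask-based defaults repairs it rather than
        // throwing.
        Files.setPosixFilePermissions(stagingArea, JOB_DIR_PERMISSIONS);
        return stagingArea;
    }
}
```

Whichever thread runs second simply repeats the permission update, so the interleaving of Thread A and Thread B no longer matters for correctness.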





[jira] [Updated] (MAPREDUCE-2289) Permissions race can make getStagingDir fail on local filesystem

2012-06-25 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-2289:
--

   Resolution: Fixed
Fix Version/s: (was: 0.22.1)
   2.0.1-alpha
   1.1.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Ahmed. Committed to trunk, branch-1 & branch-2.

 Permissions race can make getStagingDir fail on local filesystem
 

 Key: MAPREDUCE-2289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2289
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MAPREDUCE-2289_branch-1.0.patch, 
 MAPREDUCE-2289_trunk.patch, mapreduce-2289.txt


 I've observed the following race condition in TestFairSchedulerSystem which 
 uses a MiniMRCluster on top of RawLocalFileSystem:
 - two threads call getStagingDir at the same time
 - Thread A checks fs.exists(stagingArea) and sees false
 -- Calls mkdirs(stagingArea, JOB_DIR_PERMISSIONS)
 --- mkdirs calls the Java mkdir API which makes the file with umask-based 
 permissions
 - Thread B runs, checks fs.exists(stagingArea) and sees true
 -- checks permissions, sees the default permissions, and throws IOE
 - Thread A resumes and sets correct permissions





[jira] [Commented] (MAPREDUCE-4355) Add JobStatus getJobStatus(JobID) to JobClient.

2012-06-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400938#comment-13400938
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4355:
---

+1

 Add JobStatus getJobStatus(JobID) to JobClient.
 ---

 Key: MAPREDUCE-4355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch


 To read the start-time of a particular job, one should not need to 
 getAllJobs() and iterate through them.
 getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
 Hence, we need to add getJobStatus(JobID) to the API.
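
The difference between the two access patterns can be illustrated with a toy in-memory model. Nothing here is the Hadoop API: JobStatusLookupSketch, Status, register, and findByScanning are hypothetical names, and job IDs are plain strings for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy in-memory model; none of these types are the Hadoop classes. */
final class JobStatusLookupSketch {
    /** Minimal stand-in for a job status record. */
    static final class Status {
        final String jobId;
        final long startTime;

        Status(String jobId, long startTime) {
            this.jobId = jobId;
            this.startTime = startTime;
        }
    }

    private final Map<String, Status> jobs = new LinkedHashMap<>();

    void register(String jobId, long startTime) {
        jobs.put(jobId, new Status(jobId, startTime));
    }

    /** The workaround: walk every job and pick out the one we want. */
    Status findByScanning(String jobId) {
        for (Status s : jobs.values()) {
            if (s.jobId.equals(jobId)) {
                return s;
            }
        }
        return null;
    }

    /** The requested shape: look the status up directly by ID. */
    Status getJobStatus(String jobId) {
        return jobs.get(jobId);
    }
}
```

Both methods return the same record; the direct lookup just avoids fetching and iterating over every job to read one start time.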





[jira] [Commented] (MAPREDUCE-2289) Permissions race can make getStagingDir fail on local filesystem

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400947#comment-13400947
 ] 

Hudson commented on MAPREDUCE-2289:
---

Integrated in Hadoop-Common-trunk-Commit #2386 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2386/])
MAPREDUCE-2289. Permissions race can make getStagingDir fail on local 
filesystem (ahmed via tucu) (Revision 1353750)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353750
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmissionFiles.java


 Permissions race can make getStagingDir fail on local filesystem
 

 Key: MAPREDUCE-2289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2289
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MAPREDUCE-2289_branch-1.0.patch, 
 MAPREDUCE-2289_trunk.patch, mapreduce-2289.txt


 I've observed the following race condition in TestFairSchedulerSystem which 
 uses a MiniMRCluster on top of RawLocalFileSystem:
 - two threads call getStagingDir at the same time
 - Thread A checks fs.exists(stagingArea) and sees false
 -- Calls mkdirs(stagingArea, JOB_DIR_PERMISSIONS)
 --- mkdirs calls the Java mkdir API which makes the file with umask-based 
 permissions
 - Thread B runs, checks fs.exists(stagingArea) and sees true
 -- checks permissions, sees the default permissions, and throws IOE
 - Thread A resumes and sets correct permissions





[jira] [Updated] (MAPREDUCE-4355) Add JobStatus getJobStatus(JobID) to JobClient.

2012-06-25 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated MAPREDUCE-4355:
--

   Resolution: Fixed
Fix Version/s: 2.0.1-alpha
   1.1.0
 Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)
   Status: Resolved  (was: Patch Available)

Thanks Karthik. Committed to trunk, branch-1 & branch-2.

 Add JobStatus getJobStatus(JobID) to JobClient.
 ---

 Key: MAPREDUCE-4355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch


 To read the start-time of a particular job, one should not need to 
 getAllJobs() and iterate through them.
 getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
 Hence, we need to add getJobStatus(JobID) to the API.





[jira] [Commented] (MAPREDUCE-4332) Add a yarn-client module

2012-06-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400954#comment-13400954
 ] 

Hadoop QA commented on MAPREDUCE-4332:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12533377/MAPREDUCE-4332-20120625.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-client.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2508//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2508//console

This message is automatically generated.

 Add a yarn-client module
 

 Key: MAPREDUCE-4332
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4332
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 2.0.0-alpha
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4332-20120621-with-common-changes.txt, 
 MAPREDUCE-4332-20120621.txt, MAPREDUCE-4332-20120622.txt, 
 MAPREDUCE-4332-20120625.txt


 I see that we are duplicating (some) code for talking to RM via client API. 
 In this light, a yarn-client module will be useful so that clients of all 
 frameworks can use/extend it.
 And that same module can be the destination for all the YARN's command line 
 tools.





[jira] [Commented] (MAPREDUCE-2454) Allow external sorter plugin for MR

2012-06-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13400970#comment-13400970
 ] 

Hadoop QA commented on MAPREDUCE-2454:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533362/mapreduce-2454.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
  
org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
  org.apache.hadoop.mapred.TestReduceFetch

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2507//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2507//console

This message is automatically generated.

 Allow external sorter plugin for MR
 ---

 Key: MAPREDUCE-2454
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Mariappan Asokan
Priority: Minor
  Labels: features, performance, plugin, sort
 Attachments: HadoopSortPlugin.pdf, KeyValueIterator.java, 
 MR-2454-trunkPatchPreview.gz, MapOutputSorter.java, 
 MapOutputSorterAbstract.java, ReduceInputSorter.java, mapreduce-2454.patch, 
 mr-2454-on-mr-279-build82.patch.gz


 Define interfaces and some abstract classes in the Hadoop framework to 
 facilitate external sorter plugins both on the Map and Reduce sides.





[jira] [Commented] (MAPREDUCE-4355) Add JobStatus getJobStatus(JobID) to JobClient.

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400971#comment-13400971
 ] 

Hudson commented on MAPREDUCE-4355:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2456 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2456/])
MAPREDUCE-4355. Add JobStatus getJobStatus(JobID) to JobClient. (kkambatl 
via tucu) (Revision 1353757)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353757
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java


 Add JobStatus getJobStatus(JobID) to JobClient.
 ---

 Key: MAPREDUCE-4355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch


 To read the start time of a particular job, one should not need to call 
 getAllJobs() and iterate through the results.
 getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
 Hence, we need to add getJobStatus(JobID) to the API.
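
The motivation above can be illustrated with a small self-contained sketch. The JobStatusSketch class, its Status holder and the in-memory map are hypothetical stand-ins, not Hadoop's JobClient or JobStatus; they only contrast scanning every job with a direct lookup by ID.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a job tracker; not Hadoop's JobClient API.
final class JobStatusSketch {
  // Minimal status holder: only the fields needed for this illustration.
  static final class Status {
    final String jobId;
    final long startTime;

    Status(String jobId, long startTime) {
      this.jobId = jobId;
      this.startTime = startTime;
    }
  }

  private final Map<String, Status> jobs = new LinkedHashMap<>();

  void submit(String jobId, long startTime) {
    jobs.put(jobId, new Status(jobId, startTime));
  }

  // Old approach: walk over every status and pick out the one we want.
  long startTimeByScan(String jobId) {
    for (Status s : jobs.values()) {
      if (s.jobId.equals(jobId)) {
        return s.startTime;
      }
    }
    return -1L; // not found
  }

  // Proposed approach: look the status up directly by job ID.
  Status getJobStatus(String jobId) {
    return jobs.get(jobId);
  }

  public static void main(String[] args) {
    JobStatusSketch tracker = new JobStatusSketch();
    tracker.submit("job_1", 1000L);
    tracker.submit("job_2", 2000L);
    System.out.println(tracker.getJobStatus("job_2").startTime);
  }
}
```

The direct lookup returns the whole status object, so callers can read the start time without first fetching and iterating over all jobs.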





[jira] [Comment Edited] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400986#comment-13400986
 ] 

Alejandro Abdelnur edited comment on MAPREDUCE-4317 at 6/25/12 10:36 PM:
-

Karthik, why the check *if (user == null || user.equals("null")) {*? Checking 
for *user == null* should be enough, no? Else you are avoiding the user named 
'*null*'.

  was (Author: tucu00):
Karthik, why the check *if (user == null || user.equals("null")) {*? 
Checking for *user == null* should be enough, no? Else you are avoiding the user 
name *null*.
  
 Job view ACL checks are too permissive
 --

 Key: MAPREDUCE-4317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
 Attachments: MR-4317.patch


 The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has 
 the following internal member:
 {code}private boolean isViewAllowed = true;{code}
 Note that it's true.
 Now, the method that sets proper view-allowed rights has:
 {code}
 if (user != null && job != null && jt.areACLsEnabled()) {
   final UserGroupInformation ugi =
       UserGroupInformation.createRemoteUser(user);
   try {
     ugi.doAs(new PrivilegedExceptionAction<Void>() {
       public Void run() throws IOException, ServletException {
         // checks job view permission
         jt.getACLsManager().checkAccess(job, ugi,
             Operation.VIEW_JOB_DETAILS);
         return null;
       }
     });
   } catch (AccessControlException e) {
     String errMsg = "User " + ugi.getShortUserName() +
         " failed to view " + jobid + "!<br><br>" + e.getMessage() +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   } catch (InterruptedException e) {
     String errMsg = " Interrupted while trying to access " + jobid +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   }
 }
 return myJob;
 {code}
 In the above snippet, note that if user == null, which can happen when the 
 user is not HTTP-authenticated (the value is obtained via 
 request.getRemoteUser()), the view remains visible, since the default is true 
 and we never toggle it to false for the user == null case.
 Ideally the default of the view job ACL must be false, or we need an else 
 clause that sets the view rights to false in case of a failure to find the 
 user ID.
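
The deny-by-default idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: ViewAccessSketch and AclChecker are hypothetical names, not Hadoop's JSPUtil or ACLsManager, and the rule that disabled ACLs allow everyone is an assumption made for the example.

```java
// Hypothetical sketch of a deny-by-default view check; not Hadoop's JSPUtil.
final class ViewAccessSketch {
  // Stand-in for the ACL manager's permission check (assumed interface).
  interface AclChecker {
    boolean canView(String user, String jobId);
  }

  // Access is granted only after an explicit check succeeds, so an
  // unauthenticated (null) user is never let through by a default value.
  static boolean isViewAllowed(String user, String jobId,
      boolean aclsEnabled, AclChecker acls) {
    if (!aclsEnabled) {
      return true; // assumption: with ACLs disabled, everyone may view
    }
    if (user == null) {
      return false; // no authenticated user: deny instead of defaulting to true
    }
    return acls.canView(user, jobId);
  }

  public static void main(String[] args) {
    AclChecker ownerOnly = (user, jobId) -> user.equals("alice");
    System.out.println(isViewAllowed(null, "job_1", true, ownerOnly));
    System.out.println(isViewAllowed("alice", "job_1", true, ownerOnly));
  }
}
```

Returning the decision from a single method, rather than flipping a field that starts out true, means a code path that is never reached cannot accidentally grant access.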





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400986#comment-13400986
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4317:
---

Karthik, why the check *if (user == null || user.equals("null")) {*? Checking 
for *user == null* should be enough, no? Else you are avoiding the user name *null*.





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400988#comment-13400988
 ] 

Karthik Kambatla commented on MAPREDUCE-4317:
-

Alejandro, 

The user string is read from the HTTP request. Both when user == null and when 
user == "null", the request reads it as "null". Hence the check for both. And 
I agree that a user named "null" is forcibly disallowed.





[jira] [Commented] (MAPREDUCE-4355) Add JobStatus getJobStatus(JobID) to JobClient.

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400992#comment-13400992
 ] 

Hudson commented on MAPREDUCE-4355:
---

Integrated in Hadoop-Common-trunk-Commit #2387 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2387/])
MAPREDUCE-4355. Add JobStatus getJobStatus(JobID) to JobClient. (kkambatl 
via tucu) (Revision 1353757)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353757
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java






[jira] [Commented] (MAPREDUCE-2289) Permissions race can make getStagingDir fail on local filesystem

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401001#comment-13401001
 ] 

Hudson commented on MAPREDUCE-2289:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2404 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2404/])
MAPREDUCE-2289. Permissions race can make getStagingDir fail on local 
filesystem (ahmed via tucu) (Revision 1353750)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353750
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmissionFiles.java


 Permissions race can make getStagingDir fail on local filesystem
 

 Key: MAPREDUCE-2289
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2289
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: job submission
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Ahmed Radwan
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MAPREDUCE-2289_branch-1.0.patch, 
 MAPREDUCE-2289_trunk.patch, mapreduce-2289.txt


 I've observed the following race condition in TestFairSchedulerSystem which 
 uses a MiniMRCluster on top of RawLocalFileSystem:
 - two threads call getStagingDir at the same time
 - Thread A checks fs.exists(stagingArea) and sees false
 -- Calls mkdirs(stagingArea, JOB_DIR_PERMISSIONS)
 --- mkdirs calls the Java mkdir API which makes the file with umask-based 
 permissions
 - Thread B runs, checks fs.exists(stagingArea) and sees true
 -- checks permissions, sees the default permissions, and throws IOE
 - Thread A resumes and sets correct permissions
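
One way to avoid the window between creating the directory and fixing its permissions is to pass the permissions to the create call itself. The sketch below uses the standard java.nio.file API on a POSIX file system; StagingDirSketch and its method are hypothetical and do not reflect JobSubmissionFiles or the committed patch. It also assumes the process umask does not strip owner bits.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical sketch for a POSIX file system; not the Hadoop code.
final class StagingDirSketch {
  static final Set<PosixFilePermission> JOB_DIR_PERMISSIONS =
      PosixFilePermissions.fromString("rwx------");

  // Creates the directory together with its permissions instead of
  // "exists? -> mkdir -> chmod", and tolerates losing the creation race.
  static Path getStagingDir(Path stagingArea) throws IOException {
    try {
      Files.createDirectory(stagingArea,
          PosixFilePermissions.asFileAttribute(JOB_DIR_PERMISSIONS));
    } catch (FileAlreadyExistsException e) {
      // Another caller created it first; fall through to the check below.
    }
    Set<PosixFilePermission> actual =
        Files.getPosixFilePermissions(stagingArea);
    if (!actual.equals(JOB_DIR_PERMISSIONS)) {
      throw new IOException("Unexpected permissions on " + stagingArea
          + ": " + PosixFilePermissions.toString(actual));
    }
    return stagingArea;
  }

  public static void main(String[] args) throws IOException {
    Path parent = Files.createTempDirectory("staging-sketch");
    Path dir = getStagingDir(parent.resolve("staging"));
    System.out.println(PosixFilePermissions.toString(
        Files.getPosixFilePermissions(dir)));
  }
}
```

Because the directory never exists with the umask-based default permissions, a second caller that sees it already present should also see the intended mode.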





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401004#comment-13401004
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4317:
---

But the check for user '*null*' was not there; why are you introducing it? If 
the user is undefined it will be null; when do you expect the user to be the 
string '*null*'? Checking for *user == null* should be enough, no?





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401015#comment-13401015
 ] 

Karthik Kambatla commented on MAPREDUCE-4317:
-

Let me explain in more detail:

JSPUtil.checkAccessAndGetJob() takes an HttpServletRequest as input. 
HttpServletRequest.getRemoteUser() returns null if the user is not 
authenticated. For this case, a check for user == null should suffice.

However, I have noticed while testing 
(TestWebUIAuthorization.validateTaskGraphServletAccess() in the patch) that 
when we build the HttpServletRequest with user == null, the corresponding URL 
captures the user as "null". For such cases, where the client mistakenly 
captures the user as "null", we need to check user.equals("null") as well. 

Do you think we should change the way the client builds the HttpServletRequest?

Many thanks.





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401023#comment-13401023
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4317:
---

What do you mean by +the client mistakenly captures the user as "null"+?





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401032#comment-13401032
 ] 

Karthik Kambatla commented on MAPREDUCE-4317:
-

In the following code snippet from TestWebUIAuthorization, new URL() takes in 
userName. When the userName is null, the URL sees it as "null". In 
checkAccessAndGetJob, we read the username from this string and get "null" and 
not null.

The test fails when I check only for user == null.

{code}
  static int getHttpStatusCode(String urlstring, String userName,
      String method) throws IOException {
    LOG.info("Accessing " + urlstring + " as user " + userName);
    URL url = new URL(urlstring + "&user.name=" + userName);
    HttpURLConnection connection = (HttpURLConnection)url.openConnection();
    connection.setRequestMethod(method);
    if (method.equals("POST")) {
      String encodedData = "action=kill&user.name=" + userName;
      connection.setRequestProperty("Content-Type",
          "application/x-www-form-urlencoded");
      connection.setRequestProperty("Content-Length",
          Integer.toString(encodedData.length()));
      connection.setDoOutput(true);

      OutputStream os = connection.getOutputStream();
      os.write(encodedData.getBytes());
    }
    connection.connect();

    return connection.getResponseCode();
  }
{code}
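
The behavior described above comes from Java string concatenation, which renders a null reference as the four characters "null". A minimal sketch follows; NullUserSketch and both helper methods are hypothetical names used only for illustration, and omitting the parameter is just one possible way to handle a missing user.

```java
// Hypothetical helper names; only the concatenation behavior is standard Java.
final class NullUserSketch {
  // Mirrors the test helper: a null reference is appended as the text "null".
  static String naiveQuery(String userName) {
    return "user.name=" + userName;
  }

  // One possible alternative: leave the parameter out when there is no user.
  static String safeQuery(String userName) {
    return userName == null ? "" : "user.name=" + userName;
  }

  public static void main(String[] args) {
    System.out.println(naiveQuery(null)); // prints "user.name=null"
    System.out.println(safeQuery("alice")); // prints "user.name=alice"
  }
}
```

With the naive form, the server cannot tell an absent user apart from a user literally named "null", which is the ambiguity discussed in this thread.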





[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-25 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401039#comment-13401039
 ] 

Ahmed Radwan commented on MAPREDUCE-4346:
-

Many thanks, Tucu, for the review! I have updated the patch per your comments:

- Changed JobStatus.setRetired(..) to package-private.
- Added a descriptive comment for the code handling the incompatibility case.
- For the intended logic of status and retired: I agree, it can be confusing. I 
have added more description in the method comments in both the JobClient and 
JobSubmissionProtocol to clarify the exact semantics of the method.

Please let me know if you have any further comments.

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
 MAPREDUCE-4346_rev3.patch


 The current implementation for JobTracker.getAllJobs() returns all submitted 
 jobs in any state, in addition to retired jobs. This list can be long and 
 represents an unneeded overhead especially in the case of clients only 
 interested in jobs in specific state(s). 
 It is beneficial to include a refined version where only jobs having specific 
 statuses are returned and retired jobs are optional to include. 
 I'll be uploading an initial patch momentarily.
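
The refined listing described above might look roughly like the sketch below. All names here (JobFilterSketch, State, Job, getJobs) are hypothetical and chosen for illustration; they are not the signatures added by the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the JobTracker or JobClient API from the patch.
final class JobFilterSketch {
  enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

  static final class Job {
    final String id;
    final State state;
    final boolean retired;

    Job(String id, State state, boolean retired) {
      this.id = id;
      this.state = state;
      this.retired = retired;
    }
  }

  // Returns only jobs whose state is in `wanted`; retired jobs are
  // included only when includeRetired is true.
  static List<Job> getJobs(List<Job> all, Set<State> wanted,
      boolean includeRetired) {
    List<Job> result = new ArrayList<>();
    for (Job job : all) {
      if (wanted.contains(job.state) && (includeRetired || !job.retired)) {
        result.add(job);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Job> all = Arrays.asList(
        new Job("job_1", State.RUNNING, false),
        new Job("job_2", State.SUCCEEDED, true),
        new Job("job_3", State.SUCCEEDED, false));
    for (Job job : getJobs(all, EnumSet.of(State.SUCCEEDED), false)) {
      System.out.println(job.id);
    }
  }
}
```

Filtering where the full list lives means a client interested in one state does not have to receive every job just to discard most of them.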





[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-25 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4346:


Attachment: MAPREDUCE-4346_rev3.patch





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401040#comment-13401040
 ] 

Karthik Kambatla commented on MAPREDUCE-4317:
-

Makes sense. Didn't realize it. I shall modify the test to not include a 
userName and see if I can omit the check for user.equals("null"). Thanks for 
the help.

 Job view ACL checks are too permissive
 --

 Key: MAPREDUCE-4317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
 Attachments: MR-4317.patch


 The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has 
 the following internal member:
 {code}private boolean isViewAllowed = true;{code}
 Note that it's true.
 Now, the method that sets proper view-allowed rights has:
 {code}
 if (user != null && job != null && jt.areACLsEnabled()) {
   final UserGroupInformation ugi =
       UserGroupInformation.createRemoteUser(user);
   try {
     ugi.doAs(new PrivilegedExceptionAction<Void>() {
       public Void run() throws IOException, ServletException {
         // checks job view permission
         jt.getACLsManager().checkAccess(job, ugi,
             Operation.VIEW_JOB_DETAILS);
         return null;
       }
     });
   } catch (AccessControlException e) {
     String errMsg = "User " + ugi.getShortUserName() +
         " failed to view " + jobid + "!<br><br>" + e.getMessage() +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   } catch (InterruptedException e) {
     String errMsg = " Interrupted while trying to access " + jobid +
         "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
     JSPUtil.setErrorAndForward(errMsg, request, response);
     myJob.setViewAccess(false);
   }
 }
 return myJob;
 {code}
 In the above snippet, if user == null, which can happen if the user is not 
 HTTP-authenticated (as it's obtained via request.getRemoteUser()), the view 
 stays visible, since the default is true and we don't toggle the view to 
 false for the user == null case.
 Ideally the default of the view job ACL must be false, or we need an else 
 clause that sets the view rights to false in case of a failure to find the 
 user ID.
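
A minimal sketch of the deny-by-default idea suggested above, assuming a simplified model: ViewAccessSketch and its allowed-viewers set are hypothetical stand-ins for JSPUtil and the JobTracker's ACL manager, not the committed fix.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the view check; not Hadoop code. */
final class ViewAccessSketch {
    private final boolean aclsEnabled;
    /** Users permitted to view the job; stands in for the real ACL lookup. */
    private final Set<String> allowedViewers;

    ViewAccessSketch(boolean aclsEnabled, String... allowedViewers) {
        this.aclsEnabled = aclsEnabled;
        this.allowedViewers = new HashSet<String>(Arrays.asList(allowedViewers));
    }

    /**
     * Decides whether the job view may be shown. With ACLs enabled the
     * answer is false unless a positive check succeeds, so a null
     * (unauthenticated) user is denied instead of inheriting a true default.
     */
    boolean isViewAllowed(String user) {
        if (!aclsEnabled) {
            return true;  // no ACL enforcement configured
        }
        if (user == null) {
            return false; // unauthenticated: explicit deny
        }
        return allowedViewers.contains(user);
    }
}
```

The point of the design is that every path through the method makes an explicit decision, so a skipped branch can no longer leave the view open by accident.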





[jira] [Created] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread John Gordon (JIRA)
John Gordon created MAPREDUCE-4368:
--

 Summary: TaskRunner fails to start jars when the java.library.path 
contains a quoted path with embedded spaces
 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
set PATH=%PATH%;"C:\this memorable place".
Reporter: John Gordon
 Fix For: 1-win


TaskRunner splits arguments by space before it adds them back to the vargs 
list, so it loses all context of quote-escaped strings with embedded spaces.  
This gets fixed up later by wrapping all arguments with "" -- so you get 
something like java "-Dopt=value".  This is problematic for paths with 
embedded spaces, where we end up creating -Dopt=first part last part.  
To java, the jar being run is last part.  So with the environment above, you 
will see NoClassDefFoundError: memorable and the jar will fail to start.  In 
this particular case, we know that java.library.path contains paths and the 
tests often use %PATH% to seed this, so the fix is to remove embedded quotes in 
listed path elements because we know the aggregate will be quoted when the JVM 
is started.
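
The quote removal described above could look roughly like the following sketch. It is not the attached TaskRunner.patch: LibraryPathSketch and stripEmbeddedQuotes are hypothetical names, and the sketch assumes the separator never appears inside a quoted element.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; not taken from TaskRunner.java. */
final class LibraryPathSketch {
    /**
     * Removes double quotes from every element of a path list and rejoins
     * the elements with the same separator. Empty elements are dropped.
     * Assumes the separator does not occur inside a quoted element.
     */
    static String stripEmbeddedQuotes(String pathList, String separator) {
        StringBuilder out = new StringBuilder();
        for (String element : pathList.split(Pattern.quote(separator))) {
            String cleaned = element.replace("\"", "");
            if (cleaned.isEmpty()) {
                continue; // skip empty entries such as a leading separator
            }
            if (out.length() > 0) {
                out.append(separator);
            }
            out.append(cleaned);
        }
        return out.toString();
    }
}
```

With the environment from this issue, C:\bin;"C:\this memorable place" would become C:\bin;C:\this memorable place, which is safe to pass as a single -Djava.library.path value once the whole argument is quoted at JVM launch.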





[jira] [Updated] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread John Gordon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gordon updated MAPREDUCE-4368:
---

   Fix Version/s: (was: 1-win)
  Labels: patch  (was: )
Target Version/s: 1-win
  Status: Patch Available  (was: Open)

Remove embedded quotes specifically for java.library.path and refactor double 
for loop to reduce overhead.

 TaskRunner fails to start jars when the java.library.path contains a quoted 
 path with embedded spaces
 -

 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
 set PATH=%PATH%;"C:\this memorable place".
Reporter: John Gordon
  Labels: patch
   Original Estimate: 24h
  Remaining Estimate: 24h





[jira] [Updated] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread John Gordon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gordon updated MAPREDUCE-4368:
---

Attachment: TaskRunner.patch

Patch for TaskRunner.java in 1-win

 TaskRunner fails to start jars when the java.library.path contains a quoted 
 path with embedded spaces
 -

 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
 set PATH=%PATH%;"C:\this memorable place".
Reporter: John Gordon
  Labels: patch
 Attachments: TaskRunner.patch

   Original Estimate: 24h
  Remaining Estimate: 24h





[jira] [Updated] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread John Gordon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gordon updated MAPREDUCE-4368:
---

Labels: newbie patch  (was: patch)

 TaskRunner fails to start jars when the java.library.path contains a quoted 
 path with embedded spaces
 -

 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
 set PATH=%PATH%;"C:\this memorable place".
Reporter: John Gordon
  Labels: newbie, patch
 Attachments: TaskRunner.patch

   Original Estimate: 24h
  Remaining Estimate: 24h





[jira] [Commented] (MAPREDUCE-4355) Add JobStatus getJobStatus(JobID) to JobClient.

2012-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401054#comment-13401054
 ] 

Hudson commented on MAPREDUCE-4355:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2405 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2405/])
MAPREDUCE-4355. Add JobStatus getJobStatus(JobID) to JobClient. (kkambatl 
via tucu) (Revision 1353757)

 Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353757
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java


 Add JobStatus getJobStatus(JobID) to JobClient.
 ---

 Key: MAPREDUCE-4355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch


 To read the start-time of a particular job, one should not need to 
 getAllJobs() and iterate through them.
 getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
 Hence, we need to add getJobStatus(JobID) to the API.





[jira] [Moved] (MAPREDUCE-4369) Fix streaming job failures with WindowsResourceCalculatorPlugin

2012-06-25 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha moved HDFS-3565 to MAPREDUCE-4369:
-

Key: MAPREDUCE-4369  (was: HDFS-3565)
Project: Hadoop Map/Reduce  (was: Hadoop HDFS)

 Fix streaming job failures with WindowsResourceCalculatorPlugin
 ---

 Key: MAPREDUCE-4369
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4369
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha

 Some streaming jobs use local-mode job runs that do not start task trackers. 
 In these cases, the JVM context is not set up, and hence local-mode execution 
 causes the code to crash.
 The fix is to either not use ResourceCalculatorPlugin in such cases or make 
 the local job run create dummy JVM contexts. Choosing the first option because 
 that's the current implicit behavior on Linux. The ProcfsBasedProcessTree 
 (used inside the LinuxResourceCalculatorPlugin) does no real work when the 
 process pid is not set up correctly, which is what happens in local job mode.





[jira] [Updated] (MAPREDUCE-4369) Fix streaming job failures with WindowsResourceCalculatorPlugin

2012-06-25 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4369:
--

Attachment: MAPREDUCE-4369.branch-1-win.1.patch

 Fix streaming job failures with WindowsResourceCalculatorPlugin
 ---

 Key: MAPREDUCE-4369
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4369
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: MAPREDUCE-4369.branch-1-win.1.patch






[jira] [Updated] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4317:


Attachment: MR-4317.patch

Updated the patch as per Alejandro's suggestions:

1. Corrected test for unauthenticated user.
2. Removed the check for user.equals(null).

Sorry for not realizing this earlier.

Thanks.

 Job view ACL checks are too permissive
 --

 Key: MAPREDUCE-4317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
 Attachments: MR-4317.patch, MR-4317.patch






[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401102#comment-13401102
 ] 

Hadoop QA commented on MAPREDUCE-4346:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12533406/MAPREDUCE-4346_rev3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2512//console

This message is automatically generated.

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
 MAPREDUCE-4346_rev3.patch






[jira] [Commented] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401104#comment-13401104
 ] 

Hadoop QA commented on MAPREDUCE-4368:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533410/TaskRunner.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2511//console

This message is automatically generated.

 TaskRunner fails to start jars when the java.library.path contains a quoted 
 path with embedded spaces
 -

 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
 set PATH=%PATH%;"C:\this memorable place".
Reporter: John Gordon
  Labels: newbie, patch
 Attachments: TaskRunner.patch

   Original Estimate: 24h
  Remaining Estimate: 24h





[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive

2012-06-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401103#comment-13401103
 ] 

Hadoop QA commented on MAPREDUCE-4317:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533419/MR-4317.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2510//console

This message is automatically generated.

 Job view ACL checks are too permissive
 --

 Key: MAPREDUCE-4317
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
 Attachments: MR-4317.patch, MR-4317.patch






[jira] [Updated] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread John Gordon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gordon updated MAPREDUCE-4368:
---

Attachment: (was: TaskRunner.patch)

 TaskRunner fails to start jars when the java.library.path contains a quoted 
 path with embedded spaces
 -

 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
 set PATH=%PATH%;"C:\this memorable place".
Reporter: John Gordon
  Labels: newbie, patch
 Attachments: TaskRunner.patch

   Original Estimate: 24h
  Remaining Estimate: 24h





[jira] [Updated] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread John Gordon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gordon updated MAPREDUCE-4368:
---

Attachment: TaskRunner.patch

Updated patch file with full paths, previous patch was taken from the mapred 
subdirectory.

 TaskRunner fails to start jars when the java.library.path contains a quoted 
 path with embedded spaces
 -

 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
 set PATH=%PATH%;"C:\this memorable place".
Reporter: John Gordon
  Labels: newbie, patch
 Attachments: TaskRunner.patch

   Original Estimate: 24h
  Remaining Estimate: 24h





[jira] [Commented] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread John Gordon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401116#comment-13401116
 ] 

John Gordon commented on MAPREDUCE-4368:


To explain why no tests are added with this change: the existing commit tests 
already hit this issue on Windows without the patch if you add a quoted string 
containing spaces to your PATH before running them.

 TaskRunner fails to start jars when the java.library.path contains a quoted 
 path with embedded spaces
 -

 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
 set PATH=%PATH%;"C:\this memorable place".
Reporter: John Gordon
  Labels: newbie, patch
 Attachments: TaskRunner.patch

   Original Estimate: 24h
  Remaining Estimate: 24h





[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-25 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-4346:
-

Status: Open  (was: Patch Available)

Ahmed, what is the use case? It's not your fault, but the API signature seems 
very odd for us to add as a public API to the JobClient, particularly since 
it's going to be hard to support this for YARN as a public api.

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, 
 MAPREDUCE-4346_rev3.patch


 The current implementation for JobTracker.getAllJobs() returns all submitted 
 jobs in any state, in addition to retired jobs. This list can be long and 
 represents an unneeded overhead especially in the case of clients only 
 interested in jobs in specific state(s). 
 It is beneficial to include a refined version where only jobs having specific 
 statuses are returned and retired jobs are optional to include. 
 I'll be uploading an initial patch momentarily.





[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401142#comment-13401142
 ] 

Alejandro Abdelnur commented on MAPREDUCE-4346:
---

It looks good. I just wonder why the retired flag overrides the status filter. 
Wouldn't it make sense to still respect the status filter and, if you want all 
jobs, just use an 'all' status filter?






[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401143#comment-13401143
 ] 

Arun C Murthy commented on MAPREDUCE-4346:
--

In any case, it seems the implementation could be improved a fair bit by 
removing the inner loop: put the job states in a HashSet and check for 
membership instead.
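
A minimal sketch of the HashSet suggestion, for illustration only. The Job and State types and the filterByState helper are hypothetical stand-ins, not Hadoop's JobStatus or JobTracker classes, and this is not the code in the attached patches. It only shows one set lookup per job replacing an inner loop over the requested states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; not Hadoop classes.
class JobStatusFilterSketch {
    enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    static final class Job {
        final String id;
        final State state;

        Job(String id, State state) {
            this.id = id;
            this.state = state;
        }
    }

    // Returns the jobs whose state is one of 'wanted'. The requested states
    // are copied into a HashSet once, so each job costs a single lookup
    // rather than a scan over every requested state.
    static List<Job> filterByState(List<Job> jobs, State... wanted) {
        Set<State> wantedStates = new HashSet<>(Arrays.asList(wanted));
        List<Job> matches = new ArrayList<>();
        for (Job job : jobs) {
            if (wantedStates.contains(job.state)) {
                matches.add(job);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
            new Job("job_1", State.RUNNING),
            new Job("job_2", State.FAILED),
            new Job("job_3", State.SUCCEEDED));
        for (Job job : filterByState(jobs, State.RUNNING, State.FAILED)) {
            System.out.println(job.id + " " + job.state);
        }
    }
}
```

With only a handful of possible states the gain per call is small, but the loop body becomes simpler to read. For an enum key, java.util.EnumSet would be an equally reasonable choice.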






[jira] [Reopened] (MAPREDUCE-4355) Add JobStatus getJobStatus(JobID) to JobClient.

2012-06-25 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reopened MAPREDUCE-4355:
--


I'm sorry, but we *cannot* make an incompatible change to JobClient which is a 
public API, at least in hadoop-1.x

-1 on this change.

This will break a number of existing APIs.

It seems we could just add start-time to RunningJob if necessary.

Alejandro - do you mind reverting this change since it breaks compatibility? 
Thanks.

 Add JobStatus getJobStatus(JobID) to JobClient.
 ---

 Key: MAPREDUCE-4355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Fix For: 1.1.0, 2.0.1-alpha

 Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch


 To read the start-time of a particular job, one should not need to call 
 getAllJobs() and iterate through them.
 getJob(JobID) returns RunningJob, which doesn't hold the job's start time.
 Hence, we need to add getJobStatus(JobID) to the API.
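
The difference between the two access patterns described above can be sketched as follows. This is an illustration under stated assumptions: Status, findByScan and the map-backed getJobStatus below are hypothetical stand-ins, not the real JobClient, JobStatus or RunningJob classes, and the real lookup goes to the JobTracker rather than a local map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the Hadoop JobClient API.
class JobLookupSketch {
    static final class Status {
        final String jobId;
        final long startTime;

        Status(String jobId, long startTime) {
            this.jobId = jobId;
            this.startTime = startTime;
        }
    }

    private final Map<String, Status> statuses = new LinkedHashMap<>();

    void add(Status status) {
        statuses.put(status.jobId, status);
    }

    // Analogue of fetching every job and iterating to find one of them.
    Status findByScan(String jobId) {
        for (Status status : statuses.values()) {
            if (status.jobId.equals(jobId)) {
                return status;
            }
        }
        return null;
    }

    // Analogue of the proposed direct lookup by job id.
    Status getJobStatus(String jobId) {
        return statuses.get(jobId);
    }

    public static void main(String[] args) {
        JobLookupSketch client = new JobLookupSketch();
        client.add(new Status("job_1", 1000L));
        client.add(new Status("job_2", 2000L));
        System.out.println(client.findByScan("job_2").startTime);
        System.out.println(client.getJobStatus("job_2").startTime);
    }
}
```

Both calls return the same status; the direct lookup just avoids transferring and walking the full job list to read one field.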





[jira] [Commented] (MAPREDUCE-4355) Add JobStatus getJobStatus(JobID) to JobClient.

2012-06-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401146#comment-13401146
 ] 

Arun C Murthy commented on MAPREDUCE-4355:
--

My bad, I read the patch wrong as removing getJob. Apologies for the noise.






[jira] [Commented] (MAPREDUCE-4355) Add JobStatus getJobStatus(JobID) to JobClient.

2012-06-25 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401147#comment-13401147
 ] 

Arun C Murthy commented on MAPREDUCE-4355:
--

In any case, could we avoid the new API by adding startTime to RunningJob, if 
that is the current drawback?






[jira] [Commented] (MAPREDUCE-4368) TaskRunner fails to start jars when the java.library.path contains a quoted path with embedded spaces

2012-06-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401177#comment-13401177
 ] 

Hadoop QA commented on MAPREDUCE-4368:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533422/TaskRunner.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2513//console

This message is automatically generated.

 TaskRunner fails to start jars when the java.library.path contains a quoted 
 path with embedded spaces
 -

 Key: MAPREDUCE-4368
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4368
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Affects Versions: 1-win
 Environment: on Windows: 
 set PATH=%PATH%;C:\this memorable place.
Reporter: John Gordon
  Labels: newbie, patch
 Attachments: TaskRunner.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 TaskRunner splits arguments by space before it adds them back to the vargs 
 list, so it loses all context of quote-escaped strings with embedded spaces.  
 This gets fixed up later by wrapping all arguments in double quotes, so you 
 get something like java "-Dopt=value".  This is problematic for paths with 
 embedded spaces, where we end up creating -Dopt=first part last part. 
 To java, the jar being run is last part.  So with the environment above, you 
 will see NoClassDefFoundError: memorable and the jar will fail to start.  
 In this particular case, we know that java.library.path contains paths and 
 the tests often use %PATH% to seed this, so the fix is to remove embedded 
 quotes in listed path elements because we know the aggregate will be quoted 
 when the JVM is started.
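
The failure mode and the described fix can be sketched as below. This is an illustration under assumptions, not the code from TaskRunner.patch: the helper names splitBySpace, stripEmbeddedQuotes and quoteWhole are hypothetical, and the sample path list assumes the quoted PATH entry from the Environment field.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helpers for illustration; not the TaskRunner implementation.
class LibraryPathQuoteSketch {
    // Naive splitting on whitespace, which ignores quoting entirely.
    static List<String> splitBySpace(String arguments) {
        return Arrays.asList(arguments.trim().split("\\s+"));
    }

    // Drops double quotes embedded in a path list. This is safe only because
    // the whole -D argument is quoted again before the JVM is launched.
    static String stripEmbeddedQuotes(String pathList) {
        return pathList.replace("\"", "");
    }

    // Wraps one complete argument in double quotes.
    static String quoteWhole(String argument) {
        return "\"" + argument + "\"";
    }

    public static void main(String[] args) {
        String libraryPath = "C:\\bin;\"C:\\this memorable place\"";

        // The quoted element is broken into three fragments, and the middle
        // fragment "memorable" ends up as a stand-alone argument.
        System.out.println(splitBySpace("-Djava.library.path=" + libraryPath));

        // With embedded quotes removed, the aggregate is quoted exactly once.
        System.out.println(
            quoteWhole("-Djava.library.path=" + stripEmbeddedQuotes(libraryPath)));
    }
}
```

The sketch removes every double quote from the list, which is enough here because double quotes are not valid characters in Windows path names.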
