[jira] [Created] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-18 Thread Ahmed Radwan (JIRA)
Ahmed Radwan created MAPREDUCE-4346:
---

 Summary: Adding a refined version of JobTracker.getAllJobs() and 
exposing through the JobClient
 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan


The current implementation for JobTracker.getAllJobs() returns all submitted 
jobs in any state, in addition to retired jobs. This list can be long and 
represents an unneeded overhead especially in the case of clients only 
interested in jobs in specific state(s). 

It is beneficial to include a refined version where only jobs having specific 
statuses are returned and retired jobs are optional to include. 

I'll be uploading an initial patch momentarily.
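
A minimal sketch of the kind of filtering described above, assuming hypothetical stand-in types (JobFilterSketch, JobInfo, State and getJobs are illustrative names, not the Hadoop JobTracker or JobStatus API, and not the contents of the patch):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins; these are NOT the Hadoop JobTracker/JobStatus classes.
final class JobFilterSketch {

    enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    static final class JobInfo {
        final String id;
        final State state;
        final boolean retired;

        JobInfo(String id, State state, boolean retired) {
            this.id = id;
            this.state = state;
            this.retired = retired;
        }
    }

    /** Returns only jobs whose state is in wanted; retired jobs are skipped unless requested. */
    static List<JobInfo> getJobs(List<JobInfo> all, Set<State> wanted, boolean includeRetired) {
        List<JobInfo> result = new ArrayList<>();
        for (JobInfo job : all) {
            if (job.retired && !includeRetired) {
                continue; // the caller did not ask for retired jobs
            }
            if (wanted.contains(job.state)) {
                result.add(job);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<JobInfo> all = Arrays.asList(
            new JobInfo("job_1", State.RUNNING, false),
            new JobInfo("job_2", State.SUCCEEDED, true),
            new JobInfo("job_3", State.FAILED, false));
        for (JobInfo job : getJobs(all, EnumSet.of(State.RUNNING, State.FAILED), false)) {
            System.out.println(job.id + " " + job.state);
        }
    }
}
```

Filtering on the server side keeps the response proportional to what the client asked for, which is the overhead the description is concerned with.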

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-18 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4346:


Attachment: MAPREDUCE-4346.patch

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch






[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-18 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4346:


Status: Patch Available  (was: Open)

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch






[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395486#comment-13395486
 ] 

Hadoop QA commented on MAPREDUCE-4346:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532379/MAPREDUCE-4346.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2465//console

This message is automatically generated.

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch






[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

2012-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395490#comment-13395490
 ] 

Hadoop QA commented on MAPREDUCE-4346:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532379/MAPREDUCE-4346.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2466//console

This message is automatically generated.

 Adding a refined version of JobTracker.getAllJobs() and exposing through the 
 JobClient
 --

 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4346.patch






[jira] [Resolved] (MAPREDUCE-4114) saveVersion.sh fails if build directory contains space

2012-06-18 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar resolved MAPREDUCE-4114.


Resolution: Duplicate

 saveVersion.sh fails if build directory contains space 
 ---

 Key: MAPREDUCE-4114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4114
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.2
 Environment: FreeBSD 8.2, 64bit
Reporter: Radim Kolar

 If you rename the build directory to something without a space, such as 
 /tmp/hadoop, then it works.
 [INFO]
  
 [INFO] 
 
 [INFO] Building hadoop-yarn-common 0.23.3-SNAPSHOT
 [INFO] 
 
 [INFO] 
 [INFO] --- maven-antrun-plugin:1.6:run (create-testdirs) @ hadoop-yarn-common 
 ---
 [INFO] Executing tasks
 main:
 [INFO] Executed tasks
 [INFO] 
 [INFO] --- maven-antrun-plugin:1.6:run 
 (create-protobuf-generated-sources-directory) @ hadoop-yarn-common ---
 [INFO] Executing tasks
 main:
 [INFO] Executed tasks
 [INFO] 
 [INFO] --- exec-maven-plugin:1.2:exec (generate-sources) @ hadoop-yarn-common 
 ---
 [INFO] 
 [INFO] --- exec-maven-plugin:1.2:exec (generate-version) @ hadoop-yarn-common 
 ---
 scripts/saveVersion.sh: cannot create /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/workspace/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/target/generated-sources/version/org/apache/hadoop/yarn/package-info.java: No such file or directory
 [JENKINS] Archiving /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/workspace/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/pom.xml to /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/modules/org.apache.hadoop$hadoop-yarn-common/builds/2012-04-05_19-44-16/archive/org.apache.hadoop/hadoop-yarn-common/0.23.3-SNAPSHOT/hadoop-yarn-common-0.23.3-SNAPSHOT.pom
 [INFO] 
 
 [INFO] Reactor Summary:
 [INFO] 
 [INFO] Apache Hadoop Main  SUCCESS [5.124s]
 [INFO] Apache Hadoop Project POM . SUCCESS [1.692s]
 [INFO] Apache Hadoop Annotations . SUCCESS [1.672s]
 [INFO] Apache Hadoop Project Dist POM  SUCCESS [1.823s]
 [INFO] Apache Hadoop Assemblies .. SUCCESS [0.796s]
 [INFO] Apache Hadoop Auth  SUCCESS [2.456s]
 [INFO] Apache Hadoop Auth Examples ... SUCCESS [1.093s]
 [INFO] Apache Hadoop Common .. SUCCESS [23.648s]
 [INFO] Apache Hadoop Common Project .. SUCCESS [0.434s]
 [INFO] Apache Hadoop HDFS  SUCCESS [22.124s]
 [INFO] Apache Hadoop HttpFS .. SUCCESS [3.251s]
 [INFO] Apache Hadoop HDFS Project  SUCCESS [0.443s]
 [INFO] hadoop-yarn ... SUCCESS [1.175s]
 [INFO] hadoop-yarn-api ... SUCCESS [7.049s]
 [INFO] hadoop-yarn-common  FAILURE [5.565s]
 [INFO] hadoop-yarn-server  SKIPPED
 [INFO] hadoop-yarn-server-common . SKIPPED
 [INFO] hadoop-yarn-server-nodemanager  SKIPPED
 [INFO] hadoop-yarn-server-web-proxy .. SKIPPED
 [INFO] hadoop-yarn-server-resourcemanager  SKIPPED
 [INFO] hadoop-yarn-server-tests .. SKIPPED
 [INFO] hadoop-mapreduce-client ... SKIPPED
 [INFO] hadoop-mapreduce-client-core .. SKIPPED
 [INFO] hadoop-yarn-applications .. SKIPPED
 [INFO] hadoop-yarn-applications-distributedshell . SKIPPED
 [INFO] hadoop-yarn-site .. SKIPPED
 [INFO] hadoop-mapreduce-client-common  SKIPPED
 [INFO] hadoop-mapreduce-client-shuffle ... SKIPPED
 [INFO] hadoop-mapreduce-client-app ... SKIPPED
 [INFO] hadoop-mapreduce-client-hs  SKIPPED
 [INFO] hadoop-mapreduce-client-jobclient . SKIPPED
 [INFO] Apache Hadoop MapReduce Examples .. SKIPPED
 [INFO] hadoop-mapreduce .. SKIPPED
 [INFO] Apache Hadoop MapReduce Streaming . SKIPPED
 [INFO] Apache Hadoop Distributed Copy  SKIPPED
 [INFO] Apache Hadoop Archives  SKIPPED
 [INFO] Apache Hadoop Rumen ... SKIPPED
 [INFO] Apache Hadoop 

[jira] [Commented] (MAPREDUCE-3968) add support for getNumMapTasks() into mapreduce JobContext

2012-06-18 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395756#comment-13395756
 ] 

Radim Kolar commented on MAPREDUCE-3968:


Yes, I need to know the number of splits.

 add support for getNumMapTasks() into mapreduce JobContext
 --

 Key: MAPREDUCE-3968
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3968
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: trunk
 Environment: hadoop 0.22
Reporter: Radim Kolar
Priority: Minor
 Attachments: MAPREDUCE-3968.patch


 In the old mapred API there was a way to query the number of mappers:
 job.getNumMapTasks()
 No such function exists in the new mapreduce API.





[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down

2012-06-18 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4031:
-

Status: Open  (was: Patch Available)

 Node Manager hangs on shut down
 ---

 Key: MAPREDUCE-4031
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4031
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.2, 2.0.1-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical
 Attachments: MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, 
 nm-threaddump.out


 I have the MAPREDUCE-3862 changes, which fixed this issue earlier, and 
 yarn.nodemanager.delete.debug-delay-sec is set to its default value, but I am 
 still getting this issue.





[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down

2012-06-18 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4031:
-

Attachment: MAPREDUCE-4031.patch

 Node Manager hangs on shut down
 ---

 Key: MAPREDUCE-4031
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4031
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.2, 2.0.1-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical
 Attachments: MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, 
 MAPREDUCE-4031.patch, nm-threaddump.out






[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down

2012-06-18 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4031:
-

Status: Patch Available  (was: Open)

Thanks a lot Sid for looking into the patch.

The above test failures are not related to the patch. Resubmitting the same 
patch to trigger Jenkins.

 Node Manager hangs on shut down
 ---

 Key: MAPREDUCE-4031
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4031
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.2, 2.0.1-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical
 Attachments: MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, 
 MAPREDUCE-4031.patch, nm-threaddump.out






[jira] [Created] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop

2012-06-18 Thread Suresh S (JIRA)
Suresh S created MAPREDUCE-4347:
---

 Summary: joined PhD. Intrested to do research in cloud especially 
in Hadoop
 Key: MAPREDUCE-4347
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4347
 Project: Hadoop Map/Reduce
  Issue Type: Wish
Reporter: Suresh S








[jira] [Updated] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work.

2012-06-18 Thread Suresh S (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh S updated MAPREDUCE-4347:


Summary: joined PhD. Intrested to do research in cloud especially in 
Hadoop. need suggession for problems to work.  (was: joined PhD. Intrested to 
do research in cloud especially in Hadoop)

 joined PhD. Intrested to do research in cloud especially in Hadoop. need 
 suggession for problems to work.
 -

 Key: MAPREDUCE-4347
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4347
 Project: Hadoop Map/Reduce
  Issue Type: Wish
Reporter: Suresh S







[jira] [Commented] (MAPREDUCE-4328) Add the option to quiesce the JobTracker

2012-06-18 Thread Kang Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395841#comment-13395841
 ] 

Kang Xiao commented on MAPREDUCE-4328:
--

This is useful in some conditions, such as when the NN is down. We actually 
found a way to achieve the first goal: update the fair scheduler's conf to set 
each pool's max share to zero. 
The second goal will protect the job from going to FAILED. But it seems so 
possible for a job to go to FAILED since no more tasks are scheduled.

It may be simpler to just not invoke assignTasks() in the JobTracker to 
implement the first goal. That would not burden the scheduler implementation, 
since 'safemode' is a low-probability event.
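
A rough sketch of that idea, assuming hypothetical names throughout (QuiesceSketch, Tracker, Scheduler and heartbeat are illustrative only and do not correspond to the real JobTracker code): the tracker checks a flag before consulting the scheduler, so the scheduler itself needs no changes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; names do not correspond to the real JobTracker code.
final class QuiesceSketch {

    /** Stand-in for a pluggable task scheduler. */
    interface Scheduler {
        List<String> assignTasks(String trackerName);
    }

    static final class Tracker {
        private final Scheduler scheduler;
        private volatile boolean quiesced = false;

        Tracker(Scheduler scheduler) {
            this.scheduler = scheduler;
        }

        void setQuiesced(boolean quiesced) {
            this.quiesced = quiesced;
        }

        /** Called on each heartbeat; hands out no new tasks while quiesced. */
        List<String> heartbeat(String trackerName) {
            if (quiesced) {
                return Collections.emptyList(); // the scheduler is never consulted
            }
            return scheduler.assignTasks(trackerName);
        }
    }

    public static void main(String[] args) {
        Scheduler simple = name -> {
            List<String> tasks = new ArrayList<>();
            tasks.add("task-for-" + name);
            return tasks;
        };
        Tracker tracker = new Tracker(simple);
        System.out.println(tracker.heartbeat("tt1"));
        tracker.setQuiesced(true);
        System.out.println(tracker.heartbeat("tt1"));
    }
}
```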

 Add the option to quiesce the JobTracker
 

 Key: MAPREDUCE-4328
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4328
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Attachments: MAPREDUCE-4328.patch


 In several failure scenarios it would be very handy to have an option to 
 quiesce the JobTracker.
 Recently, we saw a case where the NameNode had to be rebooted at a customer 
 due to a random hardware failure - in such a case it would have been nice to 
 not lose jobs by quiescing the JobTracker.





[jira] [Resolved] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work.

2012-06-18 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved MAPREDUCE-4347.


Resolution: Invalid

The JIRA exists to track issues with the project, not for discussions such as 
these.

Please send your email to mapreduce-...@hadoop.apache.org instead. Thanks!

 joined PhD. Intrested to do research in cloud especially in Hadoop. need 
 suggession for problems to work.
 -

 Key: MAPREDUCE-4347
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4347
 Project: Hadoop Map/Reduce
  Issue Type: Wish
Reporter: Suresh S







[jira] [Commented] (MAPREDUCE-4039) Sort Avoidance

2012-06-18 Thread Kang Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395855#comment-13395855
 ] 

Kang Xiao commented on MAPREDUCE-4039:
--

@Schubert, could you give some typical applications that benefit from sort 
avoidance? It seems that, with this feature, a simple aggregation app such as 
wordcount will use more memory while waiting for all keys to be processed.

 Sort Avoidance
 --

 Key: MAPREDUCE-4039
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Affects Versions: 0.23.2
Reporter: anty.rao
Assignee: anty
Priority: Minor
 Fix For: 0.23.2

 Attachments: MAPREDUCE-4039-branch-0.23.2.patch, 
 MAPREDUCE-4039-branch-0.23.2.patch


 Inspired by 
 [Tenzing|http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/37200.pdf],
  in 5.1 MapReduce Enhancements:
 {quote}*Sort Avoidance*. Certain operators such as hash join
 and hash aggregation require shuffling, but not sorting. The
 MapReduce API was enhanced to automatically turn off
 sorting for these operations. When sorting is turned off, the
 mapper feeds data to the reducer which directly passes the
 data to the Reduce() function bypassing the intermediate
 sorting step. This makes many SQL operators significantly
 more efficient.{quote}
 There are a lot of applications which need aggregation only, not 
 sorting. Using sorting to achieve aggregation is costly and inefficient. 
 Without sorting, the application can use a hash table or hash map to do 
 aggregation efficiently. But the application should bear in mind that reduce 
 memory is limited: it is responsible for managing the reducer's memory and 
 guarding against running out of memory. A map-side combiner is not supported; 
 you can do hash aggregation on the map side as a workaround.
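
As an illustration of aggregating without sorting, here is a minimal sketch, assuming a hypothetical helper (HashAggregationSketch and sumByKey are not part of Hadoop or of the patch). Keys arrive in arbitrary order and are summed in a hash map; the maxDistinctKeys bound is an assumed, simplistic stand-in for the memory management that the description leaves to the application.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Hadoop or of the attached patch.
final class HashAggregationSketch {

    /**
     * Sums values per key without requiring the input to be sorted.
     * Throws once the table would exceed maxDistinctKeys entries, as a
     * crude guard against unbounded memory growth.
     */
    static Map<String, Long> sumByKey(String[] keys, long[] values, int maxDistinctKeys) {
        Map<String, Long> totals = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            if (!totals.containsKey(keys[i]) && totals.size() >= maxDistinctKeys) {
                throw new IllegalStateException("too many distinct keys for the in-memory table");
            }
            totals.merge(keys[i], values[i], Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] words = {"b", "a", "b", "c", "a", "b"};
        long[] ones = {1, 1, 1, 1, 1, 1};
        System.out.println(sumByKey(words, ones, 1000));
    }
}
```

A real reducer would more likely spill or flush partial results when the bound is reached instead of failing; that part is omitted here.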
 The following are the main points of the sort avoidance implementation:
 # Add a configuration parameter ??mapreduce.sort.avoidance??, boolean type, 
 to turn the sort avoidance workflow on or off. The two types of workflow 
 coexist.
 # Key/value pairs emitted by the map function are sorted by partition only, 
 using a more efficient sorting algorithm: counting sort.
 # Map-side merge uses a kind of byte merge, which just concatenates bytes from 
 the generated spills (read in bytes, write out bytes) without the overhead of 
 key/value serialization/deserialization and comparison that the current 
 version incurs.
 # Reduce can start as soon as any map output is available, in 
 contrast to the sort workflow, which must wait until all map outputs are 
 fetched and merged.
 # Map output in memory can be consumed directly by reduce. When reduce can't 
 keep up with the speed of incoming map outputs, the in-memory merge thread 
 will kick in, merging in-memory map outputs onto disk.
 # On-disk files are read sequentially to feed reduce, in contrast to the 
 current implementation, which reads multiple files concurrently, resulting in 
 many disk seeks. Map output in memory takes precedence over on-disk files in 
 feeding the reduce function.
 I have already implemented this feature based on Hadoop CDH3U3 and done some 
 performance evaluation; see 
 [https://github.com/hanborq/hadoop] for details. Now I'm willing to port it 
 to YARN. Comments are welcome.
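
The second point in the list above (ordering records by partition with a counting sort) can be sketched as follows. This is only an illustration under assumed names (PartitionCountingSortSketch and orderByPartition are hypothetical and are not the attached IndexedCountingSortable.java); it returns record indices grouped by partition in O(n + P) time, without comparing keys.

```java
import java.util.Arrays;

// Hypothetical illustration; not the code attached to this issue.
final class PartitionCountingSortSketch {

    /**
     * Returns the record indices ordered by partition id. Records within a
     * partition keep their original relative order (the sort is stable).
     */
    static int[] orderByPartition(int[] partitionOf, int numPartitions) {
        // start[p] will hold the first output slot for partition p.
        int[] start = new int[numPartitions + 1];
        for (int p : partitionOf) {
            start[p + 1]++; // count records per partition
        }
        for (int p = 0; p < numPartitions; p++) {
            start[p + 1] += start[p]; // prefix sums give slot offsets
        }
        int[] order = new int[partitionOf.length];
        for (int i = 0; i < partitionOf.length; i++) {
            order[start[partitionOf[i]]++] = i; // place record i in its partition's next slot
        }
        return order;
    }

    public static void main(String[] args) {
        int[] partitionOf = {2, 0, 1, 0, 2};
        System.out.println(Arrays.toString(orderByPartition(partitionOf, 3)));
    }
}
```

Because the partition count is small and known up front, counting sort avoids the key comparisons of a general sort.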





[jira] [Commented] (MAPREDUCE-4343) ZK recovery support for ResourceManager

2012-06-18 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395869#comment-13395869
 ] 

Sharad Agarwal commented on MAPREDUCE-4343:
---

There is already MAPREDUCE-2713 for this. Some ZK code may be lying around but 
it is not implemented as yet.

Can this be marked as a duplicate?

 ZK recovery support for ResourceManager
 ---

 Key: MAPREDUCE-4343
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Harsh J
 Attachments: MR-4343.1.patch


 MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's 
 RM, but looks like it failed to complete it (for scalability reasons? etc?) 
 and there seems to be no JIRA tracking this feature that has been already 
 claimed publicly as a good part about YARN.
 If it did complete it, we should document how to use it. Setting the 
 following only yields:
 {code}
 <property>
 <name>yarn.resourcemanager.store.class</name>
 <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
 </property>
 <property>
 <name>yarn.resourcemanager.zookeeper-store.address</name>
 <value>test.vm:2181/yarn-recovery-store</value>
 </property>
 {code}
 {code}
 Error starting ResourceManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2706)
 at java.lang.Class.getDeclaredConstructor(Class.java:1985)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
 ... 2 more
 {code}
 This JIRA is hence filed to track the addition/completion of recovery via ZK.





[jira] [Created] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private

2012-06-18 Thread Steve Loughran (JIRA)
Steve Loughran created MAPREDUCE-4348:
-

 Summary: JobSubmissionProtocol should be made public, not package 
private
 Key: MAPREDUCE-4348
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Steve Loughran
Priority: Minor


The JobSubmissionProtocol interface is package private, yet it is the only way 
to remotely query the status of the JT or the cluster. 

Even if Job Submission is considered private, probing JT state shouldn't be.





[jira] [Commented] (MAPREDUCE-4341) add types to capacity scheduler properties documentation

2012-06-18 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395887#comment-13395887
 ] 

Thomas Graves commented on MAPREDUCE-4341:
--

Can you add it for max capacity also, please?

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).





[jira] [Updated] (MAPREDUCE-4039) Sort Avoidance

2012-06-18 Thread anty.rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anty.rao updated MAPREDUCE-4039:


Attachment: IndexedCountingSortable.java

the missing file.

 Sort Avoidance
 --

 Key: MAPREDUCE-4039
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Affects Versions: 0.23.2
Reporter: anty.rao
Assignee: anty
Priority: Minor
 Fix For: 0.23.2

 Attachments: IndexedCountingSortable.java, 
 MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch


 Inspired by 
 [Tenzing|http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/37200.pdf],
  in 5.1 MapReduce Enhancements:
 {quote}*Sort Avoidance*. Certain operators such as hash join
 and hash aggregation require shuffling, but not sorting. The
 MapReduce API was enhanced to automatically turn off
 sorting for these operations. When sorting is turned off, the
 mapper feeds data to the reducer which directly passes the
 data to the Reduce() function bypassing the intermediate
 sorting step. This makes many SQL operators significantly
 more efficient.{quote}
 Many applications need aggregation only, not sorting. Using sorting to 
 achieve aggregation is costly and inefficient. Without sorting, the 
 application can use a hash table or hash map to aggregate efficiently. The 
 application should bear in mind that reduce memory is limited: it is 
 responsible for managing reduce-side memory itself and for guarding against 
 running out of memory. A map-side combiner is not supported, but you can do 
 hash aggregation on the map side as a workaround.
 The main points of the sort avoidance implementation are:
 # Add a boolean configuration parameter, ??mapreduce.sort.avoidance??, to 
 turn the sort avoidance workflow on or off. The two workflows coexist.
 # Key/value pairs emitted by the map function are sorted by partition only, 
 using a more efficient sorting algorithm: counting sort.
 # The map-side merge is a byte merge that just concatenates bytes from the 
 generated spills: it reads bytes in and writes bytes out, without the 
 key/value serialization/deserialization and comparison overhead that the 
 current version incurs.
 # Reduce can start as soon as any map output is available, in contrast to 
 the sort workflow, which must wait until all map outputs are fetched and 
 merged.
 # Map output in memory can be consumed directly by reduce. When reduce can't 
 keep up with the incoming map outputs, the in-memory merge thread kicks in 
 and merges in-memory map outputs onto disk.
 # On-disk files are read sequentially to feed reduce, in contrast to the 
 current implementation, which reads multiple files concurrently and causes 
 many disk seeks. Map output in memory takes precedence over on-disk files in 
 feeding the reduce function.
 I have already implemented this feature on Hadoop CDH3U3 and done some 
 performance evaluation; see [https://github.com/hanborq/hadoop] for details. 
 Now I would like to port it to YARN. Comments are welcome.
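
Point 2 of the list above (ordering map output by partition only, using counting sort) can be illustrated with a small sketch. This is not the code from the attached IndexedCountingSortable.java: the class and method names are hypothetical, and each record is represented only by its partition number.

```java
import java.util.Arrays;

/** Hypothetical sketch: stable counting sort of record indices by partition. */
public class PartitionCountingSort {

    /**
     * Returns record indices ordered by partition number, keeping the
     * original order of the records inside each partition.
     *
     * @param partitions    partitions[i] is the partition of record i
     * @param numPartitions total number of partitions (reducers)
     */
    public static int[] sortByPartition(int[] partitions, int numPartitions) {
        // Pass 1: count the records in each partition.
        int[] start = new int[numPartitions + 1];
        for (int p : partitions) {
            start[p + 1]++;
        }
        // Prefix sum: start[p] becomes the first output slot of partition p.
        for (int p = 0; p < numPartitions; p++) {
            start[p + 1] += start[p];
        }
        // Pass 2: put each record index into the next free slot of its partition.
        int[] order = new int[partitions.length];
        for (int i = 0; i < partitions.length; i++) {
            order[start[partitions[i]]++] = i;
        }
        return order;
    }

    public static void main(String[] args) {
        int[] partitions = {2, 0, 1, 0, 2};
        // prints [1, 3, 2, 0, 4]
        System.out.println(Arrays.toString(sortByPartition(partitions, 3)));
    }
}
```

No keys are compared at any point, so the cost is linear in the number of records plus the number of partitions, which is why it can be cheaper than a comparison sort on (partition, key).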





[jira] [Updated] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private

2012-06-18 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-4348:
--

Attachment: MAPREDUCE-4348.patch

Makes the interface public but marks it as private and evolving. 

 JobSubmissionProtocol should be made public, not package private
 

 Key: MAPREDUCE-4348
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Steve Loughran
Priority: Minor
 Attachments: MAPREDUCE-4348.patch

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 The JobSubmissionProtocol interface is package private, yet it is the only 
 way to remotely query the status of the JT or the cluster. 
 Even if Job Submission is considered private, probing JT state shouldn't be.





[jira] [Updated] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private

2012-06-18 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-4348:
--

Assignee: Steve Loughran
Target Version/s: 1.1.0
  Status: Patch Available  (was: Open)

 JobSubmissionProtocol should be made public, not package private
 

 Key: MAPREDUCE-4348
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Attachments: MAPREDUCE-4348.patch

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 The JobSubmissionProtocol interface is package private, yet it is the only 
 way to remotely query the status of the JT or the cluster. 
 Even if Job Submission is considered private, probing JT state shouldn't be.





[jira] [Commented] (MAPREDUCE-4039) Sort Avoidance

2012-06-18 Thread anty.rao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395916#comment-13395916
 ] 

anty.rao commented on MAPREDUCE-4039:
-

@Kang
Yes, you are right.
Using merge-sort to achieve aggregation may not use as much memory as hash 
aggregation with this feature. But merge-sort does a lot of unnecessary work 
and consumes more resources, e.g. CPU, disk, network.
It's just a tradeoff that depends on your use case, latency requirements, etc.
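
The memory side of this tradeoff can be sketched as a reduce-side hash aggregation that accepts records in any order. The class below is a hypothetical illustration, not part of the patch or of any Hadoop API, and the maxGroups limit is only a stand-in for whatever memory management the application chooses to do.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: sum values per key without requiring sorted input. */
public class HashAggregator {
    private final Map<String, Long> sums = new HashMap<>();
    private final int maxGroups;

    /** @param maxGroups assumed upper bound on distinct keys held in memory */
    public HashAggregator(int maxGroups) {
        this.maxGroups = maxGroups;
    }

    /** Adds one record; records for the same key may arrive in any order. */
    public void add(String key, long value) {
        if (!sums.containsKey(key) && sums.size() >= maxGroups) {
            // A real application would spill or flush here instead of failing.
            throw new IllegalStateException("too many distinct keys for the memory budget");
        }
        sums.merge(key, value, Long::sum);
    }

    /** Returns the running totals per key. */
    public Map<String, Long> result() {
        return sums;
    }

    public static void main(String[] args) {
        HashAggregator agg = new HashAggregator(1000);
        agg.add("b", 2);
        agg.add("a", 1);
        agg.add("b", 3);
        System.out.println(agg.result());
    }
}
```

With merge-sort the reducer only holds one key group at a time, whereas this table grows with the number of distinct keys, which is the cost paid for skipping the sort.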

 Sort Avoidance
 --

 Key: MAPREDUCE-4039
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: mrv2
Affects Versions: 0.23.2
Reporter: anty.rao
Assignee: anty
Priority: Minor
 Fix For: 0.23.2

 Attachments: IndexedCountingSortable.java, 
 MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch







[jira] [Resolved] (MAPREDUCE-4298) NodeManager crashed after running out of file descriptors

2012-06-18 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved MAPREDUCE-4298.
--

Resolution: Duplicate

dup of HADOOP-8495

 NodeManager crashed after running out of file descriptors
 -

 Key: MAPREDUCE-4298
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4298
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: MAPREDUCE-4298.patch


 A node on one of our clusters fell over because it ran out of open file 
 descriptors.  Log details with stack traceback to follow.





[jira] [Commented] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2012-06-18 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395942#comment-13395942
 ] 

Robert Joseph Evans commented on MAPREDUCE-4342:


A couple of comments.
 # Minor correction to the grammar. {code}LOG.warn("Local Cache is been deleted... Downloading the cache again");{code} should be {code}LOG.warn("Local Cache has been deleted... Downloading the cache again");{code} 
 # Please run test-patch on it and post the results.
 # I believe that this problem also exists in trunk and branch 2.  It would be 
good to investigate and possibly file a JIRA, or post a patch for them as well.

It looks good, but it is not perfect.  It will work in the case where a single 
base distributed cache file or directory was deleted, but it will not work in 
the case where a file was corrupted, where a file in a cache archive was 
deleted, where new files were added, etc.  I agree that we want to be able to 
deal with a file being removed, but I personally think that prevention is 
preferable to recovery, although it may not be as backwards compatible.  I 
would prefer to see all of the files created in the distributed cache be marked 
as read only.  If the files are part of a private cache and someone messes with 
them, by modifying the permissions then it is on their head, and they need to 
modify the original HDFS file to force it to download a new copy.

Checking for corruption caused by FS/disk issues is a separate problem that we 
probably also want to look into, now that the data in the distributed cache can 
live for long periods of time.
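
The two approaches discussed in this comment (recovering when a localized file has been deleted, and preventing changes by marking files read-only) can be sketched with plain java.io.File calls. This is only an illustration under assumed names: LocalCacheGuard and its methods are hypothetical and are not the TaskTracker code or the attached patch.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical sketch of the "recovery" and "prevention" ideas for a local cache. */
public class LocalCacheGuard {

    /** Recovery check: true if the localized copy is gone and must be fetched again. */
    public static boolean needsRedownload(File localized) {
        return !localized.exists();
    }

    /**
     * Prevention: marks every regular file under root as read-only.
     * Returns the number of files whose permissions were changed.
     */
    public static int markReadOnly(File root) {
        if (root.isFile()) {
            return root.setReadOnly() ? 1 : 0;
        }
        int changed = 0;
        File[] children = root.listFiles();
        if (children != null) {
            for (File child : children) {
                changed += markReadOnly(child);
            }
        }
        return changed;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("cache-demo").toFile();
        File data = new File(dir, "lookup.txt");
        data.createNewFile();
        System.out.println("marked read-only: " + markReadOnly(dir));
        System.out.println("needs redownload: " + needsRedownload(data));
    }
}
```

As the comment points out, an existence check like needsRedownload does not detect a corrupted file or a file deleted from inside a cache archive.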

 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: MAPREDUCE-4342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22.patch








[jira] [Commented] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private

2012-06-18 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395952#comment-13395952
 ] 

Steve Loughran commented on MAPREDUCE-4348:
---

# no tests, this is a package scope change, not a new feature. 
# it is to be applied against the 1.x branch


 JobSubmissionProtocol should be made public, not package private
 

 Key: MAPREDUCE-4348
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Attachments: MAPREDUCE-4348.patch

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 The JobSubmissionProtocol interface is package private, yet it is the only 
 way to remotely query the status of the JT or the cluster. 
 Even if Job Submission is considered private, probing JT state shouldn't be.





[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation

2012-06-18 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4341:


Status: Open  (was: Patch Available)

Will add the documentation for max-capacity as well, and upload another patch 
shortly.

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).





[jira] [Commented] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.

2012-06-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395964#comment-13395964
 ] 

Jason Lowe commented on MAPREDUCE-4339:
---

I am unable to reproduce a hang like this on a single-node cluster.  Could you 
examine the ResourceManager logs for issues or post them (after any necessary 
scrubbing/anonymization)? That would help track down what's going on when the 
job hangs.

 pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is 
 included in the setting environment.
 -

 Key: MAPREDUCE-4339
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples, job submission, mrv2, scheduler
Affects Versions: 0.23.0
 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, 
Reporter: srikanth ayalasomayajulu
  Labels: hadoop
 Fix For: 0.23.0

   Original Estimate: 48h
  Remaining Estimate: 48h

 Tried to include default capacity scheduler in hadoop and tried to run an 
 example pi program. The job hangs and no more output is getting displayed.
 Starting Job
 2012-06-12 22:10:02,524 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) - 
 Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
 2012-06-12 22:10:02,538 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,539 INFO  ipc.HadoopYarnRPC 
 (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy 
 for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
 2012-06-12 22:10:02,665 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,727 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. 
 Instead, use fs.defaultFS
 2012-06-12 22:10:02,728 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(343)) - 
 mapred.used.genericoptionsparser is deprecated. Instead, use 
 mapreduce.client.genericoptionsparser.used
 2012-06-12 22:10:02,831 INFO  input.FileInputFormat 
 (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10
 2012-06-12 22:10:02,900 INFO  mapreduce.JobSubmitter 
 (JobSubmitter.java:submitJobInternal(362)) - number of splits:10
 2012-06-12 22:10:03,044 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster 
 capability = memory: 2048
 2012-06-12 22:10:03,286 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch 
 container for ApplicationMaster is : $JAVA_HOME/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> 
 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
 2><LOG_DIR>/stderr 
 2012-06-12 22:10:03,370 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application 
 application_1339507608976_0002 to ResourceManager
 2012-06-12 22:10:03,432 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002
 2012-06-12 22:10:04,443 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1227)) -  map 0% reduce 0%





[jira] [Updated] (MAPREDUCE-3889) job client tries to use /tasklog interface, but that doesn't exist anymore

2012-06-18 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-3889:
-

 Target Version/s: 2.0.0-alpha, 0.23.3, 3.0.0  (was: 0.23.3, 2.0.0-alpha, 
3.0.0)
Affects Version/s: 3.0.0
   2.0.1-alpha

 job client tries to use /tasklog interface, but that doesn't exist anymore
 --

 Key: MAPREDUCE-3889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3889
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1, 2.0.1-alpha, 3.0.0
Reporter: Thomas Graves
Assignee: Devaraj K
Priority: Critical
 Attachments: MAPREDUCE-3889.patch, MAPREDUCE-3889.patch


 If you specify the -Dmapreduce.client.output.filter=SUCCEEDED option when 
 running a job, the client tries to fetch task logs to print on the client side 
 from a URL like: 
 http://nodemanager:8080/tasklog?plaintext=true&attemptid=attempt_1329857083014_0003_r_00_0&filter=stdout
 The request always fails with: Required param job, map and reduce
 We saw this error when using distcp, and the distcp failed. I'm not sure 
 whether fetching the logs is mandatory for distcp or just informational. I'm 
 guessing the latter.





[jira] [Updated] (MAPREDUCE-4345) ZK-based High Availability (HA) for ResourceManager (RM)

2012-06-18 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4345:
--

Assignee: Bikas Saha

Assigning to myself since this looks like something that follows directly after 
MAPREDUCE-4326, and the design/implementation would be closely related to it.

 ZK-based High Availability (HA) for ResourceManager (RM)
 

 Key: MAPREDUCE-4345
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4345
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Harsh J
Assignee: Bikas Saha

 One of the goals presented on MAPREDUCE-279 was to have high availability. 
 One way that was discussed, per Mahadev/others on 
 https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
 {quote}
 Am not sure, if you already know about the MR-279 branch (the next version of 
 MR framework). We've been trying to integrate ZK into the framework from the 
 beginning. As for now, we are just doing restart with ZK but soon we should 
 have a HA soln with ZK.
 {quote}
 There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is 
 meant to track HA via ZK.
 Currently there isn't a HA solution for RM, via ZK or otherwise.





[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396041#comment-13396041
 ] 

Bikas Saha commented on MAPREDUCE-4326:
---

Will be posting a preliminary design sketch this week for comments.

 Resurrect RM Restart 
 -

 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha

 We should resurrect 'RM Restart' which we disabled sometime during the RM 
 refactor.





[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation

2012-06-18 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4341:


Attachment: (was: MR-4341.patch)

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla

 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).





[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation

2012-06-18 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4341:


Attachment: MR-4341.patch

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).





[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation

2012-06-18 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4341:


Fix Version/s: 0.23.3
   Status: Patch Available  (was: Open)

Modified documentation to mention both capacity and max-capacity are of type 
float.

Didn't test.

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Fix For: 0.23.3

 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).





[jira] [Commented] (MAPREDUCE-4341) add types to capacity scheduler properties documentation

2012-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396051#comment-13396051
 ] 

Hadoop QA commented on MAPREDUCE-4341:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532428/MR-4341.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2469//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2469//console

This message is automatically generated.

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Fix For: 0.23.3

 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).





[jira] [Commented] (MAPREDUCE-4343) ZK recovery support for ResourceManager

2012-06-18 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396063#comment-13396063
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-4343:
---

Sharad, 

Bikas marked MAPREDUCE-2713 as a duplicated task.

 ZK recovery support for ResourceManager
 ---

 Key: MAPREDUCE-4343
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Harsh J
 Attachments: MR-4343.1.patch


 MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's 
 RM, but looks like it failed to complete it (for scalability reasons? etc?) 
 and there seems to be no JIRA tracking this feature that has been already 
 claimed publicly as a good part about YARN.
 If it did complete it, we should document how to use it. Setting the 
 following only yields:
 {code}
 <property>
 <name>yarn.resourcemanager.store.class</name>
 <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
 </property>
 <property>
 <name>yarn.resourcemanager.zookeeper-store.address</name>
 <value>test.vm:2181/yarn-recovery-store</value>
 </property>
 {code}
 {code}
 Error starting ResourceManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
 at org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
 at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
 Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2706)
 at java.lang.Class.getDeclaredConstructor(Class.java:1985)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
 ... 2 more
 {code}
 This JIRA is hence filed to track the addition/completion of recovery via ZK.
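
The stack trace above fails inside Class.getDeclaredConstructor, which suggests that the configured store class has no constructor taking zero arguments. A minimal sketch of that failure mode, using only java.lang reflection and two hypothetical stand-in classes (not the real ZKStore or StoreFactory), could look like this:

```java
/** Hypothetical sketch: reflective lookup of a no-argument constructor. */
public class NoArgConstructorDemo {

    /** Stand-in for a store class that only has a one-argument constructor. */
    public static class StoreWithArgs {
        public StoreWithArgs(String address) { }
    }

    /** Stand-in for a store class that can be created without arguments. */
    public static class StoreNoArgs {
        public StoreNoArgs() { }
    }

    /** Returns true if cls declares a constructor that takes no arguments. */
    public static boolean hasNoArgConstructor(Class<?> cls) {
        try {
            cls.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            // The JVM reports the missing constructor under the name <init>.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgConstructor(StoreWithArgs.class)); // prints false
        System.out.println(hasNoArgConstructor(StoreNoArgs.class));   // prints true
    }
}
```

A class that is to be instantiated this way needs a zero-argument constructor, with any settings such as the ZooKeeper address supplied afterwards.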





[jira] [Updated] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values

2012-06-18 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4290:
--

Status: Open  (was: Patch Available)

A couple of additional cases are needed within the FINISHED state to deal 
with KILLED/FAILED. Other than that, the patch looks good.

Another problem with the getAllJobs() API: it gets the application list from 
the RM, which means it's going to convert non-MapReduce apps as well. I don't 
believe there's any good way to differentiate between application types from 
the RM list.
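
The extra cases for the FINISHED state can be sketched as follows. The enums and the method are simplified, hypothetical stand-ins rather than the real YARN and MapReduce types or the attached patch; they only show that a finished application needs its final status consulted before a job state is reported.

```java
/** Hypothetical sketch: map an application state to a job state. */
public class JobStateMapping {

    public enum AppState { NEW, SUBMITTED, RUNNING, FINISHED, FAILED, KILLED }

    public enum FinalStatus { SUCCEEDED, FAILED, KILLED }

    public enum JobState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    public static JobState toJobState(AppState state, FinalStatus finalStatus) {
        switch (state) {
            case NEW:
            case SUBMITTED:
                return JobState.PREP;
            case RUNNING:
                return JobState.RUNNING;
            case FINISHED:
                // FINISHED only says the application has stopped; the final
                // status decides whether the job succeeded, failed or was killed.
                if (finalStatus == FinalStatus.FAILED) {
                    return JobState.FAILED;
                }
                if (finalStatus == FinalStatus.KILLED) {
                    return JobState.KILLED;
                }
                return JobState.SUCCEEDED;
            case KILLED:
                return JobState.KILLED;
            default:
                return JobState.FAILED;
        }
    }

    public static void main(String[] args) {
        // prints FAILED
        System.out.println(toJobState(AppState.FINISHED, FinalStatus.FAILED));
    }
}
```

Without the two checks inside the FINISHED branch, every finished application would be reported as SUCCEEDED, which matches the symptom described in this issue.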

 JobStatus.getState() API is giving ambiguous values
 ---

 Key: MAPREDUCE-4290
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
 Attachments: MAPREDUCE-4290.patch


 For a failed job, the getState() API reports the status as SUCCEEDED if we use 
 JobClient.getAllJobs() to retrieve all job info from the RM.





[jira] [Resolved] (MAPREDUCE-4343) ZK recovery support for ResourceManager

2012-06-18 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy resolved MAPREDUCE-4343.
--

Resolution: Duplicate

Duplicate of MAPREDUCE-4326.

 ZK recovery support for ResourceManager
 ---

 Key: MAPREDUCE-4343
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Harsh J
 Attachments: MR-4343.1.patch


 MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's 
 RM, but looks like it failed to complete it (for scalability reasons? etc?) 
 and there seems to be no JIRA tracking this feature that has been already 
 claimed publicly as a good part about YARN.
 If it did complete it, we should document how to use it. Setting the 
 following only yields:
 {code}
 <property>
   <name>yarn.resourcemanager.store.class</name>
   <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
 </property>
 <property>
   <name>yarn.resourcemanager.zookeeper-store.address</name>
   <value>test.vm:2181/yarn-recovery-store</value>
 </property>
 {code}
 {code}
 Error starting ResourceManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
 at org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
 at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
 Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2706)
 at java.lang.Class.getDeclaredConstructor(Class.java:1985)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
 ... 2 more
 {code}
 This JIRA is hence filed to track the addition/completion of recovery via ZK.
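
 The trace above shows Class.getDeclaredConstructor failing inside ReflectionUtils.newInstance, which suggests that ZKStore has no no-argument constructor. The following self-contained Java sketch reproduces that failure mode with plain reflection. The class names are hypothetical, and newInstance here is a simplified stand-in, not Hadoop's actual ReflectionUtils.

```java
import java.lang.reflect.Constructor;

final class NoArgConstructorDemo {
    /** Hypothetical store that only offers a one-argument constructor. */
    static class StoreWithoutDefaultCtor {
        final String address;
        StoreWithoutDefaultCtor(String address) { this.address = address; }
    }

    /** Hypothetical store that can be created reflectively without arguments. */
    static class StoreWithDefaultCtor {
        StoreWithDefaultCtor() { }
    }

    /** Simplified stand-in: instantiate a class through its no-arg constructor. */
    static <T> T newInstance(Class<T> clazz) {
        try {
            // Throws NoSuchMethodException when no no-arg constructor is declared.
            Constructor<T> ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            // Wrapped the same way as in the trace: RuntimeException with a cause.
            throw new RuntimeException(e);
        }
    }
}
```

 Calling newInstance(StoreWithoutDefaultCtor.class) throws a RuntimeException whose cause is a NoSuchMethodException, while StoreWithDefaultCtor is created normally.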





[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons

2012-06-18 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396080#comment-13396080
 ] 

Siddharth Seth commented on MAPREDUCE-4306:
---

The -user option in general seems to be broken. Even after this patch, the AM 
will be localized as the original user - since the RM picks up the username 
from ugi.

Maybe we should remove the -user option completely and use 
ApplicationConstants.Environment.USER in the AM, which is set by the RM 
anyway, based on the logged-in user?
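
A minimal sketch of the second suggestion, reading the user from the container environment instead of a command-line option. It assumes that the environment variable behind ApplicationConstants.Environment.USER is named "USER"; the class and method names are hypothetical.

```java
import java.util.Map;

final class AmUserLookup {
    // Assumption: the RM-provided variable is called "USER".
    static final String USER_ENV_KEY = "USER";

    /** Returns the user name found in the given environment map. */
    static String resolveUser(Map<String, String> env) {
        String user = env.get(USER_ENV_KEY);
        if (user == null || user.isEmpty()) {
            throw new IllegalStateException(
                USER_ENV_KEY + " is not set in the container environment");
        }
        return user;
    }
}
```

In an AM this could be called as AmUserLookup.resolveUser(System.getenv()); passing the map explicitly keeps the lookup easy to test.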

 Problem running Distributed Shell applications as a user other than the one 
 started the daemons
 ---

 Key: MAPREDUCE-4306
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch


 Using the tarball, if you start the YARN daemons as one user and then 
 switch to a different user, you can successfully run MR jobs, but DS jobs 
 fail to run. DS jobs can only be run by the user who started the daemons.





[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

2012-06-18 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396090#comment-13396090
 ] 

Mayank Bansal commented on MAPREDUCE-4349:
--

The Distributed Cache gives inconsistent results if archive files get deleted 
from the task tracker. The DC still thinks that it has the file even though 
the file has been deleted.

 Distributed Cache gives inconsistent result if cache Archive files get 
 deleted from task tracker 
 -

 Key: MAPREDUCE-4349
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal







[jira] [Updated] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-18 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4326:
--

Attachment: MR-4343.1.patch

Bikas,

The attached patch was originally created for MAPREDUCE-4343, which is marked as 
a duplicate of this ticket.

The patch may be useful as a reference, so I attached it to this ticket.

 Resurrect RM Restart 
 -

 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha
 Attachments: MR-4343.1.patch


 We should resurrect 'RM Restart' which we disabled sometime during the RM 
 refactor.





[jira] [Created] (MAPREDUCE-4350) Distributed Cache should put files read only on Task tracker

2012-06-18 Thread Mayank Bansal (JIRA)
Mayank Bansal created MAPREDUCE-4350:


 Summary: Distributed Cache should put files read only on Task 
tracker
 Key: MAPREDUCE-4350
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4350
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Affects Versions: 1.0.3, 0.22.0, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal








[jira] [Commented] (MAPREDUCE-4350) Distributed Cache should put files read only on Task tracker

2012-06-18 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396096#comment-13396096
 ] 

Mayank Bansal commented on MAPREDUCE-4350:
--

This issue is based on the comment posted by Robert:

https://issues.apache.org/jira/browse/MAPREDUCE-4342?focusedCommentId=13395942&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13395942

Thanks,
Mayank

 Distributed Cache should put files read only on Task tracker
 

 Key: MAPREDUCE-4350
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4350
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal







[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-18 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396103#comment-13396103
 ] 

Bikas Saha commented on MAPREDUCE-4326:
---

Thanks! I will take a look before posting the design.

 Resurrect RM Restart 
 -

 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha
 Attachments: MR-4343.1.patch


 We should resurrect 'RM Restart' which we disabled sometime during the RM 
 refactor.





[jira] [Commented] (MAPREDUCE-4203) Create equivalent of ProcfsBasedProcessTree for Windows

2012-06-18 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396132#comment-13396132
 ] 

Jonathan Eagles commented on MAPREDUCE-4203:


Thanks, Bikas. Just trying to prevent Hadoop code from being contaminated with 
GPL or proprietary code licenses. Sounds like you are already controlling for 
that.

 Create equivalent of ProcfsBasedProcessTree for Windows
 ---

 Key: MAPREDUCE-4203
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4203
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: MAPREDUCE-4203.branch-1-win.1.patch, 
 MAPREDUCE-4203.patch, test.cpp


 ProcfsBasedProcessTree is used by the TaskTracker to get process information 
 like memory and cpu usage. This information is used to manage resources etc. 
 The current implementation is based on Linux procfs functionality and hence 
 does not work on other platforms, specifically windows.





[jira] [Commented] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running

2012-06-18 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396130#comment-13396130
 ] 

Karthik Kambatla commented on MAPREDUCE-4288:
-

In YARN, the ClusterMetrics should only correspond to numNodeManagers, 
numActiveJobs(), numActiveContainers(), availableResources(). Other 
job/app-specific metrics should move to the corresponding AMs. JobStatus would 
be a good place to have these metrics.

Subsequently, JobClient.getClusterStatus() can correspond to the job-specific 
metrics (though the name would then be a misnomer).

Comments?

 ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one 
 when no job is running
 ---

 Key: MAPREDUCE-4288
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Nishan Shetty
Assignee: Karthik Kambatla

 When no job is running in the cluster, invoke the ClusterStatus.getMapTasks() 
 and ClusterStatus.getReduceTasks() APIs.
 Observed that these APIs return one instead of zero (even though no job is 
 running).





[jira] [Updated] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler

2012-06-18 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4335:
--

Attachment: MR4335_4.txt

Thanks for taking a look Arun.

Updated the patch with the default scheduler defined in YarnConfiguration. Had 
to move the class loading into the ResourceManager instead of relying on 
Configuration.getClass...

 Change the default scheduler to the CapacityScheduler
 -

 Key: MAPREDUCE-4335
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt


 There's some bugs in the FifoScheduler atm - doesn't distribute tasks across 
 nodes and some headroom (available resource) issues.
 That's not the best experience for users trying out the 2.0 branch. The CS 
 with the default configuration of a single queue behaves the same as the 
 FifoScheduler and doesn't have these issues.





[jira] [Updated] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler

2012-06-18 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4335:
--

Status: Patch Available  (was: Open)

 Change the default scheduler to the CapacityScheduler
 -

 Key: MAPREDUCE-4335
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt


 There's some bugs in the FifoScheduler atm - doesn't distribute tasks across 
 nodes and some headroom (available resource) issues.
 That's not the best experience for users trying out the 2.0 branch. The CS 
 with the default configuration of a single queue behaves the same as the 
 FifoScheduler and doesn't have these issues.





[jira] [Commented] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler

2012-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396178#comment-13396178
 ] 

Hadoop QA commented on MAPREDUCE-4335:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532445/MR4335_4.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 10 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in:
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2470//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2470//console

This message is automatically generated.

 Change the default scheduler to the CapacityScheduler
 -

 Key: MAPREDUCE-4335
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt


 There's some bugs in the FifoScheduler atm - doesn't distribute tasks across 
 nodes and some headroom (available resource) issues.
 That's not the best experience for users trying out the 2.0 branch. The CS 
 with the default configuration of a single queue behaves the same as the 
 FifoScheduler and doesn't have these issues.





[jira] [Commented] (MAPREDUCE-3235) Improve CPU cache behavior in map side sort

2012-06-18 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396222#comment-13396222
 ] 

Todd Lipcon commented on MAPREDUCE-3235:


bq. BTW, I know you are interested in JVM intrinsic binary array compare

I guess you're working with Krystal Mok? Cool stuff, I hope to see it make it 
into OpenJDK as well!

bq. Almost the same, depends on if there are rack local maps. the more rack 
local maps, the slower.

You mean that it is slower if there are more rack-local (as opposed to 
data-local) maps, right? If everything is data-local (e.g. terasort on an empty 
cluster), then I would expect the CPU improvement to make a more noticeable 
difference.

 Improve CPU cache behavior in map side sort
 ---

 Key: MAPREDUCE-3235
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3235
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: performance, task
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: map_sort_perf.diff, mr-3235-poc.txt


 When running oprofile on a terasort workload, I noticed that a large amount 
 of CPU usage was going to MapTask$MapOutputBuffer.compare. Upon disassembling 
 this and looking at cycle counters, most of the cycles were going to memory 
 loads dereferencing into the array of key-value data -- implying expensive 
 cache misses. This can be avoided as follows:
 - rather than simply swapping indexes into the kv array, swap the entire meta 
 entries in the meta array. Swapping 16 bytes is only negligibly slower than 
 swapping 4 bytes. This requires adding the value-length into the meta array, 
 since we used to rely on the previous-in-the-array meta entry to determine 
 this. So we replace INDEX with VALUELEN and avoid one layer of indirection.
 - introduce an interface which allows key types to provide a 4-byte 
 comparison proxy. For string keys, this can simply be the first 4 bytes of 
 the string. The idea is that, if stringCompare(key1.proxy(), key2.proxy()) != 
 0, then compare(key1, key2) should have the same result. If the proxies are 
 equal, the normal comparison method is used. We then include the 4-byte proxy 
 as part of the metadata entry, so that for many cases the indirection into 
 the data buffer can be avoided.
 On a terasort benchmark, these optimizations plus an optimization to 
 WritableComparator.compareBytes dropped the aggregate mapside CPU millis by 
 40%, and the compare() routine mostly dropped off the oprofile results.
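
 The comparison-proxy idea described above can be sketched in plain Java as follows. This is only an illustration, not the code from the attached patches: the class and method names are hypothetical, keys are assumed to be raw byte arrays compared as unsigned bytes, and the proxy is assumed to be the first four bytes packed big-endian with zero padding.

```java
final class ProxyCompare {
    /** Packs the first 4 bytes of the key (zero-padded) into an int, big-endian. */
    static int proxy(byte[] key) {
        int p = 0;
        for (int i = 0; i < 4; i++) {
            int b = i < key.length ? (key[i] & 0xff) : 0;
            p = (p << 8) | b;
        }
        return p;
    }

    /** Full lexicographic comparison over unsigned bytes (the slow path). */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /**
     * Compares the proxies first; only when they are equal does it touch
     * the key bytes, which is the indirection the proxy is meant to avoid.
     */
    static int compare(int proxyA, byte[] a, int proxyB, byte[] b) {
        int c = Integer.compareUnsigned(proxyA, proxyB);
        return c != 0 ? c : compareBytes(a, b);
    }
}
```

 In a sort buffer the proxy would be stored next to each metadata entry, so most comparisons could be decided without loading the key data at all.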





[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons

2012-06-18 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396223#comment-13396223
 ] 

Ahmed Radwan commented on MAPREDUCE-4306:
-

Thanks Siddharth for the review!

I agree; I think it is better to remove the -user option completely. I 
originally thought of keeping it in case it could be used for testing or 
other purposes. But leaving it in now may lead to confusion, and setting it 
to something other than the original user will lead to failure, as described 
above.

Also reading ApplicationConstants.Environment.USER is simpler than reevaluating 
the username from ugi (which will give the same result after all). I have 
updated the patch accordingly. Thanks! 

 Problem running Distributed Shell applications as a user other than the one 
 started the daemons
 ---

 Key: MAPREDUCE-4306
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch


 Using the tarball, if you start the YARN daemons as one user and then 
 switch to a different user, you can successfully run MR jobs, but DS jobs 
 fail to run. DS jobs can only be run by the user who started the daemons.





[jira] [Updated] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons

2012-06-18 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4306:


Attachment: MAPREDUCE-4306_rev3.patch

 Problem running Distributed Shell applications as a user other than the one 
 started the daemons
 ---

 Key: MAPREDUCE-4306
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch, 
 MAPREDUCE-4306_rev3.patch


 Using the tarball, if you start the YARN daemons as one user and then 
 switch to a different user, you can successfully run MR jobs, but DS jobs 
 fail to run. DS jobs can only be run by the user who started the daemons.





[jira] [Commented] (MAPREDUCE-4334) Add support for CPU isolation/monitoring of containers

2012-06-18 Thread Andrew Ferguson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396261#comment-13396261
 ] 

Andrew Ferguson commented on MAPREDUCE-4334:


ok, putting all of this in the ContainerExecutor is not the way to go, as it 
precludes use of secure Hadoop's Linux container-executor.

In my new design, ContainerMonitor will be a pluggable component, just as 
ContainerExecutor is now. Then, we can provide a ContainerMonitor which uses 
cgroups to control resource usage, rather than the existing ContainerMonitor 
(to be renamed as DefaultContainerMonitor). This has several advantages:
1) allows us to keep existing ContainerMonitor for users who can't use cgroups 
(eg, users without root access during Hadoop setup)
2) ContainerMonitor already receives an event when it's time to stop 
monitoring, which we can use as notification to delete the container's cgroup
3) ContainerMonitor receives the resource limits already; no need to calculate 
them based on the configs
4) A pluggable ContainerMonitor paves the way for ContainerMonitors on other 
platforms

I will first open a sub-task to make ContainerMonitor pluggable.

The only trouble spot with this design is that it's not possible to move 
another non-root user's process into a cgroup. I plan to extend the secure 
container-executor to be able to make such a move.

Please let me know if you have any feedback about this proposal.


thank you,
Andrew
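
A rough sketch of what a pluggable monitor could look like, assuming the implementation class is chosen by a configuration key and created reflectively. Everything here is hypothetical: the interface, both implementations, and the key name are invented for illustration, and java.util.Properties stands in for the Hadoop Configuration class.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

final class PluggableMonitorDemo {
    /** Hypothetical plug-in point for container resource monitoring. */
    interface ContainerMonitor {
        String name();
    }

    /** Stand-in for the existing monitor, used when nothing is configured. */
    static final class DefaultContainerMonitor implements ContainerMonitor {
        public String name() { return "default"; }
    }

    /** Stand-in for a cgroups-based monitor. */
    static final class CgroupsContainerMonitor implements ContainerMonitor {
        public String name() { return "cgroups"; }
    }

    // Hypothetical configuration key, not an actual YARN property.
    static final String MONITOR_CLASS_KEY = "example.nodemanager.container-monitor.class";

    /** Instantiates the configured monitor, falling back to the default one. */
    static ContainerMonitor create(Properties conf) {
        String className = conf.getProperty(
            MONITOR_CLASS_KEY, DefaultContainerMonitor.class.getName());
        try {
            Class<? extends ContainerMonitor> clazz =
                Class.forName(className).asSubclass(ContainerMonitor.class);
            Constructor<? extends ContainerMonitor> ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate monitor " + className, e);
        }
    }
}
```

Users who cannot use cgroups would leave the key unset and get the default monitor; others would point it at the cgroups implementation.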

 Add support for CPU isolation/monitoring of containers
 --

 Key: MAPREDUCE-4334
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4334
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Arun C Murthy
Assignee: Arun C Murthy

 Once we get in MAPREDUCE-4327, it will be important to actually enforce 
 limits on CPU consumption of containers. 
 Several options spring to mind:
 # taskset (RHEL5+)
 # cgroups (RHEL6+)





[jira] [Updated] (MAPREDUCE-4203) Create equivalent of ProcfsBasedProcessTree for Windows

2012-06-18 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4203:
--

Attachment: MAPREDUCE-4203.branch-1-win.2.patch

Fix some bugs in formatting. 
TestTaskTrackerMemoryManager now passes on Windows and tests the feature 
functionally.

 Create equivalent of ProcfsBasedProcessTree for Windows
 ---

 Key: MAPREDUCE-4203
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4203
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: MAPREDUCE-4203.branch-1-win.1.patch, 
 MAPREDUCE-4203.branch-1-win.2.patch, MAPREDUCE-4203.patch, test.cpp


 ProcfsBasedProcessTree is used by the TaskTracker to get process information 
 like memory and cpu usage. This information is used to manage resources etc. 
 The current implementation is based on Linux procfs functionality and hence 
 does not work on other platforms, specifically windows.





[jira] [Commented] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2012-06-18 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396386#comment-13396386
 ] 

Konstantin Shvachko commented on MAPREDUCE-4342:


Mayank, the patch does not apply as is, namely because of the empty-line change in 
TrackerDistributedCacheManager. You can just leave the line there. I did that, 
but then it does not compile. You need to sync it with the repo.

- Could you also change "is been" to "has been" as Robert suggested.
- And add spaces between method parameters.
- Reporting the results of test-patch and test builds would be very useful, since 
we don't have Jenkins to verify that for 0.22.

The fix looks good modulo the JIRAs you opened.


 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: MAPREDUCE-4342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22.patch








[jira] [Updated] (MAPREDUCE-4351) Make ContainersMonitor pluggable

2012-06-18 Thread Andrew Ferguson (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Ferguson updated MAPREDUCE-4351:
---

Attachment: MAPREDUCE-4351-v1.patch

First cut at making ContainersMonitor pluggable. I have tested that the new 
configuration option is used, and that it works with a local cluster.

 Make ContainersMonitor pluggable
 

 Key: MAPREDUCE-4351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4351
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Reporter: Andrew Ferguson
 Attachments: MAPREDUCE-4351-v1.patch


 Make the existing ContainersMonitor pluggable, just as the ContainerExecutor 
 is currently. This will allow us to add container resource enforcement using 
 other techniques (such as cgroups) in an extensible fashion.





[jira] [Commented] (MAPREDUCE-4351) Make ContainersMonitor pluggable

2012-06-18 Thread Andrew Ferguson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396411#comment-13396411
 ] 

Andrew Ferguson commented on MAPREDUCE-4351:


The bulk of the lines in the patch rename ContainersMonitorImpl.java to 
DefaultContainersMonitor.java, and TestContainersMonitor.java to 
TestDefaultContainersMonitor.java.

 Make ContainersMonitor pluggable
 

 Key: MAPREDUCE-4351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4351
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Reporter: Andrew Ferguson
 Attachments: MAPREDUCE-4351-v1.patch


 Make the existing ContainersMonitor pluggable, just as the ContainerExecutor 
 is currently. This will allow us to add container resource enforcement using 
 other techniques (such as cgroups) in an extensible fashion.





[jira] [Updated] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-3868:
--

Issue Type: Bug  (was: New Feature)

 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen
Assignee: Weiyan Wang
 Attachments: MAPREDUCE-3868-1.patch, MAPREDUCE-3868-2.patch, 
 MAPREDUCE-3868-3.patch, MAPREDUCE-3868.patch, MAPREDUCE-3868v1.patch, 
 MAPREDUCE-3868v1.sh


 Currently Raid is outdated and not compiled. Make it compile.





[jira] [Resolved] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen resolved MAPREDUCE-3868.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

I just committed this. Thanks, Weiyan.

 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen
Assignee: Weiyan Wang
 Attachments: MAPREDUCE-3868-1.patch, MAPREDUCE-3868-2.patch, 
 MAPREDUCE-3868-3.patch, MAPREDUCE-3868.patch, MAPREDUCE-3868v1.patch, 
 MAPREDUCE-3868v1.sh


 Currently Raid is outdated and not compiled. Make it compile.





[jira] [Commented] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396434#comment-13396434
 ] 

Hudson commented on MAPREDUCE-3868:
---

Integrated in Hadoop-Common-trunk-Commit #2369 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2369/])
MAPREDUCE-3868. Make Raid Compile. (Weiyan Wang via schen) (Revision 
1351548)

 Result = SUCCESS
schen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1351548
Files : 
* /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-raid-dist.xml
* /hadoop/common/trunk/hadoop-dist/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/conf
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/datanode/RaidBlockSender.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRaidUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/BlockFixer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DirectoryTraversal.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaid.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/GaloisField.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/JobMonitor.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/ReedSolomonCode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/sbin
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/hdfs/TestRaidDfs.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestBlockFixer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestDirectoryTraversal.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestErasureCodes.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidFilter.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidHar.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidPurge.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShellFsck.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonDecoder.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonEncoder.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* /hadoop/common/trunk/hadoop-hdfs-project/pom.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/bin
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/conf
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org
* /hadoop/common/trunk/hadoop-project/pom.xml


 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen

[jira] [Commented] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396437#comment-13396437
 ] 

Hudson commented on MAPREDUCE-3868:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2439 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2439/])
MAPREDUCE-3868. Make Raid Compile. (Weiyan Wang via schen) (Revision 
1351548)

 Result = SUCCESS
schen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1351548
Files : 
* /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-raid-dist.xml
* /hadoop/common/trunk/hadoop-dist/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/conf
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/datanode/RaidBlockSender.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRaidUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/BlockFixer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DirectoryTraversal.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaid.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/GaloisField.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/JobMonitor.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/ReedSolomonCode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/sbin
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/hdfs/TestRaidDfs.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestBlockFixer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestDirectoryTraversal.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestErasureCodes.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidFilter.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidHar.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidPurge.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShellFsck.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonDecoder.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonEncoder.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* /hadoop/common/trunk/hadoop-hdfs-project/pom.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/bin
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/conf
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org
* /hadoop/common/trunk/hadoop-project/pom.xml


 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen
Assignee: 

[jira] [Commented] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler

2012-06-18 Thread Ahmed Radwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396442#comment-13396442
 ] 

Ahmed Radwan commented on MAPREDUCE-4336:
-

The fix looks fairly straightforward: set the queue name for 
GetQueueInfoRequest, and use "default" as the queue name if none is 
specified on the command line. I'll upload a patch now. 
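
A minimal sketch of the defaulting logic described above, written without any YARN dependencies. The --queue flag name and the string returned in place of a real GetQueueInfoRequest are assumptions for illustration only; the attached patch may be structured differently.

```java
import java.util.Objects;

final class QueueNameSketch {
    static final String DEFAULT_QUEUE = "default";

    /** Returns the value following the (hypothetical) --queue flag, or "default". */
    static String resolveQueue(String[] args) {
        for (int i = 0; i + 1 < args.length; i++) {
            if ("--queue".equals(args[i])) {
                String value = args[i + 1].trim();
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }
        return DEFAULT_QUEUE;
    }

    /** Stand-in for building the request; fails fast instead of passing null along. */
    static String buildQueueInfoRequest(String queueName) {
        Objects.requireNonNull(queueName, "queueName must be set before the request is sent");
        return "GetQueueInfoRequest[queue=" + queueName + "]";
    }

    public static void main(String[] args) {
        // With no flag the request still carries a queue name.
        System.out.println(buildQueueInfoRequest(resolveQueue(new String[] {})));
        System.out.println(buildQueueInfoRequest(resolveQueue(new String[] {"--queue", "prod"})));
    }
}
```

Resolving the name once, before the request is built, keeps the null case out of the scheduler call path regardless of which scheduler is configured.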

 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth

 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.





[jira] [Updated] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler

2012-06-18 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4336:


Attachment: MAPREDUCE-4336.patch

I have manually tested the patch by successfully submitting/running DS jobs on 
both the capacity and fifo schedulers.

 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
 Attachments: MAPREDUCE-4336.patch


 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.





[jira] [Assigned] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler

2012-06-18 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan reassigned MAPREDUCE-4336:
---

Assignee: Ahmed Radwan

 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4336.patch


 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.





[jira] [Commented] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler

2012-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396450#comment-13396450
 ] 

Hadoop QA commented on MAPREDUCE-4336:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532494/MAPREDUCE-4336.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 javadoc.  The javadoc tool appears to have generated 13 warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2473//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2473//console

This message is automatically generated.

 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4336.patch


 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.





[jira] [Commented] (MAPREDUCE-4350) Distributed Cache should put files read only on Task tracker

2012-06-18 Thread Kang Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396451#comment-13396451
 ] 

Kang Xiao commented on MAPREDUCE-4350:
--

+1. It will prevent tasks from writing to the same file in the 
DistributedCache directory.
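
As a rough illustration of the idea, the sketch below marks a file read-only after it has been written, using only the JDK. A temporary file stands in for a localized cache file, and the helper names are hypothetical; how the TaskTracker would actually apply permissions is not described in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReadOnlyCacheSketch {
    /** Writes content to a temp file, standing in for a localized cache file. */
    static Path localize(String content) {
        try {
            Path file = Files.createTempFile("distcache-", ".dat");
            file.toFile().deleteOnExit();
            Files.write(file, content.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Clears the write permission; returns true if the change was applied. */
    static boolean makeReadOnly(Path file) {
        return file.toFile().setReadOnly();
    }

    /** Reads the file back to show that readers are unaffected. */
    static String read(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path cached = localize("lookup-table");
        System.out.println("read-only applied: " + makeReadOnly(cached));
        System.out.println("content: " + read(cached));
    }
}
```

Note that a process running with elevated privileges (for example as root) can typically still write to a file marked this way, so the flag guards against accidental writes by ordinary tasks rather than providing strict isolation.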

 Distributed Cache should put files read only on Task tracker
 

 Key: MAPREDUCE-4350
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4350
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal







[jira] [Commented] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396454#comment-13396454
 ] 

Hudson commented on MAPREDUCE-3868:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #2388 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2388/])
MAPREDUCE-3868. Make Raid Compile. (Weiyan Wang via schen) (Revision 
1351548)

 Result = FAILURE
schen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1351548
Files : 
* /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-raid-dist.xml
* /hadoop/common/trunk/hadoop-dist/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/conf
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/datanode/RaidBlockSender.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRaidUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/BlockFixer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DirectoryTraversal.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaid.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/GaloisField.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/JobMonitor.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/ReedSolomonCode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/sbin
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/hdfs/TestRaidDfs.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestBlockFixer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestDirectoryTraversal.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestErasureCodes.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidFilter.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidHar.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidPurge.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShellFsck.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonDecoder.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonEncoder.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* /hadoop/common/trunk/hadoop-hdfs-project/pom.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/bin
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/conf
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org
* /hadoop/common/trunk/hadoop-project/pom.xml


 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen

[jira] [Commented] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396462#comment-13396462
 ] 

Andrew Purtell commented on MAPREDUCE-3868:
---

Can we get this on branch-2?

 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen
Assignee: Weiyan Wang
 Attachments: MAPREDUCE-3868-1.patch, MAPREDUCE-3868-2.patch, 
 MAPREDUCE-3868-3.patch, MAPREDUCE-3868.patch, MAPREDUCE-3868v1.patch, 
 MAPREDUCE-3868v1.sh


 Currently Raid is outdated and not compiled. Make it compile.





[jira] [Reopened] (MAPREDUCE-4345) ZK-based High Availability (HA) for ResourceManager (RM)

2012-06-18 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reopened MAPREDUCE-4345:



Thanks, Bikas! I agree it is related to resurrecting RM restart.

Arun - It isn't a duplicate, at least as I see it: MAPREDUCE-4326 targets 
restart recovery, while I opened this one to target proper HA (multiple RMs, 
failing over automatically, with client code covered too). It is what may come 
after restartability is achieved. Thanks, I've reopened it :)
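
To make the "client code covered too" part concrete, here is a small sketch of client-side failover across several RM addresses. It is purely illustrative and assumes a hypothetical list of addresses and a caller-supplied function that performs the RPC; it does not involve ZooKeeper and says nothing about how the active RM would actually be elected.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class FailoverSketch {
    /** Tries each RM address in order and returns the first successful response. */
    static <T> T callWithFailover(List<String> rmAddresses, Function<String, T> call) {
        RuntimeException last = null;
        for (String address : rmAddresses) {
            try {
                return call.apply(address);
            } catch (RuntimeException e) {
                last = e; // this RM is unreachable or not active; try the next one
            }
        }
        throw new IllegalStateException("No ResourceManager reachable", last);
    }

    public static void main(String[] args) {
        // Hypothetical addresses; rm1 is simulated as being down.
        List<String> rms = Arrays.asList("rm1.example.com:8032", "rm2.example.com:8032");
        String reply = callWithFailover(rms, address -> {
            if (address.startsWith("rm1")) {
                throw new IllegalStateException("connection refused");
            }
            return "submitted via " + address;
        });
        System.out.println(reply);
    }
}
```

Restart recovery alone would not need this loop, since the client would simply wait for the single RM to come back; that difference is the distinction drawn between the two issues above.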

 ZK-based High Availability (HA) for ResourceManager (RM)
 

 Key: MAPREDUCE-4345
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4345
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Harsh J
Assignee: Bikas Saha

 One of the goals presented on MAPREDUCE-279 was to have high availability. 
 One way that was discussed, per Mahadev/others on 
 https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
 {quote}
 Am not sure, if you already know about the MR-279 branch (the next version of 
 MR framework). We've been trying to integrate ZK into the framework from the 
 beginning. As for now, we are just doing restart with ZK but soon we should 
 have a HA soln with ZK.
 {quote}
 There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is 
 meant to track HA via ZK.
 Currently there isn't a HA solution for RM, via ZK or otherwise.
