date:20120618

Ahmed Radwan created MAPREDUCE-4346:
---

 Summary: Adding a refined version of JobTracker.getAllJobs() and 
exposing through the JobClient
 Key: MAPREDUCE-4346
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan


The current implementation of JobTracker.getAllJobs() returns all submitted 
jobs in any state, in addition to retired jobs. This list can be long and 
represents unneeded overhead, especially for clients interested only in 
jobs in specific state(s). 

It would be beneficial to add a refined version that returns only jobs with 
specific statuses and makes the inclusion of retired jobs optional. 

I'll be uploading an initial patch momentarily.
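
As a rough illustration of the proposal, the filtering could look like the sketch below. It does not reflect the attached patch: JobInfo, State and getJobs are hypothetical stand-ins for the real JobStatus and JobTracker types, assumed here only to show filtering by a set of states with an optional retired-jobs flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for the real JobStatus/JobTracker types; the names and
// fields are illustrative and are not the API added by the patch.
class JobFilterSketch {
    enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    static final class JobInfo {
        final String id;
        final State state;
        final boolean retired;

        JobInfo(String id, State state, boolean retired) {
            this.id = id;
            this.state = state;
            this.retired = retired;
        }
    }

    // Returns only jobs whose state is in 'wanted'. Retired jobs are included
    // only when includeRetired is true.
    static List<JobInfo> getJobs(List<JobInfo> all, Set<State> wanted, boolean includeRetired) {
        List<JobInfo> result = new ArrayList<>();
        for (JobInfo job : all) {
            if (job.retired && !includeRetired) {
                continue;
            }
            if (wanted.contains(job.state)) {
                result.add(job);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<JobInfo> all = Arrays.asList(
            new JobInfo("job_1", State.RUNNING, false),
            new JobInfo("job_2", State.SUCCEEDED, true),
            new JobInfo("job_3", State.FAILED, false));
        for (JobInfo job : getJobs(all, EnumSet.of(State.RUNNING, State.FAILED), false)) {
            System.out.println(job.id + " " + job.state);
        }
    }
}
```

A client interested only in running jobs would pass a single-element set and leave retired jobs out, so the returned list stays short.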

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4346:


Attachment: MAPREDUCE-4346.patch


[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4346:


Status: Patch Available  (was: Open)


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

[
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395486#comment-13395486
]

Hadoop QA commented on MAPREDUCE-4346:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12532379/MAPREDUCE-4346.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2465//console

This message is automatically generated.


[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

[
https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395490#comment-13395490
]

Hadoop QA commented on MAPREDUCE-4346:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12532379/MAPREDUCE-4346.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 patch. The patch command could not apply the patch.

Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2466//console

This message is automatically generated.


[jira] [Resolved] (MAPREDUCE-4114) saveVersion.sh fails if build directory contains space

2012-06-18 Thread Radim Kolar (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar resolved MAPREDUCE-4114.


Resolution: Duplicate

 saveVersion.sh fails if build directory contains space 
 ---

 Key: MAPREDUCE-4114
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4114
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.2
 Environment: FreeBSD 8.2, 64bit
Reporter: Radim Kolar

 If you rename the build directory to something without a space, such as 
 /tmp/hadoop, then it works.
 [INFO]
 [INFO] Building hadoop-yarn-common 0.23.3-SNAPSHOT
 [INFO]
 [INFO] --- maven-antrun-plugin:1.6:run (create-testdirs) @ hadoop-yarn-common ---
 [INFO] Executing tasks
 main:
 [INFO] Executed tasks
 [INFO]
 [INFO] --- maven-antrun-plugin:1.6:run (create-protobuf-generated-sources-directory) @ hadoop-yarn-common ---
 [INFO] Executing tasks
 main:
 [INFO] Executed tasks
 [INFO]
 [INFO] --- exec-maven-plugin:1.2:exec (generate-sources) @ hadoop-yarn-common ---
 [INFO]
 [INFO] --- exec-maven-plugin:1.2:exec (generate-version) @ hadoop-yarn-common ---
 scripts/saveVersion.sh: cannot create /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/workspace/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/target/generated-sources/version/org/apache/hadoop/yarn/package-info.java: No such file or directory
 [JENKINS] Archiving /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/workspace/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/pom.xml to /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/modules/org.apache.hadoop$hadoop-yarn-common/builds/2012-04-05_19-44-16/archive/org.apache.hadoop/hadoop-yarn-common/0.23.3-SNAPSHOT/hadoop-yarn-common-0.23.3-SNAPSHOT.pom
 [INFO] 
 
 [INFO] Reactor Summary:
 [INFO] 
 [INFO] Apache Hadoop Main  SUCCESS [5.124s]
 [INFO] Apache Hadoop Project POM . SUCCESS [1.692s]
 [INFO] Apache Hadoop Annotations . SUCCESS [1.672s]
 [INFO] Apache Hadoop Project Dist POM  SUCCESS [1.823s]
 [INFO] Apache Hadoop Assemblies .. SUCCESS [0.796s]
 [INFO] Apache Hadoop Auth  SUCCESS [2.456s]
 [INFO] Apache Hadoop Auth Examples ... SUCCESS [1.093s]
 [INFO] Apache Hadoop Common .. SUCCESS [23.648s]
 [INFO] Apache Hadoop Common Project .. SUCCESS [0.434s]
 [INFO] Apache Hadoop HDFS  SUCCESS [22.124s]
 [INFO] Apache Hadoop HttpFS .. SUCCESS [3.251s]
 [INFO] Apache Hadoop HDFS Project  SUCCESS [0.443s]
 [INFO] hadoop-yarn ... SUCCESS [1.175s]
 [INFO] hadoop-yarn-api ... SUCCESS [7.049s]
 [INFO] hadoop-yarn-common  FAILURE [5.565s]
 [INFO] hadoop-yarn-server  SKIPPED
 [INFO] hadoop-yarn-server-common . SKIPPED
 [INFO] hadoop-yarn-server-nodemanager  SKIPPED
 [INFO] hadoop-yarn-server-web-proxy .. SKIPPED
 [INFO] hadoop-yarn-server-resourcemanager  SKIPPED
 [INFO] hadoop-yarn-server-tests .. SKIPPED
 [INFO] hadoop-mapreduce-client ... SKIPPED
 [INFO] hadoop-mapreduce-client-core .. SKIPPED
 [INFO] hadoop-yarn-applications .. SKIPPED
 [INFO] hadoop-yarn-applications-distributedshell . SKIPPED
 [INFO] hadoop-yarn-site .. SKIPPED
 [INFO] hadoop-mapreduce-client-common  SKIPPED
 [INFO] hadoop-mapreduce-client-shuffle ... SKIPPED
 [INFO] hadoop-mapreduce-client-app ... SKIPPED
 [INFO] hadoop-mapreduce-client-hs  SKIPPED
 [INFO] hadoop-mapreduce-client-jobclient . SKIPPED
 [INFO] Apache Hadoop MapReduce Examples .. SKIPPED
 [INFO] hadoop-mapreduce .. SKIPPED
 [INFO] Apache Hadoop MapReduce Streaming . SKIPPED
 [INFO] Apache Hadoop Distributed Copy  SKIPPED
 [INFO] Apache Hadoop Archives  SKIPPED
 [INFO] Apache Hadoop Rumen ... SKIPPED
 [INFO] Apache Hadoop

[jira] [Commented] (MAPREDUCE-3968) add support for getNumMapTasks() into mapreduce JobContext

2012-06-18 Thread Radim Kolar (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395756#comment-13395756
 ] 

Radim Kolar commented on MAPREDUCE-3968:


Yes, I need to know the number of splits.

 add support for getNumMapTasks() into mapreduce JobContext
 --

 Key: MAPREDUCE-3968
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3968
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: trunk
 Environment: hadoop 0.22
Reporter: Radim Kolar
Priority: Minor
 Attachments: MAPREDUCE-3968.patch


 In the old mapred API there was a way to query the number of mappers:
 job.getNumMapTasks()
 No such function exists in the new mapreduce API.


[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4031:
-

Status: Open  (was: Patch Available)

 Node Manager hangs on shut down
 ---

 Key: MAPREDUCE-4031
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4031
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.2, 2.0.1-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical
 Attachments: MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, 
 nm-threaddump.out


 I have the MAPREDUCE-3862 changes, which fixed this issue earlier, and 
 yarn.nodemanager.delete.debug-delay-sec is set to its default value, but I am 
 still getting this issue.


[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4031:
-

Attachment: MAPREDUCE-4031.patch


[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-4031:
-

Status: Patch Available  (was: Open)

Thanks a lot Sid for looking into the patch.

The above test failures are not related to the patch. Resubmitting the same 
patch to trigger Jenkins.


[jira] [Created] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop

2012-06-18 Thread Suresh S (JIRA)

Suresh S created MAPREDUCE-4347:
---

 Summary: joined PhD. Intrested to do research in cloud especially 
in Hadoop
 Key: MAPREDUCE-4347
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4347
 Project: Hadoop Map/Reduce
  Issue Type: Wish
Reporter: Suresh S





[jira] [Updated] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work.

2012-06-18 Thread Suresh S (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh S updated MAPREDUCE-4347:


Summary: joined PhD. Intrested to do research in cloud especially in 
Hadoop. need suggession for problems to work.  (was: joined PhD. Intrested to 
do research in cloud especially in Hadoop)


[jira] [Commented] (MAPREDUCE-4328) Add the option to quiesce the JobTracker

2012-06-18 Thread Kang Xiao (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395841#comment-13395841
]

Kang Xiao commented on MAPREDUCE-4328:
--

This is useful in some conditions, such as when the NN is down. We actually
found a way to achieve the first goal by updating the fair scheduler's
configuration to set each pool's max share to zero.
The second goal will protect jobs from going to FAILED. But it seems
possible for a job to go to FAILED since no more tasks are scheduled.

It may be simpler to just not invoke assignTasks() in the JobTracker to
implement the first goal. That would not burden the scheduler implementation,
since 'safemode' is a low-probability event.
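
The idea of skipping assignTasks() while in safe mode can be sketched roughly as below. This is only an illustration under assumptions: QuiesceSketch, its Scheduler interface and the heartbeat method are hypothetical and much simpler than the real JobTracker and TaskScheduler code paths.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for the JobTracker heartbeat path; the real
// JobTracker and TaskScheduler classes have different signatures.
class QuiesceSketch {
    interface Scheduler {
        List<String> assignTasks(String tracker);
    }

    private final Scheduler scheduler;
    private volatile boolean safeMode = false;

    QuiesceSketch(Scheduler scheduler) {
        this.scheduler = scheduler;
    }

    void setSafeMode(boolean on) {
        safeMode = on;
    }

    // Called for each TaskTracker heartbeat. In safe mode the scheduler is never
    // consulted, so no new tasks are handed out and the scheduler needs no changes.
    List<String> heartbeat(String tracker) {
        if (safeMode) {
            return Collections.emptyList();
        }
        return scheduler.assignTasks(tracker);
    }

    public static void main(String[] args) {
        QuiesceSketch jt = new QuiesceSketch(t -> Collections.singletonList("task-for-" + t));
        System.out.println(jt.heartbeat("tt1"));
        jt.setSafeMode(true);
        System.out.println(jt.heartbeat("tt1"));
    }
}
```

Because the check sits in front of the scheduler call, every scheduler implementation gets the quiescing behaviour without being modified.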

Add the option to quiesce the JobTracker

Key: MAPREDUCE-4328
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4328
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Affects Versions: 1.0.3
Reporter: Arun C Murthy
Assignee: Arun C Murthy
Attachments: MAPREDUCE-4328.patch

In several failure scenarios it would be very handy to have an option to
quiesce the JobTracker.
Recently, we saw a case where the NameNode had to be rebooted at a customer
due to a random hardware failure - in such a case it would have been nice to
not lose jobs by quiescing the JobTracker.

[jira] [Resolved] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work.

2012-06-18 Thread Harsh J (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved MAPREDUCE-4347.


Resolution: Invalid

The JIRA exists to track issues with the project, not for discussions such as 
these.

Please send your email to mapreduce-...@hadoop.apache.org instead. Thanks!


[jira] [Commented] (MAPREDUCE-4039) Sort Avoidance

2012-06-18 Thread Kang Xiao (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395855#comment-13395855
]

Kang Xiao commented on MAPREDUCE-4039:
--

@Schubert, could you give some typical applications that benefit from sort
avoidance? It seems that, with this feature, a simple aggregation app such as
wordcount will use more memory while waiting for all keys to be processed.

Sort Avoidance
--

Key: MAPREDUCE-4039
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv2
Affects Versions: 0.23.2
Reporter: anty.rao
Assignee: anty
Priority: Minor
Fix For: 0.23.2

Attachments: MAPREDUCE-4039-branch-0.23.2.patch,
MAPREDUCE-4039-branch-0.23.2.patch

Inspired by
[Tenzing|http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/37200.pdf],
in 5.1 MapReduce Enhancements:
{quote}*Sort Avoidance*. Certain operators such as hash join
and hash aggregation require shuffling, but not sorting. The
MapReduce API was enhanced to automatically turn off
sorting for these operations. When sorting is turned off, the
mapper feeds data to the reducer which directly passes the
data to the Reduce() function bypassing the intermediate
sorting step. This makes many SQL operators significantly
more efficient.{quote}
There are a lot of applications which need aggregation only, not
sorting. Using sorting to achieve aggregation is costly and inefficient.
Without sorting, an application can use a hash table or hash map to do
aggregation efficiently. But the application should bear in mind that reduce
memory is limited: it is itself responsible for managing reduce memory and
guarding against running out of memory. Map-side combiners are not supported;
you can do hash aggregation on the map side as a workaround.
The following are the main points of the sort avoidance implementation:
# Add a configuration parameter ??mapreduce.sort.avoidance??, of boolean type,
to turn the sort avoidance workflow on or off. The two types of workflow
coexist.
# Key/value pairs emitted by the map function are sorted by partition only,
using a more efficient sorting algorithm: counting sort.
# The map-side merge uses a kind of byte merge, which just concatenates bytes
from the generated spills (read in bytes, write out bytes) without the overhead
of key/value serialization/deserialization and comparison that the current
version incurs.
# Reduce can start up as soon as any map output is available, in
contrast to the sort workflow, which must wait until all map outputs are fetched
and merged.
# Map output in memory can be directly consumed by reduce. When reduce can't
keep up with the speed of incoming map outputs, the in-memory merge thread will
kick in, merging in-memory map outputs onto disk.
# On-disk files are read sequentially to feed reduce, in contrast to the current
implementation, which reads multiple files concurrently, resulting in many disk
seeks. Map output in memory takes precedence over on-disk files in feeding the
reduce function.
I have already implemented this feature based on hadoop CDH3U3 and done some
performance evaluation; you can refer to
[https://github.com/hanborq/hadoop] for details. Now I'm willing to port it
to YARN. Comments are welcome.
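
To make the "sorted by partition only, using counting sort" point concrete, here is a minimal sketch. It is not taken from the attached patch or from IndexedCountingSortable.java; the class and method names are hypothetical, and records are represented only by their partition numbers.

```java
import java.util.Arrays;

// Illustrative counting sort that orders record indices by partition number only.
// Names are hypothetical; this is not the code from the attached patch.
class PartitionCountingSort {
    // partitions[i] is the partition of record i. Returns record indices grouped
    // by partition, preserving emit order within each partition (stable).
    static int[] sortByPartition(int[] partitions, int numPartitions) {
        int[] start = new int[numPartitions + 1];
        for (int p : partitions) {
            start[p + 1]++;            // count records per partition
        }
        for (int p = 0; p < numPartitions; p++) {
            start[p + 1] += start[p];  // prefix sums give each partition's first slot
        }
        int[] order = new int[partitions.length];
        for (int i = 0; i < partitions.length; i++) {
            order[start[partitions[i]]++] = i;  // place record i in its partition's next slot
        }
        return order;
    }

    public static void main(String[] args) {
        int[] partitions = {2, 0, 1, 0, 2};
        // prints [1, 3, 2, 0, 4]
        System.out.println(Arrays.toString(sortByPartition(partitions, 3)));
    }
}
```

The pass runs in time proportional to the number of records plus the number of partitions and never compares keys, which is the saving over a full key sort.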

[jira] [Commented] (MAPREDUCE-4343) ZK recovery support for ResourceManager

2012-06-18 Thread Sharad Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395869#comment-13395869
 ] 

Sharad Agarwal commented on MAPREDUCE-4343:
---

There is already MAPREDUCE-2713 for this. Some ZK code may be lying around, but 
it is not implemented yet.

Can this be marked as a duplicate?

 ZK recovery support for ResourceManager
 ---

 Key: MAPREDUCE-4343
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Harsh J
 Attachments: MR-4343.1.patch


 MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's 
 RM, but looks like it failed to complete it (for scalability reasons? etc?) 
 and there seems to be no JIRA tracking this feature that has been already 
 claimed publicly as a good part about YARN.
 If it did complete it, we should document how to use it. Setting the 
 following only yields:
 {code}
 <property>
 <name>yarn.resourcemanager.store.class</name>
 <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
 </property>
 <property>
 <name>yarn.resourcemanager.zookeeper-store.address</name>
 <value>test.vm:2181/yarn-recovery-store</value>
 </property>
 {code}
 {code}
 Error starting ResourceManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
 at org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
 at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
 Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2706)
 at java.lang.Class.getDeclaredConstructor(Class.java:1985)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
 ... 2 more
 {code}
 This JIRA is hence filed to track the addition/completion of recovery via ZK.
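
The stack trace above shows Class.getDeclaredConstructor failing, which is what happens when a class loaded by reflection has no no-argument constructor. The sketch below reproduces that behaviour in isolation; the two store classes are hypothetical and are not the real ZKStore.

```java
import java.lang.reflect.Constructor;

// Self-contained illustration; the nested store classes are hypothetical and
// only mimic "has a no-arg constructor" versus "does not".
class NoArgConstructorSketch {
    static class StoreWithoutDefaultCtor {
        StoreWithoutDefaultCtor(String address) { }
    }

    static class StoreWithDefaultCtor {
        StoreWithDefaultCtor() { }
    }

    // Returns true if the class can be instantiated through a no-arg constructor.
    static boolean canInstantiate(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (NoSuchMethodException e) {
            // No constructor taking zero arguments is declared on the class.
            System.out.println("NoSuchMethodException: " + e.getMessage());
            return false;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(StoreWithDefaultCtor.class));
        System.out.println(canInstantiate(StoreWithoutDefaultCtor.class));
    }
}
```

A store class meant to be created this way therefore needs a no-argument constructor, with any configuration supplied after construction.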


[jira] [Created] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private

Steve Loughran created MAPREDUCE-4348:
-

 Summary: JobSubmissionProtocol should be made public, not package 
private
 Key: MAPREDUCE-4348
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Steve Loughran
Priority: Minor


The JobSubmissionProtocol interface is package private, yet it is the only way 
to remotely query the status of the JT or the cluster. 

Even if Job Submission is considered private, probing JT state shouldn't be.


[jira] [Commented] (MAPREDUCE-4341) add types to capacity scheduler properties documentation

2012-06-18 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395887#comment-13395887
 ] 

Thomas Graves commented on MAPREDUCE-4341:
--

Can you also add it for max capacity, please?

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).


[jira] [Updated] (MAPREDUCE-4039) Sort Avoidance

2012-06-18 Thread anty.rao (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

anty.rao updated MAPREDUCE-4039:

Attachment: IndexedCountingSortable.java

the missing file.

Sort Avoidance
--

Attachments: IndexedCountingSortable.java,
MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch

[jira] [Updated] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-4348:
--

Attachment: MAPREDUCE-4348.patch

Makes the interface public but marks it as private and evolving. 

 JobSubmissionProtocol should be made public, not package private
 

 Key: MAPREDUCE-4348
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Steve Loughran
Priority: Minor
 Attachments: MAPREDUCE-4348.patch

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h


[jira] [Updated] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-4348:
--

Assignee: Steve Loughran
Target Version/s: 1.1.0
  Status: Patch Available  (was: Open)


[jira] [Commented] (MAPREDUCE-4039) Sort Avoidance

2012-06-18 Thread anty.rao (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395916#comment-13395916
]

anty.rao commented on MAPREDUCE-4039:
-

@Kang
Yes, you are right.
Using merge-sort to achieve aggregation may not use as much memory as hash
aggregation with this feature. But the merge-sort process requires a lot of
unnecessary work and consumes more resources, e.g. CPU, disk, network.
It's just a tradeoff that depends on your use case, latency requirements, etc.
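
The memory side of this tradeoff can be seen in a small hash aggregation sketch for a wordcount-style reduce. The class below is hypothetical and independent of Hadoop; it only assumes that keys arrive in arbitrary order because sorting is turned off.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wordcount-style aggregation over unsorted keys; not Hadoop code.
class HashAggregationSketch {
    // Counts occurrences of each key. Because keys are not grouped, the map must
    // hold one entry per distinct key until the whole input has been consumed.
    static Map<String, Integer> aggregate(List<String> keys) {
        Map<String, Integer> counts = new HashMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("b", "a", "b", "c", "a", "b");
        System.out.println(aggregate(keys));
    }
}
```

With sorted input a reducer can emit each key's total as soon as the key changes and keep only one running count, whereas here memory grows with the number of distinct keys.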


[jira] [Resolved] (MAPREDUCE-4298) NodeManager crashed after running out of file descriptors

2012-06-18 Thread Thomas Graves (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves resolved MAPREDUCE-4298.
--

Resolution: Duplicate

dup of HADOOP-8495

 NodeManager crashed after running out of file descriptors
 -

 Key: MAPREDUCE-4298
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4298
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, nodemanager
Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: MAPREDUCE-4298.patch


 A node on one of our clusters fell over because it ran out of open file 
 descriptors.  Log details with stack traceback to follow.


[jira] [Commented] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2012-06-18 Thread Robert Joseph Evans (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395942#comment-13395942
]

Robert Joseph Evans commented on MAPREDUCE-4342:

A couple of comments.
# Minor correction to the grammar. {code}LOG.warn("Local Cache is been
deleted... Downloading the cache again");{code} should be {code}LOG.warn("Local
Cache has been deleted... Downloading the cache again");{code}
# Please run test-patch on it and post the results.
# I believe that this problem also exists in trunk and branch 2. It would be
good to investigate and possibly file a JIRA, or post a patch for them as well.

It looks good, but it is not perfect. It will work in the case where a single
base distributed cache file or directory was deleted, but it will not work in
the case where a file was corrupted, where a file in a cache archive was
deleted, where new files were added, etc. I agree that we want to be able to
deal with a file being removed, but I personally think that prevention is
preferable to recovery, although it may not be as backwards compatible. I
would prefer to see all of the files created in the distributed cache be marked
as read only. If the files are part of a private cache and someone messes with
them by modifying the permissions, then it is on their head, and they need to
modify the original HDFS file to force it to download a new copy.

Checking for corruption caused by FS/disk issues is a separate problem that we
probably also want to look into, now that the data in the distributed cache can
live for long periods of time.
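
As a rough illustration of the read-only idea, the sketch below clears the write bits on everything under a localized cache directory. It is only a sketch under stated assumptions: it uses java.nio.file from plain Java rather than Hadoop's own file utilities, and the class and method names are hypothetical, not part of any patch on this issue.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical illustration only; not the TaskTracker's localization code.
class ReadOnlyCacheSketch {

    // Walks the tree rooted at cacheRoot and removes write permission
    // for everyone on each file and directory it finds.
    static void markReadOnly(Path cacheRoot) throws IOException {
        try (Stream<Path> paths = Files.walk(cacheRoot)) {
            paths.forEach(path -> {
                File file = path.toFile();
                // ownerOnly=false clears the write bit for owner, group and others.
                if (!file.setWritable(false, false)) {
                    throw new UncheckedIOException(
                        new IOException("could not make read-only: " + path));
                }
            });
        }
    }
}
```

A user who owns the files can still restore write permission, which is the "on their head" case described above; the point is only to stop accidental deletion or modification.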

Distributed Cache gives inconsistent result if cache files get deleted from
task tracker
-

Key: MAPREDUCE-4342
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal
Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22.patch

[jira] [Commented] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395952#comment-13395952
 ] 

Steve Loughran commented on MAPREDUCE-4348:
---

# no tests, this is a package scope change, not a new feature. 
# it is to be applied against the 1.x branch


 JobSubmissionProtocol should be made public, not package private
 

 Key: MAPREDUCE-4348
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Affects Versions: 1.0.3
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Attachments: MAPREDUCE-4348.patch

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 The JobSubmissionProtocol interface is package private, yet it is the only 
 way to remotely query the status of the JT or the cluster. 
 Even if Job Submission is considered private, probing JT state shouldn't be.


[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4341:


Status: Open  (was: Patch Available)

Will add the documentation for max-capacity as well, and upload another patch 
shortly.

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).


[jira] [Commented] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.

2012-06-18 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395964#comment-13395964
 ] 

Jason Lowe commented on MAPREDUCE-4339:
---

I am unable to reproduce a hang like this on a single-node cluster.  Could you 
examine the ResourceManager logs for issues or post them (after any necessary 
scrubbing/anonymization)? That would help track down what's going on when the 
job hangs.

 pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is 
 included in the setting environment.
 -

 Key: MAPREDUCE-4339
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples, job submission, mrv2, scheduler
Affects Versions: 0.23.0
 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, 
Reporter: srikanth ayalasomayajulu
  Labels: hadoop
 Fix For: 0.23.0

   Original Estimate: 48h
  Remaining Estimate: 48h

 Tried to include default capacity scheduler in hadoop and tried to run an 
 example pi program. The job hangs and no more output is getting displayed.
 Starting Job
 2012-06-12 22:10:02,524 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) - 
 Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
 2012-06-12 22:10:02,538 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,539 INFO  ipc.HadoopYarnRPC 
 (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy 
 for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
 2012-06-12 22:10:02,665 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at 
 localhost/127.0.0.1:8030
 2012-06-12 22:10:02,727 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. 
 Instead, use fs.defaultFS
 2012-06-12 22:10:02,728 WARN  conf.Configuration 
 (Configuration.java:handleDeprecation(343)) - 
 mapred.used.genericoptionsparser is deprecated. Instead, use 
 mapreduce.client.genericoptionsparser.used
 2012-06-12 22:10:02,831 INFO  input.FileInputFormat 
 (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10
 2012-06-12 22:10:02,900 INFO  mapreduce.JobSubmitter 
 (JobSubmitter.java:submitJobInternal(362)) - number of splits:10
 2012-06-12 22:10:03,044 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster 
 capability = memory: 2048
 2012-06-12 22:10:03,286 INFO  mapred.YARNRunner 
 (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch 
 container for ApplicationMaster is : $JAVA_HOME/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> 
 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
 2><LOG_DIR>/stderr 
 2012-06-12 22:10:03,370 INFO  mapred.ResourceMgrDelegate 
 (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application 
 application_1339507608976_0002 to ResourceManager
 2012-06-12 22:10:03,432 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002
 2012-06-12 22:10:04,443 INFO  mapreduce.Job 
 (Job.java:monitorAndPrintJob(1227)) -  map 0% reduce 0%


[jira] [Updated] (MAPREDUCE-3889) job client tries to use /tasklog interface, but that doesn't exist anymore


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated MAPREDUCE-3889:
-

 Target Version/s: 2.0.0-alpha, 0.23.3, 3.0.0  (was: 0.23.3, 2.0.0-alpha, 
3.0.0)
Affects Version/s: 3.0.0
   2.0.1-alpha

 job client tries to use /tasklog interface, but that doesn't exist anymore
 --

 Key: MAPREDUCE-3889
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3889
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.1, 2.0.1-alpha, 3.0.0
Reporter: Thomas Graves
Assignee: Devaraj K
Priority: Critical
 Attachments: MAPREDUCE-3889.patch, MAPREDUCE-3889.patch


 if you specify  -Dmapreduce.client.output.filter=SUCCEEDED option when 
 running a job it tries to fetch task logs to print out on the client side 
 from a url like: 
 http://nodemanager:8080/tasklog?plaintext=true&attemptid=attempt_1329857083014_0003_r_00_0&filter=stdout
 It always errors on this request with: Required param job, map and reduce
 We saw this error when using distcp and the distcp failed. I'm not sure if it 
 is mandatory for distcp or just informational purposes.  I'm guessing the 
 latter.


[jira] [Updated] (MAPREDUCE-4345) ZK-based High Availability (HA) for ResourceManager (RM)

[
https://issues.apache.org/jira/browse/MAPREDUCE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Bikas Saha updated MAPREDUCE-4345:
--

Assignee: Bikas Saha

Assigning to myself since this looks like something that follows directly after
MAPREDUCE-4326, and the design/implementation would be closely related to it.

ZK-based High Availability (HA) for ResourceManager (RM)

Key: MAPREDUCE-4345
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4345
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Harsh J
Assignee: Bikas Saha

One of the goals presented on MAPREDUCE-279 was to have high availability.
One way that was discussed, per Mahadev/others on
https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK:
{quote}
Am not sure, if you already know about the MR-279 branch (the next version of
MR framework). We've been trying to integrate ZK into the framework from the
beginning. As for now, we are just doing restart with ZK but soon we should
have a HA soln with ZK.
{quote}
There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is
meant to track HA via ZK.
Currently there isn't a HA solution for RM, via ZK or otherwise.

[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396041#comment-13396041
 ] 

Bikas Saha commented on MAPREDUCE-4326:
---

Will be posting a preliminary design sketch this week for comments.

 Resurrect RM Restart 
 -

 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha

 We should resurrect 'RM Restart' which we disabled sometime during the RM 
 refactor.


[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4341:


Attachment: (was: MR-4341.patch)

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla

 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).


[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4341:


Attachment: MR-4341.patch

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).


[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-4341:


Fix Version/s: 0.23.3
   Status: Patch Available  (was: Open)

Modified documentation to mention both capacity and max-capacity are of type 
float.

Didn't test.

 add types to capacity scheduler properties documentation
 

 Key: MAPREDUCE-4341
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
 Fix For: 0.23.3

 Attachments: MR-4341.patch


 MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. 
 We should document that in the capacity scheduler properties docs 
 (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).


[jira] [Commented] (MAPREDUCE-4341) add types to capacity scheduler properties documentation

[
https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396051#comment-13396051
]

Hadoop QA commented on MAPREDUCE-4341:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12532428/MR-4341.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2469//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2469//console

This message is automatically generated.

add types to capacity scheduler properties documentation

Key: MAPREDUCE-4341
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/capacity-sched, mrv2
Affects Versions: 0.23.3
Reporter: Thomas Graves
Assignee: Karthik Kambatla
Fix For: 0.23.3

Attachments: MR-4341.patch

MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats.
We should document that in the capacity scheduler properties docs
(http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).

[jira] [Commented] (MAPREDUCE-4343) ZK recovery support for ResourceManager

2012-06-18 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396063#comment-13396063
 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-4343:
---

Sharad, 

Bikas marked MAPREDUCE-2713 as a duplicate.

 ZK recovery support for ResourceManager
 ---

 Key: MAPREDUCE-4343
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Harsh J
 Attachments: MR-4343.1.patch


 MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's 
 RM, but looks like it failed to complete it (for scalability reasons? etc?) 
 and there seems to be no JIRA tracking this feature that has been already 
 claimed publicly as a good part about YARN.
 If it did complete it, we should document how to use it. Setting the 
 following only yields:
 {code}
 <property>
 <name>yarn.resourcemanager.store.class</name>
 <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
 </property>
 <property>
 <name>yarn.resourcemanager.zookeeper-store.address</name>
 <value>test.vm:2181/yarn-recovery-store</value>
 </property>
 {code}
 {code}
 Error starting ResourceManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2706)
 at java.lang.Class.getDeclaredConstructor(Class.java:1985)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
 ... 2 more
 {code}
 This JIRA is hence filed to track the addition/completion of recovery via ZK.
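
The stack trace above fails inside Class.getDeclaredConstructor, called from ReflectionUtils.newInstance. That is the error Java reflection raises when a class is asked for a no-argument constructor it does not declare, which suggests (as an assumption, since the ZKStore source is not shown here) that ZKStore only has constructors that take arguments. The sketch below reproduces the same failure mode with two hypothetical classes; none of the names come from Hadoop.

```java
// Hypothetical illustration only; these are not the YARN store classes.
class NoArgCtorSketch {

    // A store that can only be built from an address, so it has
    // no no-argument constructor.
    static class StoreWithArg {
        final String address;

        StoreWithArg(String address) {
            this.address = address;
        }
    }

    // A store that a no-argument reflective factory can instantiate.
    static class StoreNoArg {
        StoreNoArg() {
        }
    }

    // Returns true if cls declares a no-argument constructor, i.e. if a
    // factory that calls getDeclaredConstructor() with no parameter
    // types would be able to create it.
    static boolean hasNoArgConstructor(Class<?> cls) {
        try {
            cls.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            // Same exception type as in the ResourceManager stack trace.
            return false;
        }
    }
}
```

Under that assumption, either the store class needs a no-argument constructor plus a separate initialization step, or the factory has to look up a constructor with the matching parameter types.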


[jira] [Updated] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4290:
--

Status: Open  (was: Patch Available)

There need to be a couple of additional cases within the FINISHED state to deal 
with KILLED/FAILED. Other than that the patch looks good.

Another problem with the getAllJobs() API: it gets the application list from 
the RM, which means it's going to convert non-MapReduce apps as well. I don't 
believe there's any good way to differentiate between application types from 
the RM list.
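
A minimal sketch of the extra FINISHED cases, assuming the application report carries both a lifecycle state and a separate final status. The enums below are hypothetical stand-ins loosely modeled on that idea, not the real YARN or MapReduce types, and mapping an undefined final status to FAILED is a choice made for this sketch only.

```java
// Hypothetical illustration only; enum and method names are assumptions.
class JobStateMappingSketch {

    enum AppState { NEW, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    enum FinalStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }

    enum JobState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    // Maps an application's lifecycle state and final status to a job state.
    // FINISHED alone is ambiguous, so the final status decides the outcome.
    static JobState toJobState(AppState state, FinalStatus finalStatus) {
        switch (state) {
            case NEW:
            case SUBMITTED:
            case ACCEPTED:
                return JobState.PREP;
            case RUNNING:
                return JobState.RUNNING;
            case FINISHED:
                switch (finalStatus) {
                    case SUCCEEDED:
                        return JobState.SUCCEEDED;
                    case KILLED:
                        return JobState.KILLED;
                    default:
                        // FAILED, plus UNDEFINED as a conservative fallback.
                        return JobState.FAILED;
                }
            case KILLED:
                return JobState.KILLED;
            default:
                return JobState.FAILED;
        }
    }
}
```

With a mapping like this, a failed job whose application reached FINISHED is no longer reported as SUCCEEDED.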

 JobStatus.getState() API is giving ambiguous values
 ---

 Key: MAPREDUCE-4290
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Nishan Shetty
Assignee: Devaraj K
 Attachments: MAPREDUCE-4290.patch


 For failed job getState() API is giving status as SUCCEEDED if we use 
 JobClient.getAllJobs() for retrieving all jobs info from RM.


[jira] [Resolved] (MAPREDUCE-4343) ZK recovery support for ResourceManager

2012-06-18 Thread Arun C Murthy (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy resolved MAPREDUCE-4343.
--

Resolution: Duplicate

Duplicate of MAPREDUCE-4326.

 ZK recovery support for ResourceManager
 ---

 Key: MAPREDUCE-4343
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Harsh J
 Attachments: MR-4343.1.patch


 MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's 
 RM, but looks like it failed to complete it (for scalability reasons? etc?) 
 and there seems to be no JIRA tracking this feature that has been already 
 claimed publicly as a good part about YARN.
 If it did complete it, we should document how to use it. Setting the 
 following only yields:
 {code}
 <property>
 <name>yarn.resourcemanager.store.class</name>
 <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
 </property>
 <property>
 <name>yarn.resourcemanager.zookeeper-store.address</name>
 <value>test.vm:2181/yarn-recovery-store</value>
 </property>
 {code}
 {code}
 Error starting ResourceManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2706)
 at java.lang.Class.getDeclaredConstructor(Class.java:1985)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
 ... 2 more
 {code}
 This JIRA is hence filed to track the addition/completion of recovery via ZK.


[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396080#comment-13396080
 ] 

Siddharth Seth commented on MAPREDUCE-4306:
---

The -user option in general seems to be broken. Even after this patch, the AM 
will be localized as the original user - since the RM picks up the username 
from ugi.

Maybe we should remove the -user option completely and use 
ApplicationConstants.Environment.USER in the AM, which is set by the RM 
anyway, based on the logged-in user?

 Problem running Distributed Shell applications as a user other than the one 
 started the daemons
 ---

 Key: MAPREDUCE-4306
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch


 Using the tarball, if you start the yarn daemons as one user and then 
 switch to a different user, you can successfully run MR jobs, but DS jobs 
 fail to run. DS jobs can only be run as the user who started the daemons.


[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

2012-06-18 Thread Mayank Bansal (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396090#comment-13396090
 ] 

Mayank Bansal commented on MAPREDUCE-4349:
--

Distributed Cache gives inconsistent results if archive files get deleted from 
the task tracker. The DC still thinks that it has the files even though they 
have been deleted.

 Distributed Cache gives inconsistent result if cache Archive files get 
 deleted from task tracker 
 -

 Key: MAPREDUCE-4349
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal




[jira] [Updated] (MAPREDUCE-4326) Resurrect RM Restart

2012-06-18 Thread Tsuyoshi OZAWA (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4326:
--

Attachment: MR-4343.1.patch

Bikas,

The attached patch was originally created for MAPREDUCE-4343, which is marked as 
a duplicate of this ticket.

The patch may be useful as a reference, so I attached it to this ticket.

 Resurrect RM Restart 
 -

 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha
 Attachments: MR-4343.1.patch


 We should resurrect 'RM Restart' which we disabled sometime during the RM 
 refactor.


[jira] [Created] (MAPREDUCE-4350) Distributed Cache should put files read only on Task tracker

2012-06-18 Thread Mayank Bansal (JIRA)

Mayank Bansal created MAPREDUCE-4350:


 Summary: Distributed Cache should put files read only on Task 
tracker
 Key: MAPREDUCE-4350
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4350
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Affects Versions: 1.0.3, 0.22.0, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal





[jira] [Commented] (MAPREDUCE-4350) Distributed Cache should put files read only on Task tracker

2012-06-18 Thread Mayank Bansal (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396096#comment-13396096
 ] 

Mayank Bansal commented on MAPREDUCE-4350:
--

This issue is based on the comment posted by Robert:

https://issues.apache.org/jira/browse/MAPREDUCE-4342?focusedCommentId=13395942&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13395942

Thanks,
Mayank

 Distributed Cache should put files read only on Task tracker
 

 Key: MAPREDUCE-4350
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4350
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal




[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396103#comment-13396103
 ] 

Bikas Saha commented on MAPREDUCE-4326:
---

Thanks! I will take a look before posting the design.

 Resurrect RM Restart 
 -

 Key: MAPREDUCE-4326
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Arun C Murthy
Assignee: Bikas Saha
 Attachments: MR-4343.1.patch


 We should resurrect 'RM Restart' which we disabled sometime during the RM 
 refactor.


[jira] [Commented] (MAPREDUCE-4203) Create equivalent of ProcfsBasedProcessTree for Windows

2012-06-18 Thread Jonathan Eagles (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396132#comment-13396132
 ] 

Jonathan Eagles commented on MAPREDUCE-4203:


Thanks, Bikas. Just trying to prevent Hadoop code from being contaminated with 
GPL or proprietary code licenses. Sounds like you are already controlling for 
that.

 Create equivalent of ProcfsBasedProcessTree for Windows
 ---

 Key: MAPREDUCE-4203
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4203
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: MAPREDUCE-4203.branch-1-win.1.patch, 
 MAPREDUCE-4203.patch, test.cpp


 ProcfsBasedProcessTree is used by the TaskTracker to get process information 
 like memory and cpu usage. This information is used to manage resources etc. 
 The current implementation is based on Linux procfs functionality and hence 
 does not work on other platforms, specifically windows.


[jira] [Commented] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396130#comment-13396130
 ] 

Karthik Kambatla commented on MAPREDUCE-4288:
-

In YARN, the ClusterMetrics should only correspond to numNodeManagers, 
numActiveJobs(), numActiveContainers(), availableResources(). Other 
job/app-specific metrics should move to the corresponding AMs. JobStatus would 
be a good place to have these metrics.

Subsequently, JobClient.getClusterStatus() can correspond to the job-specific 
metrics (though the name would then be a misnomer).

Comments?

 ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one 
 when no job is running
 ---

 Key: MAPREDUCE-4288
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Nishan Shetty
Assignee: Karthik Kambatla

 When no job is running in the cluster, invoke the ClusterStatus.getMapTasks() 
 and ClusterStatus.getReduceTasks() APIs.
 Observed that these APIs return one instead of zero (even though no job is 
 running).


[jira] [Updated] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4335:
--

Attachment: MR4335_4.txt

Thanks for taking a look Arun.

Updated the patch with the default scheduler defined in YarnConfiguration. Had 
to move the class loading into the ResourceManager instead of relying on 
Configuration.getClass...

 Change the default scheduler to the CapacityScheduler
 -

 Key: MAPREDUCE-4335
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt


 There are some bugs in the FifoScheduler at the moment: it doesn't distribute 
 tasks across nodes, and it has some headroom (available resource) issues.
 That's not the best experience for users trying out the 2.0 branch. The 
 CapacityScheduler (CS), with its default configuration of a single queue, 
 behaves the same as the FifoScheduler and doesn't have these issues.


[jira] [Updated] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated MAPREDUCE-4335:
--

Status: Patch Available  (was: Open)

 Change the default scheduler to the CapacityScheduler
 -

 Key: MAPREDUCE-4335
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt


 There are some bugs in the FifoScheduler at the moment: it doesn't distribute 
 tasks across nodes, and it has some headroom (available resource) issues.
 That's not the best experience for users trying out the 2.0 branch. The 
 CapacityScheduler (CS), with its default configuration of a single queue, 
 behaves the same as the FifoScheduler and doesn't have these issues.


[jira] [Commented] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler

[
https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396178#comment-13396178
]

Hadoop QA commented on MAPREDUCE-4335:
--

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12532445/MR4335_4.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 10 new or modified test
files.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2470//testReport/
Console output:
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2470//console

This message is automatically generated.

Change the default scheduler to the CapacityScheduler
-

Key: MAPREDUCE-4335
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335
Project: Hadoop Map/Reduce
Issue Type: Task
Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt

There are some bugs in the FifoScheduler at the moment: it doesn't distribute
tasks across nodes, and it has some headroom (available resource) issues.
That's not the best experience for users trying out the 2.0 branch. The
CapacityScheduler (CS), with its default configuration of a single queue,
behaves the same as the FifoScheduler and doesn't have these issues.

[jira] [Commented] (MAPREDUCE-3235) Improve CPU cache behavior in map side sort

2012-06-18 Thread Todd Lipcon (JIRA)

[
https://issues.apache.org/jira/browse/MAPREDUCE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396222#comment-13396222
]

Todd Lipcon commented on MAPREDUCE-3235:

bq. BTW, I know you are interested in JVM intrinsic binary array compare

I guess you're working with Krystal Mok? Cool stuff, I hope to see it make it
into OpenJDK as well!

bq. Almost the same, depends on if there are rack local maps. the more rack
local maps, the slower.

You mean if there are more rack-local maps (as opposed to data-local ones),
right? If everything is data-local (e.g. terasort on an empty cluster), then I
would expect the CPU savings to make a more noticeable difference.

Improve CPU cache behavior in map side sort
---

Key: MAPREDUCE-3235
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3235
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: performance, task
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Attachments: map_sort_perf.diff, mr-3235-poc.txt

When running oprofile on a terasort workload, I noticed that a large amount
of CPU usage was going to MapTask$MapOutputBuffer.compare. Upon disassembling
this and looking at cycle counters, most of the cycles were going to memory
loads dereferencing into the array of key-value data -- implying expensive
cache misses. This can be avoided as follows:
- rather than simply swapping indexes into the kv array, swap the entire meta
entries in the meta array. Swapping 16 bytes is only negligibly slower than
swapping 4 bytes. This requires adding the value-length into the meta array,
since we used to rely on the previous-in-the-array meta entry to determine
this. So we replace INDEX with VALUELEN and avoid one layer of indirection.
- introduce an interface which allows key types to provide a 4-byte
comparison proxy. For string keys, this can simply be the first 4 bytes of
the string. The idea is that, if stringCompare(key1.proxy(), key2.proxy()) !=
0, then compare(key1, key2) should have the same result. If the proxies are
equal, the normal comparison method is used. We then include the 4-byte proxy
as part of the metadata entry, so that for many cases the indirection into
the data buffer can be avoided.
On a terasort benchmark, these optimizations plus an optimization to
WritableComparator.compareBytes dropped the aggregate mapside CPU millis by
40%, and the compare() routine mostly dropped off the oprofile results.
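
The comparison-proxy idea in the description above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code from the attached patches: the names ProxySortSketch, proxyOf and Meta are hypothetical, keys are compared as unsigned bytes, and keys shorter than 4 bytes are assumed to be zero-padded when the proxy is built.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of a 4-byte comparison proxy; not taken from the patch.
final class ProxySortSketch {

    // Packs the first 4 key bytes into an int; shorter keys are zero-padded.
    static int proxyOf(byte[] key) {
        int p = 0;
        for (int i = 0; i < 4; i++) {
            int b = i < key.length ? (key[i] & 0xff) : 0;
            p = (p << 8) | b;
        }
        return p;
    }

    // Slow path: full lexicographic comparison of unsigned bytes.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Metadata entry that stores the proxy inline, next to the key reference.
    static final class Meta {
        final int proxy;
        final byte[] key;

        Meta(byte[] key) {
            this.key = key;
            this.proxy = proxyOf(key);
        }
    }

    // Fast path first: the key bytes are only read when the proxies tie.
    static int compare(Meta x, Meta y) {
        int c = Integer.compareUnsigned(x.proxy, y.proxy);
        return c != 0 ? c : compareBytes(x.key, y.key);
    }

    // Sorts the metadata entries rather than the key data itself.
    static String[] sort(String... keys) {
        Meta[] metas = new Meta[keys.length];
        for (int i = 0; i < keys.length; i++) {
            metas[i] = new Meta(keys[i].getBytes(StandardCharsets.UTF_8));
        }
        Arrays.sort(metas, ProxySortSketch::compare);
        String[] out = new String[keys.length];
        for (int i = 0; i < metas.length; i++) {
            out[i] = new String(metas[i].key, StandardCharsets.UTF_8);
        }
        return out;
    }
}
```

In this sketch, keys that differ within their first four bytes never reach compareBytes, which is the indirection the description aims to avoid. Keys sharing a 4-byte prefix still fall back to the full comparison.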

[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons

[
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396223#comment-13396223
]

Ahmed Radwan commented on MAPREDUCE-4306:
-

Thanks Siddharth for the review!

I agree, I think it is better to completely remove the -user option. I
originally thought of just keeping it in case it can be used for testing or
other purposes. But leaving it now may lead to confusion, and also setting it
to something other than the original user will lead to failure as described
above.

Also reading ApplicationConstants.Environment.USER is simpler than reevaluating
the username from ugi (which will give the same result after all). I have
updated the patch accordingly. Thanks!

Problem running Distributed Shell applications as a user other than the one
started the daemons
---

Key: MAPREDUCE-4306
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
Fix For: 2.0.1-alpha

Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch

Using the tarball, if you start the YARN daemons as one user and then switch
to a different user, you can successfully run MR jobs, but DS jobs fail to
run. DS jobs can only be run by the user who started the daemons.

[jira] [Updated] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4306:


Attachment: MAPREDUCE-4306_rev3.patch

 Problem running Distributed Shell applications as a user other than the one 
 started the daemons
 ---

 Key: MAPREDUCE-4306
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
 Fix For: 2.0.1-alpha

 Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch, 
 MAPREDUCE-4306_rev3.patch


 Using the tarball, if you start the YARN daemons as one user and then switch 
 to a different user, you can successfully run MR jobs, but DS jobs fail to 
 run. DS jobs can only be run by the user who started the daemons.


[jira] [Commented] (MAPREDUCE-4334) Add support for CPU isolation/monitoring of containers

2012-06-18 Thread Andrew Ferguson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396261#comment-13396261
 ] 

Andrew Ferguson commented on MAPREDUCE-4334:


ok, putting all of this in the ContainerExecutor is not the way to go, as it 
precludes use of secure Hadoop's Linux container-executor.

In my new design, ContainerMonitor will be a pluggable component, just as 
ContainerExecutor is now. Then, we can provide a ContainerMonitor which uses 
cgroups to control resource usage, rather than the existing ContainerMonitor 
(to be renamed as DefaultContainerMonitor). This has several advantages:
1) allows us to keep existing ContainerMonitor for users who can't use cgroups 
(eg, users without root access during Hadoop setup)
2) ContainerMonitor already receives an event when it's time to stop 
monitoring, which we can use as notification to delete the container's cgroup
3) ContainerMonitor receives the resource limits already; no need to calculate 
them based on the configs
4) A pluggable ContainerMonitor paves the way for ContainerMonitors on other 
platforms

I will first open a sub-task to make ContainerMonitor pluggable.
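
As a rough illustration of the pluggable design described above, the following Java sketch selects a monitor implementation by class name from configuration and falls back to a default. Everything here is an assumption made for illustration: the interface MonitorSketch, both implementations, and the configuration key are hypothetical, java.util.Properties stands in for the Hadoop Configuration, and the "cgroups" class only records limits in memory instead of touching real cgroups.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

// Hypothetical plug-in interface; the real ContainersMonitor API may differ.
interface MonitorSketch {
    void start(String containerId, long memoryLimitBytes);
    void stop(String containerId);
    int active();
}

// Stand-in for the existing monitor, kept as the default implementation.
class DefaultMonitorSketch implements MonitorSketch {
    private final Set<String> containers = new HashSet<>();
    public void start(String containerId, long memoryLimitBytes) { containers.add(containerId); }
    public void stop(String containerId) { containers.remove(containerId); }
    public int active() { return containers.size(); }
}

// Stand-in for a cgroups-backed monitor; limits are only tracked in memory here.
class CgroupsMonitorSketch implements MonitorSketch {
    private final Map<String, Long> cgroupLimits = new HashMap<>();
    // A real implementation would create the container's cgroup at this point.
    public void start(String containerId, long memoryLimitBytes) {
        cgroupLimits.put(containerId, memoryLimitBytes);
    }
    // The stop event is where the container's cgroup would be deleted.
    public void stop(String containerId) { cgroupLimits.remove(containerId); }
    public int active() { return cgroupLimits.size(); }
}

final class MonitorLoaderSketch {
    // Hypothetical configuration key, not an actual YARN property name.
    static final String KEY = "sketch.containers-monitor.class";

    // Instantiates the configured class, or the default when the key is unset.
    static MonitorSketch load(Properties conf) {
        String name = conf.getProperty(KEY, DefaultMonitorSketch.class.getName());
        try {
            return Class.forName(name)
                    .asSubclass(MonitorSketch.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load monitor " + name, e);
        }
    }
}
```

With no key set, load returns the default monitor, so users who cannot use cgroups are unaffected. Setting the key to another class name swaps the implementation without changing the caller.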

The only trouble spot with this design is that it's not possible to move 
another non-root user's process into a cgroup. I plan to extend the secure 
container-executor to be able to make such a move.

Please let me know if you have any feedback about this proposal.


thank you,
Andrew

 Add support for CPU isolation/monitoring of containers
 --

 Key: MAPREDUCE-4334
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4334
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Arun C Murthy
Assignee: Arun C Murthy

 Once we get in MAPREDUCE-4327, it will be important to actually enforce 
 limits on CPU consumption of containers. 
 Several options spring to mind:
 # taskset (RHEL5+)
 # cgroups (RHEL6+)


[jira] [Updated] (MAPREDUCE-4203) Create equivalent of ProcfsBasedProcessTree for Windows


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated MAPREDUCE-4203:
--

Attachment: MAPREDUCE-4203.branch-1-win.2.patch

Fix some bugs in formatting. 
TestTaskTrackerMemoryManager now passes on Windows and tests the feature 
functionally.

 Create equivalent of ProcfsBasedProcessTree for Windows
 ---

 Key: MAPREDUCE-4203
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4203
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Bikas Saha
Assignee: Bikas Saha
 Attachments: MAPREDUCE-4203.branch-1-win.1.patch, 
 MAPREDUCE-4203.branch-1-win.2.patch, MAPREDUCE-4203.patch, test.cpp


 ProcfsBasedProcessTree is used by the TaskTracker to get process information 
 like memory and cpu usage. This information is used to manage resources etc. 
 The current implementation is based on Linux procfs functionality and hence 
 does not work on other platforms, specifically windows.


[jira] [Commented] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2012-06-18 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396386#comment-13396386
 ] 

Konstantin Shvachko commented on MAPREDUCE-4342:


Mayank, the patch does not apply as is; the problem is the empty-line change in 
TrackerDistributedCacheManager. You can just leave the line there. I did that, 
but then it does not compile. You need to sync it with the repo.

- Could you also change "is been" to "has been", as Robert suggested?
- And add spaces between method parameters.
- Reporting the results of test-patch and test builds would be very useful, since 
we don't have Jenkins to verify that for 0.22.

The fix looks good modulo the JIRAs you opened.


 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: MAPREDUCE-4342
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22.patch





[jira] [Updated] (MAPREDUCE-4351) Make ContainersMonitor pluggable

2012-06-18 Thread Andrew Ferguson (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Ferguson updated MAPREDUCE-4351:
---

Attachment: MAPREDUCE-4351-v1.patch

First cut at making ContainersMonitor pluggable. I have tested that the new 
configuration option is used, and that it works with a local cluster.

 Make ContainersMonitor pluggable
 

 Key: MAPREDUCE-4351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4351
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Reporter: Andrew Ferguson
 Attachments: MAPREDUCE-4351-v1.patch


 Make the existing ContainersMonitor pluggable, just as the ContainerExecutor 
 is currently. This will allow us to add container resource enforcement using 
 other techniques (such as cgroups) in an extensible fashion.


[jira] [Commented] (MAPREDUCE-4351) Make ContainersMonitor pluggable

2012-06-18 Thread Andrew Ferguson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396411#comment-13396411
 ] 

Andrew Ferguson commented on MAPREDUCE-4351:


The bulk of the lines in the patch rename ContainersMonitorImpl.java to 
DefaultContainersMonitor.java, and TestContainersMonitor.java to 
TestDefaultContainersMonitor.java.

 Make ContainersMonitor pluggable
 

 Key: MAPREDUCE-4351
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4351
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: mrv2, nodemanager
Reporter: Andrew Ferguson
 Attachments: MAPREDUCE-4351-v1.patch


 Make the existing ContainersMonitor pluggable, just as the ContainerExecutor 
 is currently. This will allow us to add container resource enforcement using 
 other techniques (such as cgroups) in an extensible fashion.


[jira] [Updated] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Scott Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-3868:
--

Issue Type: Bug  (was: New Feature)

 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen
Assignee: Weiyan Wang
 Attachments: MAPREDUCE-3868-1.patch, MAPREDUCE-3868-2.patch, 
 MAPREDUCE-3868-3.patch, MAPREDUCE-3868.patch, MAPREDUCE-3868v1.patch, 
 MAPREDUCE-3868v1.sh


 Currently Raid is outdated and not compiled. Make it compile.


[jira] [Resolved] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Scott Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen resolved MAPREDUCE-3868.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

I just committed this. Thanks, Weiyan.

 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen
Assignee: Weiyan Wang
 Attachments: MAPREDUCE-3868-1.patch, MAPREDUCE-3868-2.patch, 
 MAPREDUCE-3868-3.patch, MAPREDUCE-3868.patch, MAPREDUCE-3868v1.patch, 
 MAPREDUCE-3868v1.sh


 Currently Raid is outdated and not compiled. Make it compile.


[jira] [Commented] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396434#comment-13396434
 ] 

Hudson commented on MAPREDUCE-3868:
---

Integrated in Hadoop-Common-trunk-Commit #2369 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2369/])
MAPREDUCE-3868. Make Raid Compile. (Weiyan Wang via schen) (Revision 
1351548)

 Result = SUCCESS
schen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1351548
Files : 
* 
/hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-raid-dist.xml
* /hadoop/common/trunk/hadoop-dist/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/conf
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/datanode/RaidBlockSender.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRaidUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/BlockFixer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DirectoryTraversal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaid.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/GaloisField.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/JobMonitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidShell.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/ReedSolomonCode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/sbin
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/hdfs/TestRaidDfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestBlockFixer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestDirectoryTraversal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestErasureCodes.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidFilter.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidHar.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidPurge.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShell.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShellFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonDecoder.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonEncoder.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* /hadoop/common/trunk/hadoop-hdfs-project/pom.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/bin
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/conf
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org
* /hadoop/common/trunk/hadoop-project/pom.xml


 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen

[jira] [Commented] (MAPREDUCE-3868) Reenable Raid

2012-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396437#comment-13396437
 ] 

Hudson commented on MAPREDUCE-3868:
---

Integrated in Hadoop-Hdfs-trunk-Commit #2439 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2439/])
MAPREDUCE-3868. Make Raid Compile. (Weiyan Wang via schen) (Revision 
1351548)

 Result = SUCCESS
schen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1351548
Files : 
* 
/hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-raid-dist.xml
* /hadoop/common/trunk/hadoop-dist/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/conf
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/datanode/RaidBlockSender.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRaidUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/BlockFixer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DirectoryTraversal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaid.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/GaloisField.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/JobMonitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidShell.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/ReedSolomonCode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/sbin
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/hdfs/TestRaidDfs.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestBlockFixer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestDirectoryTraversal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestErasureCodes.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidFilter.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidHar.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidPurge.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShell.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShellFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonDecoder.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonEncoder.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* /hadoop/common/trunk/hadoop-hdfs-project/pom.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/bin
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/conf
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org
* /hadoop/common/trunk/hadoop-project/pom.xml


 Reenable Raid
 -

 Key: MAPREDUCE-3868
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/raid
Reporter: Scott Chen
Assignee:

[jira] [Commented] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396442#comment-13396442
 ] 

Ahmed Radwan commented on MAPREDUCE-4336:
-

The fix looks fairly straightforward: set the queue name on the 
GetQueueInfoRequest, and use "default" as the queue name if none is 
specified on the command line. I'll upload a patch now. 
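
A minimal sketch of the fix described above, not the attached patch itself. 
QueueInfoRequestSketch is a hypothetical stand-in for the real request record, 
and QueueNameSketch, resolveQueue and buildRequest are illustrative names that 
do not come from the Distributed Shell source. The only behavior taken from 
the comment is that the request always carries a queue name, falling back to 
"default" when none is given on the command line.

```java
// Hypothetical stand-in for the queue-info request; NOT the real YARN class.
final class QueueInfoRequestSketch {
    private String queueName;

    void setQueueName(String queueName) {
        this.queueName = queueName;
    }

    String getQueueName() {
        return queueName;
    }
}

final class QueueNameSketch {
    // Fallback queue name, as proposed in the comment.
    static final String DEFAULT_QUEUE = "default";

    // Returns the queue passed on the command line, or "default" when the
    // option is missing or blank (assumption: blank is treated as missing).
    static String resolveQueue(String cliValue) {
        if (cliValue == null || cliValue.trim().isEmpty()) {
            return DEFAULT_QUEUE;
        }
        return cliValue.trim();
    }

    // Builds a request that always has a queue name set, so the receiver
    // never sees a null name.
    static QueueInfoRequestSketch buildRequest(String cliValue) {
        QueueInfoRequestSketch request = new QueueInfoRequestSketch();
        request.setQueueName(resolveQueue(cliValue));
        return request;
    }

    public static void main(String[] args) {
        String cliValue = args.length > 0 ? args[0] : null;
        System.out.println(buildRequest(cliValue).getQueueName());
    }
}
```

Resolving the name in one place keeps the fallback out of the request-building 
code, so every caller gets the same default.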

 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth

 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.


[jira] [Updated] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-4336:


Attachment: MAPREDUCE-4336.patch

I have manually tested the patch by successfully submitting and running 
Distributed Shell jobs on both the Capacity and FIFO schedulers.

 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
 Attachments: MAPREDUCE-4336.patch


 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.


[jira] [Assigned] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan reassigned MAPREDUCE-4336:
---

Assignee: Ahmed Radwan

 Distributed Shell fails when used with the CapacityScheduler
 

 Key: MAPREDUCE-4336
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Ahmed Radwan
 Attachments: MAPREDUCE-4336.patch


 DistributedShell attempts to get queue info without providing a queue name - 
 which ends up in an NPE.


[jira] [Commented] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler