[jira] [Created] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
Ahmed Radwan created MAPREDUCE-4346: --- Summary: Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
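The refinement proposed above (return only jobs in the requested states, with retired jobs optional) could look roughly like the sketch below. It is illustration only and is not taken from the attached patch: `JobFilterSketch`, its `Job` and `State` types, `getJobs` and `sampleJobs` are hypothetical stand-ins, not the real JobTracker or JobClient API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration; NOT the real Hadoop JobTracker/JobStatus classes.
class JobFilterSketch {
    enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    static final class Job {
        final String id;
        final State state;
        final boolean retired;

        Job(String id, State state, boolean retired) {
            this.id = id;
            this.state = state;
            this.retired = retired;
        }
    }

    /** Keeps only jobs whose state is in {@code wanted}; retired jobs are kept only on request. */
    static List<Job> getJobs(List<Job> all, Set<State> wanted, boolean includeRetired) {
        List<Job> out = new ArrayList<>();
        for (Job job : all) {
            if (job.retired && !includeRetired) {
                continue; // skip retired jobs unless the caller asked for them
            }
            if (wanted.contains(job.state)) {
                out.add(job);
            }
        }
        return out;
    }

    /** Small made-up job list used by the demo. */
    static List<Job> sampleJobs() {
        return Arrays.asList(
            new Job("job_1", State.SUCCEEDED, true),
            new Job("job_2", State.RUNNING, false),
            new Job("job_3", State.PREP, false),
            new Job("job_4", State.FAILED, false),
            new Job("job_5", State.SUCCEEDED, false));
    }

    public static void main(String[] args) {
        // A client interested only in running jobs never sees the rest of the list.
        for (Job job : getJobs(sampleJobs(), EnumSet.of(State.RUNNING), false)) {
            System.out.println(job.id + " " + job.state);
        }
    }
}
```

Filtering on the server side, as the issue suggests, means the full list never has to be serialized to a client that only cares about a few states.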
[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4346: Attachment: MAPREDUCE-4346.patch Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch
[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4346: Status: Patch Available (was: Open) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395486#comment-13395486 ] Hadoop QA commented on MAPREDUCE-4346: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532379/MAPREDUCE-4346.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2465//console This message is automatically generated. Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395490#comment-13395490 ] Hadoop QA commented on MAPREDUCE-4346: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532379/MAPREDUCE-4346.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2466//console This message is automatically generated. Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch
[jira] [Resolved] (MAPREDUCE-4114) saveVersion.sh fails if build directory contains space
[ https://issues.apache.org/jira/browse/MAPREDUCE-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar resolved MAPREDUCE-4114. Resolution: Duplicate saveVersion.sh fails if build directory contains space --- Key: MAPREDUCE-4114 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4114 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.2 Environment: FreeBSD 8.2, 64bit Reporter: Radim Kolar if you rename build directory to something without space like /tmp/hadoop then it works [INFO] [INFO] [INFO] Building hadoop-yarn-common 0.23.3-SNAPSHOT [INFO] [INFO] [INFO] --- maven-antrun-plugin:1.6:run (create-testdirs) @ hadoop-yarn-common --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-antrun-plugin:1.6:run (create-protobuf-generated-sources-directory) @ hadoop-yarn-common --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- exec-maven-plugin:1.2:exec (generate-sources) @ hadoop-yarn-common --- [INFO] [INFO] --- exec-maven-plugin:1.2:exec (generate-version) @ hadoop-yarn-common --- scripts/saveVersion.sh: cannot create /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/workspace/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/target/generated-sources/version/org/apache/hadoop/yarn/package-info.java: No such file or directory [JENKINS] Archiving /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/workspace/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/pom.xml to /usr/local/jboss/.jenkins/jobs/Hadoop 0.23 branch/modules/org.apache.hadoop$hadoop-yarn-common/builds/2012-04-05_19-44-16/archive/org.apache.hadoop/hadoop-yarn-common/0.23.3-SNAPSHOT/hadoop-yarn-common-0.23.3-SNAPSHOT.pom [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Hadoop Main SUCCESS [5.124s] [INFO] Apache Hadoop Project POM . SUCCESS [1.692s] [INFO] Apache Hadoop Annotations . 
SUCCESS [1.672s] [INFO] Apache Hadoop Project Dist POM SUCCESS [1.823s] [INFO] Apache Hadoop Assemblies .. SUCCESS [0.796s] [INFO] Apache Hadoop Auth SUCCESS [2.456s] [INFO] Apache Hadoop Auth Examples ... SUCCESS [1.093s] [INFO] Apache Hadoop Common .. SUCCESS [23.648s] [INFO] Apache Hadoop Common Project .. SUCCESS [0.434s] [INFO] Apache Hadoop HDFS SUCCESS [22.124s] [INFO] Apache Hadoop HttpFS .. SUCCESS [3.251s] [INFO] Apache Hadoop HDFS Project SUCCESS [0.443s] [INFO] hadoop-yarn ... SUCCESS [1.175s] [INFO] hadoop-yarn-api ... SUCCESS [7.049s] [INFO] hadoop-yarn-common FAILURE [5.565s] [INFO] hadoop-yarn-server SKIPPED [INFO] hadoop-yarn-server-common . SKIPPED [INFO] hadoop-yarn-server-nodemanager SKIPPED [INFO] hadoop-yarn-server-web-proxy .. SKIPPED [INFO] hadoop-yarn-server-resourcemanager SKIPPED [INFO] hadoop-yarn-server-tests .. SKIPPED [INFO] hadoop-mapreduce-client ... SKIPPED [INFO] hadoop-mapreduce-client-core .. SKIPPED [INFO] hadoop-yarn-applications .. SKIPPED [INFO] hadoop-yarn-applications-distributedshell . SKIPPED [INFO] hadoop-yarn-site .. SKIPPED [INFO] hadoop-mapreduce-client-common SKIPPED [INFO] hadoop-mapreduce-client-shuffle ... SKIPPED [INFO] hadoop-mapreduce-client-app ... SKIPPED [INFO] hadoop-mapreduce-client-hs SKIPPED [INFO] hadoop-mapreduce-client-jobclient . SKIPPED [INFO] Apache Hadoop MapReduce Examples .. SKIPPED [INFO] hadoop-mapreduce .. SKIPPED [INFO] Apache Hadoop MapReduce Streaming . SKIPPED [INFO] Apache Hadoop Distributed Copy SKIPPED [INFO] Apache Hadoop Archives SKIPPED [INFO] Apache Hadoop Rumen ... SKIPPED [INFO] Apache Hadoop
[jira] [Commented] (MAPREDUCE-3968) add support for getNumMapTasks() into mapreduce JobContext
[ https://issues.apache.org/jira/browse/MAPREDUCE-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395756#comment-13395756 ] Radim Kolar commented on MAPREDUCE-3968: Yes, I need to know the number of splits. add support for getNumMapTasks() into mapreduce JobContext -- Key: MAPREDUCE-3968 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3968 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: trunk Environment: hadoop 0.22 Reporter: Radim Kolar Priority: Minor Attachments: MAPREDUCE-3968.patch In the old mapred API there was a way to query the number of mappers: job.getNumMapTasks(). No such function exists in the new mapreduce API.
[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down
[ https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4031: - Status: Open (was: Patch Available) Node Manager hangs on shut down --- Key: MAPREDUCE-4031 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4031 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 0.23.2, 2.0.1-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Priority: Critical Attachments: MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, nm-threaddump.out I have the MAPREDUCE-3862 changes which fixed this issue earlier and yarn.nodemanager.delete.debug-delay-sec set to default value but still getting this issue.
[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down
[ https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4031: - Attachment: MAPREDUCE-4031.patch Node Manager hangs on shut down --- Key: MAPREDUCE-4031 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4031 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 0.23.2, 2.0.1-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Priority: Critical Attachments: MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, nm-threaddump.out
[jira] [Updated] (MAPREDUCE-4031) Node Manager hangs on shut down
[ https://issues.apache.org/jira/browse/MAPREDUCE-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4031: - Status: Patch Available (was: Open) Thanks a lot Sid for looking into the patch. The above test failures are not related to the patch. Resubmitting the same patch to trigger Jenkins. Node Manager hangs on shut down --- Key: MAPREDUCE-4031 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4031 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 0.23.2, 2.0.1-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Priority: Critical Attachments: MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, MAPREDUCE-4031.patch, nm-threaddump.out
[jira] [Created] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop
Suresh S created MAPREDUCE-4347: --- Summary: joined PhD. Intrested to do research in cloud especially in Hadoop Key: MAPREDUCE-4347 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4347 Project: Hadoop Map/Reduce Issue Type: Wish Reporter: Suresh S
[jira] [Updated] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh S updated MAPREDUCE-4347: Summary: joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work. (was: joined PhD. Intrested to do research in cloud especially in Hadoop) joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work. - Key: MAPREDUCE-4347 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4347 Project: Hadoop Map/Reduce Issue Type: Wish Reporter: Suresh S
[jira] [Commented] (MAPREDUCE-4328) Add the option to quiesce the JobTracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395841#comment-13395841 ] Kang Xiao commented on MAPREDUCE-4328: -- It is useful in some conditions, such as when the NN is down. Actually we found a way to achieve the first goal by updating the fair scheduler's conf to set each pool's max share to zero. The second goal will protect the job from going to FAILED. But it seems so possible for a job to go to FAILED since no more tasks are scheduled. It may be simpler to just not invoke assignTasks() in the JobTracker to implement the first goal. And it will not burden the scheduler implementation, since 'safemode' is a low-probability event. Add the option to quiesce the JobTracker Key: MAPREDUCE-4328 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4328 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.0.3 Reporter: Arun C Murthy Assignee: Arun C Murthy Attachments: MAPREDUCE-4328.patch In several failure scenarios it would be very handy to have an option to quiesce the JobTracker. Recently, we saw a case where the NameNode had to be rebooted at a customer due to a random hardware failure - in such a case it would have been nice to not lose jobs by quiescing the JobTracker.
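The simpler approach suggested in the comment above (do not invoke assignTasks() while the JobTracker is quiesced) can be sketched as follows. This is only an illustration under stated assumptions: `QuiesceSketch`, `Tracker`, `Scheduler`, `setQuiesced` and `demoTracker` are hypothetical names, and the real heartbeat path in the JobTracker is considerably more involved.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; NOT the real JobTracker or TaskScheduler classes.
class QuiesceSketch {
    /** Stand-in for a pluggable scheduler that hands out tasks to a task tracker. */
    interface Scheduler {
        List<String> assignTasks(String taskTracker);
    }

    static final class Tracker {
        private final AtomicBoolean quiesced = new AtomicBoolean(false);
        private final Scheduler scheduler;

        Tracker(Scheduler scheduler) {
            this.scheduler = scheduler;
        }

        /** Turns the quiesced ("safemode") state on or off. */
        void setQuiesced(boolean on) {
            quiesced.set(on);
        }

        /** While quiesced, the scheduler is never consulted, so no new tasks go out. */
        List<String> heartbeat(String taskTracker) {
            if (quiesced.get()) {
                return Collections.emptyList();
            }
            return scheduler.assignTasks(taskTracker);
        }
    }

    /** Demo tracker whose scheduler always offers exactly one task. */
    static Tracker demoTracker() {
        return new Tracker(name -> Collections.singletonList("task_for_" + name));
    }

    public static void main(String[] args) {
        Tracker tracker = demoTracker();
        System.out.println("normal:   " + tracker.heartbeat("tt1"));
        tracker.setQuiesced(true);
        System.out.println("quiesced: " + tracker.heartbeat("tt1"));
    }
}
```

Keeping the check in the tracker rather than in each scheduler matches the comment's point that the scheduler implementations would not need to change.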
[jira] [Resolved] (MAPREDUCE-4347) joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved MAPREDUCE-4347. Resolution: Invalid The JIRA exists to track issues with the project, not for discussions such as these. Please send your email to mapreduce-...@hadoop.apache.org instead. Thanks! joined PhD. Intrested to do research in cloud especially in Hadoop. need suggession for problems to work. - Key: MAPREDUCE-4347 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4347 Project: Hadoop Map/Reduce Issue Type: Wish Reporter: Suresh S
[jira] [Commented] (MAPREDUCE-4039) Sort Avoidance
[ https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395855#comment-13395855 ] Kang Xiao commented on MAPREDUCE-4039: -- @Schubert, could you give some typical applications that benefit from sort avoidance? It seems that with this feature a simple aggregation app such as wordcount will use more memory while it waits for all keys to be processed. Sort Avoidance -- Key: MAPREDUCE-4039 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.2 Reporter: anty.rao Assignee: anty Priority: Minor Fix For: 0.23.2 Attachments: MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch
Inspired by [Tenzing|http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/37200.pdf], in 5.1 MapReduce Enhancements: {quote}*Sort Avoidance*. Certain operators such as hash join and hash aggregation require shuffling, but not sorting. The MapReduce API was enhanced to automatically turn off sorting for these operations. When sorting is turned off, the mapper feeds data to the reducer which directly passes the data to the Reduce() function bypassing the intermediate sorting step. This makes many SQL operators significantly more efficient.{quote}
There are a lot of applications which need aggregation only, not sorting. Using sorting to achieve aggregation is costly and inefficient. Without sorting, an application can use a hash table or hash map to do aggregation efficiently. But the application should bear in mind that reduce memory is limited: it is itself responsible for managing reduce memory and guarding against running out of memory. A map-side combiner is not supported; you can do hash aggregation on the map side as a workaround.
The following are the main points of the sort avoidance implementation:
# Add a configuration parameter ??mapreduce.sort.avoidance??, of boolean type, to turn the sort avoidance workflow on or off. The two types of workflow coexist.
# Key/value pairs emitted by the map function are sorted by partition only, using a more efficient sorting algorithm: counting sort.
# The map-side merge uses a kind of byte merge, which just concatenates bytes from the generated spills (read in bytes, write out bytes), without the overhead of key/value serialization/deserialization and comparison that the current version incurs.
# Reduce can start as soon as any map output is available, in contrast to the sort workflow, which must wait until all map outputs are fetched and merged.
# Map output in memory can be consumed directly by reduce. When reduce can't keep up with the speed of incoming map outputs, the in-memory merge thread kicks in and merges in-memory map outputs onto disk.
# On-disk files are read sequentially to feed reduce, in contrast to the current implementation, which reads multiple files concurrently and results in many disk seeks. Map output in memory takes precedence over on-disk files in feeding the reduce function.
I have already implemented this feature based on Hadoop CDH3U3 and done some performance evaluation; you can refer to [https://github.com/hanborq/hadoop] for details. Now I'm willing to port it to YARN. Comments are welcome.
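The second implementation point, sorting map output by partition only with a counting sort, can be illustrated with a minimal sketch. It is not the code from the attached patch or from IndexedCountingSortable.java; `PartitionCountingSort` and `sortByPartition` are hypothetical names, and records are represented only by their partition numbers.

```java
import java.util.Arrays;

// Minimal counting sort on partition numbers; a sketch, NOT the patch's implementation.
class PartitionCountingSort {
    /**
     * Returns record indices grouped by partition, in ascending partition order.
     * Records within one partition keep their original (emit) order.
     * partitions[i] is the partition of record i and must lie in [0, numPartitions).
     */
    static int[] sortByPartition(int[] partitions, int numPartitions) {
        // start[p + 1] first holds the number of records in partition p ...
        int[] start = new int[numPartitions + 1];
        for (int p : partitions) {
            start[p + 1]++;
        }
        // ... then the prefix sum turns start[p] into the first output slot of partition p.
        for (int p = 0; p < numPartitions; p++) {
            start[p + 1] += start[p];
        }
        int[] order = new int[partitions.length];
        for (int rec = 0; rec < partitions.length; rec++) {
            order[start[partitions[rec]]++] = rec; // place record, advance that partition's slot
        }
        return order;
    }

    public static void main(String[] args) {
        int[] partitions = {2, 0, 1, 0, 2};
        System.out.println(Arrays.toString(sortByPartition(partitions, 3)));
    }
}
```

The pass is linear in the number of records plus the number of partitions and never compares keys, which is the saving the description points to.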
[jira] [Commented] (MAPREDUCE-4343) ZK recovery support for ResourceManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395869#comment-13395869 ] Sharad Agarwal commented on MAPREDUCE-4343: --- There is already MAPREDUCE-2713 for this. Some ZK code may be lying around but it is not implemented as yet. can this be marked as duplicate ? ZK recovery support for ResourceManager --- Key: MAPREDUCE-4343 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Harsh J Attachments: MR-4343.1.patch
MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's RM, but looks like it failed to complete it (for scalability reasons? etc?) and there seems to be no JIRA tracking this feature that has been already claimed publicly as a good part about YARN. If it did complete it, we should document how to use it. Setting the following only yields:
{code}
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
</property>
<property>
  <name>yarn.resourcemanager.zookeeper-store.address</name>
  <value>test.vm:2181/yarn-recovery-store</value>
</property>
{code}
{code}
Error starting ResourceManager
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
	at org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
	at java.lang.Class.getConstructor0(Class.java:2706)
	at java.lang.Class.getDeclaredConstructor(Class.java:1985)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
	... 2 more
{code}
This JIRA is hence filed to track the addition/completion of recovery via ZK.
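The stack trace above shows Class.getDeclaredConstructor failing to find a constructor on ZKStore when called from ReflectionUtils.newInstance. The sketch below reproduces that kind of failure with plain Java reflection, on the assumption (not confirmed in this thread) that the class simply has no no-argument constructor. `NoArgCtorSketch`, `StoreWithArg`, `StoreNoArg` and `tryInstantiate` are hypothetical names, not Hadoop classes.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-ins; these are NOT the real ZKStore or ReflectionUtils classes.
class NoArgCtorSketch {
    /** A store that can only be built with an argument, so it has no no-arg constructor. */
    static class StoreWithArg {
        StoreWithArg(String address) { }
    }

    /** A store with a no-arg constructor, which reflection can instantiate. */
    static class StoreNoArg {
        StoreNoArg() { }
    }

    /** Tries to build an instance through the no-arg constructor, as a reflective factory would. */
    static String tryInstantiate(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor(); // looks up the no-arg constructor
            ctor.setAccessible(true);
            ctor.newInstance();
            return "ok";
        } catch (NoSuchMethodException e) {
            return "NoSuchMethodException: " + e.getMessage();
        } catch (ReflectiveOperationException e) {
            return "failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInstantiate(StoreNoArg.class));
        System.out.println(tryInstantiate(StoreWithArg.class));
    }
}
```

If that assumption holds, the store class would need either a no-arg constructor or a factory that passes the required arguments before the configuration shown in the issue could work.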
[jira] [Created] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private
Steve Loughran created MAPREDUCE-4348: - Summary: JobSubmissionProtocol should be made public, not package private Key: MAPREDUCE-4348 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.0.3 Reporter: Steve Loughran Priority: Minor The JobSubmissionProtocol interface is package private, yet it is the only way to remotely query the status of the JT or the cluster. Even if Job Submission is considered private, probing JT state shouldn't be.
[jira] [Commented] (MAPREDUCE-4341) add types to capacity scheduler properties documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395887#comment-13395887 ] Thomas Graves commented on MAPREDUCE-4341: -- can you add it for max capacity also please. add types to capacity scheduler properties documentation Key: MAPREDUCE-4341 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, mrv2 Affects Versions: 0.23.3 Reporter: Thomas Graves Assignee: Karthik Kambatla Attachments: MR-4341.patch MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. We should document that in the capacity scheduler properties docs (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).
[jira] [Updated] (MAPREDUCE-4039) Sort Avoidance
[ https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anty.rao updated MAPREDUCE-4039: Attachment: IndexedCountingSortable.java the missing file. Sort Avoidance -- Key: MAPREDUCE-4039 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.2 Reporter: anty.rao Assignee: anty Priority: Minor Fix For: 0.23.2 Attachments: IndexedCountingSortable.java, MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch
[jira] [Updated] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private
[ https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated MAPREDUCE-4348: -- Attachment: MAPREDUCE-4348.patch makes i/f public but marks as private and evolving. JobSubmissionProtocol should be made public, not package private Key: MAPREDUCE-4348 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.0.3 Reporter: Steve Loughran Priority: Minor Attachments: MAPREDUCE-4348.patch Original Estimate: 0.5h Remaining Estimate: 0.5h
[jira] [Updated] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private
[ https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated MAPREDUCE-4348: -- Assignee: Steve Loughran Target Version/s: 1.1.0 Status: Patch Available (was: Open) JobSubmissionProtocol should be made public, not package private Key: MAPREDUCE-4348 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.0.3 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Attachments: MAPREDUCE-4348.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The JobSubmissionProtocol interface is package private, yet it is the only way to remotely query the status of the JT or the cluster. Even if Job Submission is considered private, probing JT state shouldn't be. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4039) Sort Avoidance
[ https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395916#comment-13395916 ]

anty.rao commented on MAPREDUCE-4039:

@Kang Yes, you are right. Using merge-sort to achieve aggregation may not use as much memory as hash aggregation with this feature. But merge-sort does a lot of unnecessary work and consumes more resources, e.g. CPU, disk, network. It's just a tradeoff that depends on your use case, latency requirements, etc.

Sort Avoidance

Key: MAPREDUCE-4039 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.2 Reporter: anty.rao Assignee: anty Priority: Minor Fix For: 0.23.2 Attachments: IndexedCountingSortable.java, MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch

Inspired by [Tenzing|http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/37200.pdf], section 5.1, MapReduce Enhancements:

{quote}*Sort Avoidance*. Certain operators such as hash join and hash aggregation require shuffling, but not sorting. The MapReduce API was enhanced to automatically turn off sorting for these operations. When sorting is turned off, the mapper feeds data to the reducer which directly passes the data to the Reduce() function bypassing the intermediate sorting step. This makes many SQL operators significantly more efficient.{quote}

Many applications need aggregation only, not sorting. Using sorting to achieve aggregation is costly and inefficient. Without sorting, an application can use a hash table or hash map to aggregate efficiently. But the application should bear in mind that reduce memory is limited; the application itself is responsible for managing reduce memory and guarding against running out of memory. A map-side combiner is not supported; as a workaround, you can do hash aggregation on the map side.

The main points of the sort avoidance implementation are:

# Add a boolean configuration parameter ??mapreduce.sort.avoidance?? to turn the sort avoidance workflow on or off. The two workflows coexist.
# Key/value pairs emitted by the map function are sorted by partition only, using a more efficient sorting algorithm: counting sort.
# Map-side merge uses a byte-level merge that simply concatenates bytes from the generated spills (read bytes in, write bytes out), without the overhead of key/value serialization/deserialization and comparison that the current version incurs.
# Reduce can start as soon as any map output is available, in contrast to the sort workflow, which must wait until all map outputs are fetched and merged.
# Map output in memory can be consumed directly by reduce. When reduce can't keep up with the incoming map outputs, the in-memory merge thread kicks in and merges in-memory map outputs onto disk.
# On-disk files are read sequentially to feed reduce, in contrast to the current implementation, which reads multiple files concurrently and causes many disk seeks. Map output in memory takes precedence over on-disk files when feeding the reduce function.

I have already implemented this feature on Hadoop CDH3U3 and done some performance evaluation; see [https://github.com/hanborq/hadoop] for details. Now I'm willing to port it to YARN. Comments are welcome.
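The "sorted by partition only, using counting sort" step in the list above can be sketched as follows. This is a minimal illustration, not the code from the attached patch or IndexedCountingSortable.java; the class and method names are made up, and records are represented only by their partition numbers.

```java
import java.util.Arrays;

class PartitionCountingSort {
    /**
     * Returns record indices ordered by partition number.
     * partitions[i] is the partition of record i; numPartitions is the number of reducers.
     * Records keep their original relative order within a partition (stable),
     * and no key comparison is ever performed.
     */
    static int[] sortByPartition(int[] partitions, int numPartitions) {
        // starts[p + 1] first counts the records in partition p.
        int[] starts = new int[numPartitions + 1];
        for (int p : partitions) {
            starts[p + 1]++;
        }
        // Prefix sums turn the counts into the first output slot of each partition.
        for (int p = 0; p < numPartitions; p++) {
            starts[p + 1] += starts[p];
        }
        // Drop each record index into the next free slot of its partition.
        int[] order = new int[partitions.length];
        for (int i = 0; i < partitions.length; i++) {
            order[starts[partitions[i]]++] = i;
        }
        return order;
    }

    public static void main(String[] args) {
        int[] partitions = {2, 0, 1, 0, 2, 1};
        // Records 1 and 3 (partition 0) come first, then 2 and 5, then 0 and 4.
        System.out.println(Arrays.toString(sortByPartition(partitions, 3)));
    }
}
```

The work is linear in the number of records plus the number of partitions, which is where the saving over a comparison sort on keys would come from.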
[jira] [Resolved] (MAPREDUCE-4298) NodeManager crashed after running out of file descriptors
[ https://issues.apache.org/jira/browse/MAPREDUCE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved MAPREDUCE-4298. -- Resolution: Duplicate dup of HADOOP-8495 NodeManager crashed after running out of file descriptors - Key: MAPREDUCE-4298 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4298 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Attachments: MAPREDUCE-4298.patch A node on one of our clusters fell over because it ran out of open file descriptors. Log details with stack traceback to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395942#comment-13395942 ]

Robert Joseph Evans commented on MAPREDUCE-4342:

A couple of comments.

# Minor correction to the grammar. {code}LOG.warn("Local Cache is been deleted... Downloading the cache again");{code} should be {code}LOG.warn("Local Cache has been deleted... Downloading the cache again");{code}
# Please run test-patch on it and post the results.
# I believe that this problem also exists in trunk and branch 2. It would be good to investigate and possibly file a JIRA, or post a patch for them as well.

It looks good, but it is not perfect. It will work in the case where a single base distributed cache file or directory was deleted, but it will not work in the case where a file was corrupted, where a file in a cache archive was deleted, where new files were added, etc. I agree that we want to be able to deal with a file being removed, but I personally think that prevention is preferable to recovery, although it may not be as backwards compatible. I would prefer to see all of the files created in the distributed cache be marked as read only. If the files are part of a private cache and someone messes with them by modifying the permissions, then it is on their head, and they need to modify the original HDFS file to force it to download a new copy. Checking for corruption because of FS/disk issues is a separate problem that we probably also want to look into, now that the data in the distributed cache can live for long periods of time.

Distributed Cache gives inconsistent result if cache files get deleted from task tracker

Key: MAPREDUCE-4342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 1.0.3, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22.patch
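The recovery behaviour discussed in the comment above (notice that a localized file is gone and download it again) can be sketched with plain JDK file APIs. This is an assumption-laden illustration, not the patch: `ensureLocalized` and `Downloader` are hypothetical names, and the "download" is simulated with an in-memory byte array rather than HDFS.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LocalCacheCheck {
    /** Hypothetical stand-in for fetching the original file from HDFS. */
    interface Downloader {
        byte[] fetch() throws IOException;
    }

    /**
     * Makes sure the local copy exists, downloading it again if it was deleted.
     * Returns true if a download was needed.
     */
    static boolean ensureLocalized(Path localCopy, Downloader source) throws IOException {
        if (Files.exists(localCopy)) {
            return false; // still present; contents are NOT verified here
        }
        Files.createDirectories(localCopy.getParent());
        Files.write(localCopy, source.fetch());
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dist-cache");
        Path local = dir.resolve("lookup.txt");
        Downloader source = () -> "cached data".getBytes(StandardCharsets.UTF_8);

        System.out.println("downloaded: " + ensureLocalized(local, source)); // first localization
        Files.delete(local);                                                 // someone removes the file
        System.out.println("downloaded: " + ensureLocalized(local, source)); // detected and restored
    }
}
```

As the comment points out, an existence check like this only covers deletion of the base file; a corrupted file, a file removed from inside an archive, or extra files would all pass unnoticed.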
[jira] [Commented] (MAPREDUCE-4348) JobSubmissionProtocol should be made public, not package private
[ https://issues.apache.org/jira/browse/MAPREDUCE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13395952#comment-13395952 ] Steve Loughran commented on MAPREDUCE-4348: --- # no tests, this is a package scope change, not a new feature. # it is to be applied against the 1.x branch JobSubmissionProtocol should be made public, not package private Key: MAPREDUCE-4348 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4348 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 1.0.3 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Attachments: MAPREDUCE-4348.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The JobSubmissionProtocol interface is package private, yet it is the only way to remotely query the status of the JT or the cluster. Even if Job Submission is considered private, probing JT state shouldn't be. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4341: Status: Open (was: Patch Available) Will add the documentation for max-capacity as well, and upload another patch shortly. add types to capacity scheduler properties documentation Key: MAPREDUCE-4341 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, mrv2 Affects Versions: 0.23.3 Reporter: Thomas Graves Assignee: Karthik Kambatla Attachments: MR-4341.patch MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. We should document that in the capacity scheduler properties docs (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395964#comment-13395964 ]

Jason Lowe commented on MAPREDUCE-4339:

I am unable to reproduce a hang like this on a single-node cluster. Could you examine the ResourceManager logs for issues or post them (after any necessary scrubbing/anonymization)? That would help track down what's going on when the job hangs.

pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.

Key: MAPREDUCE-4339 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples, job submission, mrv2, scheduler Affects Versions: 0.23.0 Environment: Ubuntu Server 11.04, Hadoop 0.23.0, Reporter: srikanth ayalasomayajulu Labels: hadoop Fix For: 0.23.0 Original Estimate: 48h Remaining Estimate: 48h

Tried to include the default capacity scheduler in Hadoop and run the example pi program. The job hangs and no more output is displayed.

Starting Job
2012-06-12 22:10:02,524 INFO ipc.YarnRPC (YarnRPC.java:create(47)) - Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
2012-06-12 22:10:02,538 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:init(95)) - Connecting to ResourceManager at localhost/127.0.0.1:8030
2012-06-12 22:10:02,539 INFO ipc.HadoopYarnRPC (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
2012-06-12 22:10:02,665 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:init(99)) - Connected to ResourceManager at localhost/127.0.0.1:8030
2012-06-12 22:10:02,727 WARN conf.Configuration (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. Instead, use fs.defaultFS
2012-06-12 22:10:02,728 WARN conf.Configuration (Configuration.java:handleDeprecation(343)) - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
2012-06-12 22:10:02,831 INFO input.FileInputFormat (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10
2012-06-12 22:10:02,900 INFO mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(362)) - number of splits:10
2012-06-12 22:10:03,044 INFO mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster capability = memory: 2048
2012-06-12 22:10:03,286 INFO mapred.YARNRunner (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch container for ApplicationMaster is : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
2012-06-12 22:10:03,370 INFO mapred.ResourceMgrDelegate (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application application_1339507608976_0002 to ResourceManager
2012-06-12 22:10:03,432 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002
2012-06-12 22:10:04,443 INFO mapreduce.Job (Job.java:monitorAndPrintJob(1227)) - map 0% reduce 0%
[jira] [Updated] (MAPREDUCE-3889) job client tries to use /tasklog interface, but that doesn't exist anymore
[ https://issues.apache.org/jira/browse/MAPREDUCE-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-3889:

Target Version/s: 2.0.0-alpha, 0.23.3, 3.0.0 (was: 0.23.3, 2.0.0-alpha, 3.0.0)
Affects Version/s: 3.0.0 2.0.1-alpha

job client tries to use /tasklog interface, but that doesn't exist anymore

Key: MAPREDUCE-3889 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3889 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1, 2.0.1-alpha, 3.0.0 Reporter: Thomas Graves Assignee: Devaraj K Priority: Critical Attachments: MAPREDUCE-3889.patch, MAPREDUCE-3889.patch

If you specify the -Dmapreduce.client.output.filter=SUCCEEDED option when running a job, it tries to fetch task logs to print on the client side from a URL like: http://nodemanager:8080/tasklog?plaintext=true&attemptid=attempt_1329857083014_0003_r_00_0&filter=stdout

It always errors on this request with: Required param job, map and reduce

We saw this error when using distcp and the distcp failed. I'm not sure if it is mandatory for distcp or just for informational purposes. I'm guessing the latter.
[jira] [Updated] (MAPREDUCE-4345) ZK-based High Availability (HA) for ResourceManager (RM)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-4345: -- Assignee: Bikas Saha Assigning to myself since this looks like something that follows directly after MAPREDUCE-4326 and design/implementation would be closely related with it. ZK-based High Availability (HA) for ResourceManager (RM) Key: MAPREDUCE-4345 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4345 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Harsh J Assignee: Bikas Saha One of the goals presented on MAPREDUCE-279 was to have high availability. One way that was discussed, per Mahadev/others on https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK: {quote} Am not sure, if you already know about the MR-279 branch (the next version of MR framework). We've been trying to integrate ZK into the framework from the beginning. As for now, we are just doing restart with ZK but soon we should have a HA soln with ZK. {quote} There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is meant to track HA via ZK. Currently there isn't a HA solution for RM, via ZK or otherwise. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart
[ https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396041#comment-13396041 ] Bikas Saha commented on MAPREDUCE-4326: --- Will be posting a preliminary design sketch this week for comments. Resurrect RM Restart - Key: MAPREDUCE-4326 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha Reporter: Arun C Murthy Assignee: Bikas Saha We should resurrect 'RM Restart' which we disabled sometime during the RM refactor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4341: Attachment: (was: MR-4341.patch) add types to capacity scheduler properties documentation Key: MAPREDUCE-4341 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, mrv2 Affects Versions: 0.23.3 Reporter: Thomas Graves Assignee: Karthik Kambatla MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. We should document that in the capacity scheduler properties docs (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4341: Attachment: MR-4341.patch add types to capacity scheduler properties documentation Key: MAPREDUCE-4341 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, mrv2 Affects Versions: 0.23.3 Reporter: Thomas Graves Assignee: Karthik Kambatla Attachments: MR-4341.patch MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. We should document that in the capacity scheduler properties docs (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4341) add types to capacity scheduler properties documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated MAPREDUCE-4341: Fix Version/s: 0.23.3 Status: Patch Available (was: Open) Modified documentation to mention both capacity and max-capacity are of type float. Didn't test. add types to capacity scheduler properties documentation Key: MAPREDUCE-4341 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, mrv2 Affects Versions: 0.23.3 Reporter: Thomas Graves Assignee: Karthik Kambatla Fix For: 0.23.3 Attachments: MR-4341.patch MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. We should document that in the capacity scheduler properties docs (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4341) add types to capacity scheduler properties documentation
[ https://issues.apache.org/jira/browse/MAPREDUCE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396051#comment-13396051 ]

Hadoop QA commented on MAPREDUCE-4341:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532428/MR-4341.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2469//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2469//console

This message is automatically generated.

add types to capacity scheduler properties documentation

Key: MAPREDUCE-4341 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4341 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, mrv2 Affects Versions: 0.23.3 Reporter: Thomas Graves Assignee: Karthik Kambatla Fix For: 0.23.3 Attachments: MR-4341.patch

MAPREDUCE-4311 is changing capacity/max capacity configuration to be floats. We should document that in the capacity scheduler properties docs (http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Configuration).
[jira] [Commented] (MAPREDUCE-4343) ZK recovery support for ResourceManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396063#comment-13396063 ]

Tsuyoshi OZAWA commented on MAPREDUCE-4343:

Sharad, Bikas marked MAPREDUCE-2713 as a duplicated task.

ZK recovery support for ResourceManager

Key: MAPREDUCE-4343 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Harsh J Attachments: MR-4343.1.patch

MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's RM, but looks like it failed to complete it (for scalability reasons? etc?) and there seems to be no JIRA tracking this feature that has been already claimed publicly as a good part about YARN. If it did complete it, we should document how to use it. Setting the following only yields:

{code}
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
</property>
<property>
  <name>yarn.resourcemanager.zookeeper-store.address</name>
  <value>test.vm:2181/yarn-recovery-store</value>
</property>
{code}

{code}
Error starting ResourceManager
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
  at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
  at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
  at java.lang.Class.getConstructor0(Class.java:2706)
  at java.lang.Class.getDeclaredConstructor(Class.java:1985)
  at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
  ... 2 more
{code}

This JIRA is hence filed to track the addition/completion of recovery via ZK.
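The stack trace in the description above ends in Class.getDeclaredConstructor, which suggests that the store class is instantiated reflectively through a no-argument constructor that ZKStore does not declare. The sketch below reproduces that kind of failure with plain JDK reflection; `ArgOnlyStore` and `NoArgStore` are hypothetical classes, and this says nothing about what ZKStore's constructors actually look like.

```java
import java.lang.reflect.Constructor;

class ReflectiveStoreDemo {
    /** Hypothetical store whose only constructor takes an argument. */
    static class ArgOnlyStore {
        final String address;
        ArgOnlyStore(String address) { this.address = address; }
    }

    /** Hypothetical store with the no-argument constructor that reflection looks for. */
    static class NoArgStore {
        NoArgStore() { }
    }

    /** Instantiates a class through its no-argument constructor, or reports why it cannot. */
    static String tryInstantiate(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor(); // no parameter types
            ctor.setAccessible(true);
            ctor.newInstance();
            return "ok";
        } catch (NoSuchMethodException e) {
            // The message names the missing constructor as "<init>()".
            return "NoSuchMethodException: " + e.getMessage();
        } catch (ReflectiveOperationException e) {
            return "failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInstantiate(NoArgStore.class));
        System.out.println(tryInstantiate(ArgOnlyStore.class));
    }
}
```

A class that is meant to be loaded this way needs a no-argument constructor and has to receive its settings afterwards, for example through an initialization method.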
[jira] [Updated] (MAPREDUCE-4290) JobStatus.getState() API is giving ambiguous values
[ https://issues.apache.org/jira/browse/MAPREDUCE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated MAPREDUCE-4290:

Status: Open (was: Patch Available)

There need to be a couple of additional cases within the FINISHED state, to deal with KILLED/FAILED. Other than that the patch looks good. Another problem with the getAllJobs() API: it gets the application list from the RM, which means it will convert non-MapReduce apps as well. I don't believe there's any good way to differentiate between application types from the RM list.

JobStatus.getState() API is giving ambiguous values

Key: MAPREDUCE-4290 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4290 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Nishan Shetty Assignee: Devaraj K Attachments: MAPREDUCE-4290.patch

For a failed job, the getState() API reports the status as SUCCEEDED if we use JobClient.getAllJobs() to retrieve all job info from the RM.
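The review comment above asks for extra cases inside the FINISHED state. A rough sketch of such a mapping follows; the three enums are hypothetical stand-ins for the RM's application state, its final status, and the job state seen by the client, not the real YARN or MapReduce types, and treating an unknown final status as FAILED is an assumption made here.

```java
class JobStateMapping {
    /** Hypothetical stand-in for the application state reported by the RM. */
    enum AppState { NEW, SUBMITTED, RUNNING, FINISHED, FAILED, KILLED }

    /** Hypothetical stand-in for the final status of a finished application. */
    enum FinalStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }

    /** Hypothetical stand-in for the job state returned to the client. */
    enum JobState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    static JobState toJobState(AppState state, FinalStatus finalStatus) {
        switch (state) {
            case NEW:
            case SUBMITTED:
                return JobState.PREP;
            case RUNNING:
                return JobState.RUNNING;
            case FINISHED:
                // FINISHED only says the application ended; the final status says how.
                switch (finalStatus) {
                    case SUCCEEDED:
                        return JobState.SUCCEEDED;
                    case KILLED:
                        return JobState.KILLED;
                    default:
                        return JobState.FAILED; // assumption: FAILED and UNDEFINED both map here
                }
            case KILLED:
                return JobState.KILLED;
            default:
                return JobState.FAILED;
        }
    }

    public static void main(String[] args) {
        System.out.println(toJobState(AppState.FINISHED, FinalStatus.SUCCEEDED));
        System.out.println(toJobState(AppState.FINISHED, FinalStatus.FAILED));
        System.out.println(toJobState(AppState.FINISHED, FinalStatus.KILLED));
    }
}
```

Mapping FINISHED straight to SUCCEEDED, without consulting the final status, would produce exactly the symptom in the description: a failed job reported as SUCCEEDED.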
[jira] [Resolved] (MAPREDUCE-4343) ZK recovery support for ResourceManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy resolved MAPREDUCE-4343.

Resolution: Duplicate

Duplicate of MAPREDUCE-4326.

ZK recovery support for ResourceManager

Key: MAPREDUCE-4343 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4343 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Harsh J Attachments: MR-4343.1.patch

MAPREDUCE-279 included bits and pieces of possible ZK integration for YARN's RM, but looks like it failed to complete it (for scalability reasons? etc?) and there seems to be no JIRA tracking this feature that has been already claimed publicly as a good part about YARN. If it did complete it, we should document how to use it. Setting the following only yields:

{code}
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore</value>
</property>
<property>
  <name>yarn.resourcemanager.zookeeper-store.address</name>
  <value>test.vm:2181/yarn-recovery-store</value>
</property>
{code}

{code}
Error starting ResourceManager
java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
  at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128)
  at org.apache.hadoop.yarn.server.resourcemanager.recovery.StoreFactory.getStore(StoreFactory.java:32)
  at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:621)
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKStore.<init>()
  at java.lang.Class.getConstructor0(Class.java:2706)
  at java.lang.Class.getDeclaredConstructor(Class.java:1985)
  at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122)
  ... 2 more
{code}

This JIRA is hence filed to track the addition/completion of recovery via ZK.
[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons
[ https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396080#comment-13396080 ]

Siddharth Seth commented on MAPREDUCE-4306:

The -user option in general seems to be broken. Even after this patch, the AM will be localized as the original user, since the RM picks up the username from ugi. Maybe we should remove the -user option completely and use ApplicationConstants.Environment.USER in the AM, which is set by the RM anyway, based on the logged-in user.

Problem running Distributed Shell applications as a user other than the one started the daemons

Key: MAPREDUCE-4306 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Ahmed Radwan Assignee: Ahmed Radwan Fix For: 2.0.1-alpha Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch

Using the tarball, if you start the YARN daemons as one user and then switch to a different user, you can successfully run MR jobs, but DS jobs fail to run. DS jobs can only be run by the user who started the daemons.
[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396090#comment-13396090 ] Mayank Bansal commented on MAPREDUCE-4349: -- Distributed Cache gives inconsistent results if archive files get deleted from the task tracker. DC still thinks that it has the file even though the file has been deleted. Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker - Key: MAPREDUCE-4349 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 1.0.3, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal
[jira] [Updated] (MAPREDUCE-4326) Resurrect RM Restart
[ https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated MAPREDUCE-4326: -- Attachment: MR-4343.1.patch Bikas, The attached patch is originally created for MAPREDUCE-4343, which is marked as a duplicated task of this ticket. The patch may be a reference, so I attached it to this ticket. Resurrect RM Restart - Key: MAPREDUCE-4326 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha Reporter: Arun C Murthy Assignee: Bikas Saha Attachments: MR-4343.1.patch We should resurrect 'RM Restart' which we disabled sometime during the RM refactor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4350) Distributed Cache should put files read only on Task tracker
Mayank Bansal created MAPREDUCE-4350: Summary: Distributed Cache should put files read only on Task tracker Key: MAPREDUCE-4350 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4350 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 1.0.3, 0.22.0, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4350) Distributed Cache should put files read only on Task tracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396096#comment-13396096 ] Mayank Bansal commented on MAPREDUCE-4350: -- This issue is based on the comment posted by Robert: https://issues.apache.org/jira/browse/MAPREDUCE-4342?focusedCommentId=13395942&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13395942 Thanks, Mayank Distributed Cache should put files read only on Task tracker Key: MAPREDUCE-4350 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4350 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 0.22.0, 1.0.3, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal
[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart
[ https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396103#comment-13396103 ] Bikas Saha commented on MAPREDUCE-4326: --- Thanks! I will take a look before posting the design. Resurrect RM Restart - Key: MAPREDUCE-4326 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha Reporter: Arun C Murthy Assignee: Bikas Saha Attachments: MR-4343.1.patch We should resurrect 'RM Restart' which we disabled sometime during the RM refactor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4203) Create equivalent of ProcfsBasedProcessTree for Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396132#comment-13396132 ] Jonathan Eagles commented on MAPREDUCE-4203: Thanks, Bikas. Just trying to prevent Hadoop code from being contaminated with GPL or proprietary code licenses. Sounds like you are already controlling for that. Create equivalent of ProcfsBasedProcessTree for Windows --- Key: MAPREDUCE-4203 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4203 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Bikas Saha Assignee: Bikas Saha Attachments: MAPREDUCE-4203.branch-1-win.1.patch, MAPREDUCE-4203.patch, test.cpp ProcfsBasedProcessTree is used by the TaskTracker to get process information like memory and cpu usage. This information is used to manage resources etc. The current implementation is based on Linux procfs functionality and hence does not work on other platforms, specifically windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4288) ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running
[ https://issues.apache.org/jira/browse/MAPREDUCE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396130#comment-13396130 ] Karthik Kambatla commented on MAPREDUCE-4288: - In YARN, the ClusterMetrics should only correspond to numNodeManagers, numActiveJobs(), numActiveContainers(), availableResources(). Other job/app-specific metrics should move to the corresponding AMs. JobStatus would be a good place to have these metrics. Subsequently, JobClient.getClusterStatus() can correspond to the job-specific metrics (though the name would then be a misnomer). Comments? ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() is giving one when no job is running --- Key: MAPREDUCE-4288 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4288 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Nishan Shetty Assignee: Karthik Kambatla When no job is running in the cluster, invoke the ClusterStatus.getMapTasks() and ClusterStatus.getReduceTasks() APIs. Observed that these APIs return one instead of zero (as no job is running).
[jira] [Updated] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated MAPREDUCE-4335: -- Attachment: MR4335_4.txt Thanks for taking a look Arun. Updated the patch with the default scheduler defined in YarnConfiguration. Had to move the class loading into the ResourceManager instead of relying on Configuration.getClass... Change the default scheduler to the CapacityScheduler - Key: MAPREDUCE-4335 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335 Project: Hadoop Map/Reduce Issue Type: Task Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt There's some bugs in the FifoScheduler atm - doesn't distribute tasks across nodes and some headroom (available resource) issues. That's not the best experience for users trying out the 2.0 branch. The CS with the default configuration of a single queue behaves the same as the FifoScheduler and doesn't have these issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated MAPREDUCE-4335: -- Status: Patch Available (was: Open) Change the default scheduler to the CapacityScheduler - Key: MAPREDUCE-4335 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335 Project: Hadoop Map/Reduce Issue Type: Task Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt There's some bugs in the FifoScheduler atm - doesn't distribute tasks across nodes and some headroom (available resource) issues. That's not the best experience for users trying out the 2.0 branch. The CS with the default configuration of a single queue behaves the same as the FifoScheduler and doesn't have these issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4335) Change the default scheduler to the CapacityScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396178#comment-13396178 ] Hadoop QA commented on MAPREDUCE-4335: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532445/MR4335_4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 10 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-api hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2470//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2470//console This message is automatically generated. 
Change the default scheduler to the CapacityScheduler - Key: MAPREDUCE-4335 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4335 Project: Hadoop Map/Reduce Issue Type: Task Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: MR4335.txt, MR4335_2.txt, MR4335_3.txt, MR4335_4.txt There's some bugs in the FifoScheduler atm - doesn't distribute tasks across nodes and some headroom (available resource) issues. That's not the best experience for users trying out the 2.0 branch. The CS with the default configuration of a single queue behaves the same as the FifoScheduler and doesn't have these issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3235) Improve CPU cache behavior in map side sort
[ https://issues.apache.org/jira/browse/MAPREDUCE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396222#comment-13396222 ] Todd Lipcon commented on MAPREDUCE-3235: ---
bq. BTW, I know you are interested in JVM intrinsic binary array compare
I guess you're working with Krystal Mok? Cool stuff, I hope to see it make it into OpenJDK as well!
bq. Almost the same, depends on if there are rack local maps. the more rack local maps, the slower.
You mean it is slower if there are more rack-local (as opposed to data-local) maps, right? If everything is data-local (eg terasort on an empty cluster) then I would expect the CPU savings to make a more noticeable difference.
Improve CPU cache behavior in map side sort --- Key: MAPREDUCE-3235 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3235 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: map_sort_perf.diff, mr-3235-poc.txt
When running oprofile on a terasort workload, I noticed that a large amount of CPU usage was going to MapTask$MapOutputBuffer.compare. Upon disassembling this and looking at cycle counters, most of the cycles were going to memory loads dereferencing into the array of key-value data -- implying expensive cache misses. This can be avoided as follows:
- rather than simply swapping indexes into the kv array, swap the entire meta entries in the meta array. Swapping 16 bytes is only negligibly slower than swapping 4 bytes. This requires adding the value-length into the meta array, since we used to rely on the previous-in-the-array meta entry to determine this. So we replace INDEX with VALUELEN and avoid one layer of indirection.
- introduce an interface which allows key types to provide a 4-byte comparison proxy. For string keys, this can simply be the first 4 bytes of the string. The idea is that, if stringCompare(key1.proxy(), key2.proxy()) != 0, then compare(key1, key2) should have the same result. If the proxies are equal, the normal comparison method is used. We then include the 4-byte proxy as part of the metadata entry, so that for many cases the indirection into the data buffer can be avoided.
On a terasort benchmark, these optimizations plus an optimization to WritableComparator.compareBytes dropped the aggregate mapside CPU millis by 40%, and the compare() routine mostly dropped off the oprofile results.
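Editorial note on the comparison-proxy idea in MAPREDUCE-3235 above: a minimal sketch, assuming byte-string keys compared as unsigned bytes and a proxy made of the first 4 key bytes (zero-padded for shorter keys). The class and method names are hypothetical and this is not the code from the attached patches. It only illustrates the property the description relies on: when two proxies differ, their order already equals the order of the full keys, so the key bytes need not be touched.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrefixProxySort {
    /** A key plus its cached 4-byte proxy, standing in for one metadata entry. */
    static final class Entry {
        final byte[] key;
        final int proxy;

        Entry(byte[] key) {
            this.key = key;
            this.proxy = proxyOf(key);
        }
    }

    /** Pack the first 4 key bytes big-endian into an int; short keys are zero-padded. */
    static int proxyOf(byte[] key) {
        int p = 0;
        for (int i = 0; i < 4; i++) {
            int b = i < key.length ? (key[i] & 0xff) : 0;
            p = (p << 8) | b;
        }
        return p;
    }

    /** Full unsigned lexicographic comparison: the slow path that dereferences key data. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Compare proxies first; fall back to the full comparison only when they are equal. */
    static int compare(Entry a, Entry b) {
        int c = Integer.compareUnsigned(a.proxy, b.proxy);
        return c != 0 ? c : compareBytes(a.key, b.key);
    }

    /** Sort string keys via entries that carry their proxy alongside the key. */
    static String[] sortWithProxy(String... keys) {
        Entry[] entries = new Entry[keys.length];
        for (int i = 0; i < keys.length; i++) {
            entries[i] = new Entry(keys[i].getBytes(StandardCharsets.UTF_8));
        }
        Arrays.sort(entries, PrefixProxySort::compare);
        String[] out = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            out[i] = new String(entries[i].key, StandardCharsets.UTF_8);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortWithProxy("pear", "apple", "app", "apricot")));
    }
}
```

The proxies must be compared as unsigned values so that bytes above 0x7f order the same way as in the full byte comparison.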
[jira] [Commented] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons
[ https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13396223#comment-13396223 ] Ahmed Radwan commented on MAPREDUCE-4306: - Thanks Siddharth for the review! I agree, I think it is better to completely remove the -user option. I originally thought of just keeping it in case it can be used for testing or other purposes. But leaving it now may lead to confusion, and also setting it to something other than the original user will lead to failure as described above. Also reading ApplicationConstants.Environment.USER is simpler than reevaluating the username from ugi (which will give the same result after all). I have updated the patch accordingly. Thanks! Problem running Distributed Shell applications as a user other than the one started the daemons --- Key: MAPREDUCE-4306 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Ahmed Radwan Assignee: Ahmed Radwan Fix For: 2.0.1-alpha Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch Using the tarball, if you start the yarn daemons using one user and then switch to a different user. You can successfully run MR jobs, but DS jobs fail to run. Only able to run DS jobs using the user who started the daemons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4306) Problem running Distributed Shell applications as a user other than the one started the daemons
[ https://issues.apache.org/jira/browse/MAPREDUCE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4306: Attachment: MAPREDUCE-4306_rev3.patch Problem running Distributed Shell applications as a user other than the one started the daemons --- Key: MAPREDUCE-4306 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4306 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Ahmed Radwan Assignee: Ahmed Radwan Fix For: 2.0.1-alpha Attachments: MAPREDUCE-4306.patch, MAPREDUCE-4306_rev2.patch, MAPREDUCE-4306_rev3.patch Using the tarball, if you start the yarn daemons using one user and then switch to a different user. You can successfully run MR jobs, but DS jobs fail to run. Only able to run DS jobs using the user who started the daemons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4334) Add support for CPU isolation/monitoring of containers
[ https://issues.apache.org/jira/browse/MAPREDUCE-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396261#comment-13396261 ] Andrew Ferguson commented on MAPREDUCE-4334:
ok, putting all of this in the ContainerExecutor is not the way to go, as it precludes use of secure Hadoop's Linux container-executor. In my new design, ContainerMonitor will be a pluggable component, just as ContainerExecutor is now. Then, we can provide a ContainerMonitor which uses cgroups to control resource usage, rather than the existing ContainerMonitor (to be renamed as DefaultContainerMonitor). This has several advantages:
1) allows us to keep existing ContainerMonitor for users who can't use cgroups (eg, users without root access during Hadoop setup)
2) ContainerMonitor already receives an event when it's time to stop monitoring, which we can use as notification to delete the container's cgroup
3) ContainerMonitor receives the resource limits already; no need to calculate them based on the configs
4) A pluggable ContainerMonitor paves the way for ContainerMonitors on other platforms
I will first open a sub-task to make ContainerMonitor pluggable. The only trouble spot with this design is that it's not possible to move another non-root user's process into a cgroup. I plan to extend the secure container-executor to be able to make such a move. Please let me know if you have any feedback about this proposal. thank you, Andrew
Add support for CPU isolation/monitoring of containers -- Key: MAPREDUCE-4334 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4334 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Arun C Murthy Assignee: Arun C Murthy
Once we get in MAPREDUCE-4327, it will be important to actually enforce limits on CPU consumption of containers. Several options spring to mind:
# taskset (RHEL5+)
# cgroups (RHEL6+)
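Editorial note on the pluggable-monitor proposal in MAPREDUCE-4334 above: a minimal sketch of what "pluggable" could mean here, assuming the implementation class is named in configuration and instantiated reflectively, with a default used when nothing is configured. Everything in it is hypothetical: the interface, both implementations, and the configuration key are illustrative names, not the YARN API or the contents of any attached patch.

```java
import java.util.Map;

final class MonitorLoader {
    /** Hypothetical configuration key naming the monitor implementation class. */
    static final String MONITOR_CLASS_KEY = "example.container-monitor.class";

    /** Load the configured monitor, falling back to the default implementation. */
    static MonitorSketch load(Map<String, String> conf) {
        String className =
                conf.getOrDefault(MONITOR_CLASS_KEY, DefaultMonitorSketch.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(MonitorSketch.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load monitor " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load(java.util.Collections.emptyMap()).describe());
    }
}

/** Hypothetical plug-in point for container resource monitoring. */
interface MonitorSketch {
    String describe();
}

/** Stand-in for the existing monitor, kept for users who cannot use cgroups. */
class DefaultMonitorSketch implements MonitorSketch {
    public String describe() {
        return "default monitor";
    }
}

/** Stand-in for a cgroups-based monitor; it performs no real cgroup operations. */
class CgroupsMonitorSketch implements MonitorSketch {
    public String describe() {
        return "cgroups monitor";
    }
}
```

With this shape, switching enforcement strategy is a configuration change rather than a code change, which is the property the proposal is after.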
[jira] [Updated] (MAPREDUCE-4203) Create equivalent of ProcfsBasedProcessTree for Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-4203: -- Attachment: MAPREDUCE-4203.branch-1-win.2.patch Fix some bugs in formatting. TestTaskTrackerMemoryManager now passes on Windows and tests the feature functionally. Create equivalent of ProcfsBasedProcessTree for Windows --- Key: MAPREDUCE-4203 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4203 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Bikas Saha Assignee: Bikas Saha Attachments: MAPREDUCE-4203.branch-1-win.1.patch, MAPREDUCE-4203.branch-1-win.2.patch, MAPREDUCE-4203.patch, test.cpp ProcfsBasedProcessTree is used by the TaskTracker to get process information like memory and cpu usage. This information is used to manage resources etc. The current implementation is based on Linux procfs functionality and hence does not work on other platforms, specifically windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396386#comment-13396386 ] Konstantin Shvachko commented on MAPREDUCE-4342:
Mayank, the patch is not applying as is. Namely the empty line change in TrackerDistributedCacheManager. You can just leave the line there. I did that, but then it is not compiling. You need to sync it with the repo.
- Could you also change "is been" to "has been" as Robert suggested.
- And add spaces between method parameters.
- Reporting the results of test-patch and test builds would be very useful, since we don't have Jenkins to verify that for 0.22.
The fix looks good modulo the JIRAs you opened.
Distributed Cache gives inconsistent result if cache files get deleted from task tracker - Key: MAPREDUCE-4342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 1.0.3, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22.patch
[jira] [Updated] (MAPREDUCE-4351) Make ContainersMonitor pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ferguson updated MAPREDUCE-4351: --- Attachment: MAPREDUCE-4351-v1.patch First cut at making ContainersMonitor pluggable. I have tested that the new configuration option is used, and that it works with a local cluster. Make ContainersMonitor pluggable Key: MAPREDUCE-4351 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4351 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2, nodemanager Reporter: Andrew Ferguson Attachments: MAPREDUCE-4351-v1.patch Make the existing ContainersMonitor pluggable, just as the ContainerExecutor is currently. This will allow us to add container resource enforcement using other techniques (such as cgroups) in an extensible fashion.
[jira] [Commented] (MAPREDUCE-4351) Make ContainersMonitor pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396411#comment-13396411 ] Andrew Ferguson commented on MAPREDUCE-4351: The bulk of the lines in the patch rename ContainersMonitorImpl.java to DefaultContainersMonitor.java, and TestContainersMonitor.java to TestDefaultContainersMonitor.java. Make ContainersMonitor pluggable Key: MAPREDUCE-4351 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4351 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2, nodemanager Reporter: Andrew Ferguson Attachments: MAPREDUCE-4351-v1.patch Make the existing ContainersMonitor pluggable, just as the ContainerExecutor is currently. This will allow us to add container resource enforcement using other techniques (such as cgroups) in an extensible fashion.
[jira] [Updated] (MAPREDUCE-3868) Reenable Raid
[ https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated MAPREDUCE-3868: -- Issue Type: Bug (was: New Feature) Reenable Raid - Key: MAPREDUCE-3868 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Reporter: Scott Chen Assignee: Weiyan Wang Attachments: MAPREDUCE-3868-1.patch, MAPREDUCE-3868-2.patch, MAPREDUCE-3868-3.patch, MAPREDUCE-3868.patch, MAPREDUCE-3868v1.patch, MAPREDUCE-3868v1.sh Currently Raid is outdated and not compiled. Make it compile. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-3868) Reenable Raid
[ https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen resolved MAPREDUCE-3868. --- Resolution: Fixed Hadoop Flags: Reviewed I just committed this. Thanks, Weiyan. Reenable Raid - Key: MAPREDUCE-3868 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Reporter: Scott Chen Assignee: Weiyan Wang Attachments: MAPREDUCE-3868-1.patch, MAPREDUCE-3868-2.patch, MAPREDUCE-3868-3.patch, MAPREDUCE-3868.patch, MAPREDUCE-3868v1.patch, MAPREDUCE-3868v1.sh Currently Raid is outdated and not compiled. Make it compile. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3868) Reenable Raid
[ https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396434#comment-13396434 ]
Hudson commented on MAPREDUCE-3868:
Integrated in Hadoop-Common-trunk-Commit #2369 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2369/])
MAPREDUCE-3868. Make Raid Compile. (Weiyan Wang via schen) (Revision 1351548)
Result = SUCCESS
schen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1351548
Files :
* /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-raid-dist.xml
* /hadoop/common/trunk/hadoop-dist/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/conf
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/datanode/RaidBlockSender.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRaidUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/BlockFixer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DirectoryTraversal.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaid.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/GaloisField.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/JobMonitor.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/ReedSolomonCode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/sbin
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/hdfs/TestRaidDfs.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestBlockFixer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestDirectoryTraversal.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestErasureCodes.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidFilter.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidHar.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidPurge.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShell.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestRaidShellFsck.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonDecoder.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/test/java/org/apache/hadoop/raid/TestReedSolomonEncoder.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* /hadoop/common/trunk/hadoop-hdfs-project/pom.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/bin
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/conf
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org
* /hadoop/common/trunk/hadoop-project/pom.xml
Reenable Raid Key: MAPREDUCE-3868 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Reporter: Scott Chen
[jira] [Commented] (MAPREDUCE-3868) Reenable Raid
[ https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396437#comment-13396437 ]
Hudson commented on MAPREDUCE-3868:
Integrated in Hadoop-Hdfs-trunk-Commit #2439 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2439/])
MAPREDUCE-3868. Make Raid Compile. (Weiyan Wang via schen) (Revision 1351548)
Result = SUCCESS
schen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1351548
Files : identical to the list in the Hadoop-Common-trunk-Commit #2369 notification above.
Reenable Raid Key: MAPREDUCE-3868 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Reporter: Scott Chen Assignee:
[jira] [Commented] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396442#comment-13396442 ] Ahmed Radwan commented on MAPREDUCE-4336: The fix looks fairly straightforward: set the queue name for GetQueueInfoRequest, and also use "default" as the queue name if none is specified on the command line. I'll upload a patch now. Distributed Shell fails when used with the CapacityScheduler Key: MAPREDUCE-4336 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Siddharth Seth DistributedShell attempts to get queue info without providing a queue name - which ends up in an NPE.
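The fix described in the comment above (always set a queue name on the request, and fall back to "default" when none is given on the command line) might look roughly like the sketch below. This is an illustration only, not the contents of MAPREDUCE-4336.patch: `QueueInfoRequest` is a hypothetical stand-in for YARN's GetQueueInfoRequest so that the example compiles without Hadoop on the classpath, and `resolveQueueName` and `buildRequest` are made-up helper names.

```java
public class QueueNameDefaults {
    // Fallback queue name, as proposed in the comment above.
    public static final String DEFAULT_QUEUE = "default";

    // Hypothetical stand-in for the real request record (NOT the YARN class).
    public static final class QueueInfoRequest {
        private String queueName;

        public void setQueueName(String queueName) {
            this.queueName = queueName;
        }

        public String getQueueName() {
            return queueName;
        }
    }

    // Returns the queue name from the command line, or "default" if it is absent or blank.
    public static String resolveQueueName(String cliValue) {
        if (cliValue == null || cliValue.trim().isEmpty()) {
            return DEFAULT_QUEUE;
        }
        return cliValue.trim();
    }

    // Builds a request that always carries a non-null queue name,
    // which is what avoids the NPE described in the issue.
    public static QueueInfoRequest buildRequest(String cliValue) {
        QueueInfoRequest request = new QueueInfoRequest();
        request.setQueueName(resolveQueueName(cliValue));
        return request;
    }

    public static void main(String[] args) {
        String cliValue = args.length > 0 ? args[0] : null;
        System.out.println(buildRequest(cliValue).getQueueName());
    }
}
```

Resolving the name in one place means every code path that builds a request gets the same fallback, rather than each caller checking for null separately.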
[jira] [Updated] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4336: Attachment: MAPREDUCE-4336.patch I have manually tested the patch by successfully submitting/running DS jobs on both the capacity and fifo schedulers. Distributed Shell fails when used with the CapacityScheduler Key: MAPREDUCE-4336 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Siddharth Seth Attachments: MAPREDUCE-4336.patch DistributedShell attempts to get queue info without providing a queue name - which ends up in an NPE.
[jira] [Assigned] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan reassigned MAPREDUCE-4336: Assignee: Ahmed Radwan Distributed Shell fails when used with the CapacityScheduler Key: MAPREDUCE-4336 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Siddharth Seth Assignee: Ahmed Radwan Attachments: MAPREDUCE-4336.patch DistributedShell attempts to get queue info without providing a queue name - which ends up in an NPE.
[jira] [Commented] (MAPREDUCE-4336) Distributed Shell fails when used with the CapacityScheduler
[ https://issues.apache.org/jira/browse/MAPREDUCE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396450#comment-13396450 ]
Hadoop QA commented on MAPREDUCE-4336:
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532494/MAPREDUCE-4336.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 javadoc. The javadoc tool appears to have generated 13 warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2473//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2473//console
This message is automatically generated.
Distributed Shell fails when used with the CapacityScheduler Key: MAPREDUCE-4336 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4336 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.0-alpha Reporter: Siddharth Seth Assignee: Ahmed Radwan Attachments: MAPREDUCE-4336.patch DistributedShell attempts to get queue info without providing a queue name - which ends up in an NPE.
[jira] [Commented] (MAPREDUCE-4350) Distributed Cache should put files read only on Task tracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396451#comment-13396451 ] Kang Xiao commented on MAPREDUCE-4350: +1. It will prevent tasks from writing to the same file in the DistributedCache directory. Distributed Cache should put files read only on Task tracker Key: MAPREDUCE-4350 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4350 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache Affects Versions: 0.22.0, 1.0.3, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal
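As a rough illustration of what "read only" could mean for a localized cache file, the sketch below strips the write bits from a file after it has been copied to local disk. It uses only the standard java.nio.file API and assumes a POSIX file system; the class and method names (`ReadOnlyCacheFile`, `stripWriteBits`, `demoOnTempFile`) are hypothetical and are not taken from the TaskTracker or DistributedCache code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.HashSet;
import java.util.Set;

public class ReadOnlyCacheFile {
    // Removes owner/group/other write permission while leaving read and
    // execute bits untouched, then returns the permissions now on disk.
    public static Set<PosixFilePermission> stripWriteBits(Path file) {
        try {
            Set<PosixFilePermission> perms =
                new HashSet<>(Files.getPosixFilePermissions(file));
            perms.remove(PosixFilePermission.OWNER_WRITE);
            perms.remove(PosixFilePermission.GROUP_WRITE);
            perms.remove(PosixFilePermission.OTHERS_WRITE);
            Files.setPosixFilePermissions(file, perms);
            return Files.getPosixFilePermissions(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a temporary file standing in for a localized cache file,
    // makes it read-only, and reports whether all write bits are gone.
    public static boolean demoOnTempFile() {
        try {
            Path localized = Files.createTempFile("cache-file", ".dat");
            try {
                Set<PosixFilePermission> after = stripWriteBits(localized);
                return !after.contains(PosixFilePermission.OWNER_WRITE)
                    && !after.contains(PosixFilePermission.GROUP_WRITE)
                    && !after.contains(PosixFilePermission.OTHERS_WRITE);
            } finally {
                Files.delete(localized);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("write bits removed: " + demoOnTempFile());
    }
}
```

Removing only the write bits, rather than setting a fixed mode, keeps any execute permission that a cached script or binary may need.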
[jira] [Commented] (MAPREDUCE-3868) Reenable Raid
[ https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396454#comment-13396454 ]
Hudson commented on MAPREDUCE-3868:
Integrated in Hadoop-Mapreduce-trunk-Commit #2388 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2388/])
MAPREDUCE-3868. Make Raid Compile. (Weiyan Wang via schen) (Revision 1351548)
Result = FAILURE
schen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1351548
Files : identical to the list in the Hadoop-Common-trunk-Commit #2369 notification above.
Reenable Raid Key: MAPREDUCE-3868 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Reporter: Scott Chen
[jira] [Commented] (MAPREDUCE-3868) Reenable Raid
[ https://issues.apache.org/jira/browse/MAPREDUCE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396462#comment-13396462 ] Andrew Purtell commented on MAPREDUCE-3868: Can we get this on branch-2? Reenable Raid Key: MAPREDUCE-3868 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3868 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/raid Reporter: Scott Chen Assignee: Weiyan Wang Attachments: MAPREDUCE-3868-1.patch, MAPREDUCE-3868-2.patch, MAPREDUCE-3868-3.patch, MAPREDUCE-3868.patch, MAPREDUCE-3868v1.patch, MAPREDUCE-3868v1.sh Currently Raid is outdated and not compiled. Make it compile.
[jira] [Reopened] (MAPREDUCE-4345) ZK-based High Availability (HA) for ResourceManager (RM)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened MAPREDUCE-4345: Thanks Bikas! Agree it is related to resurrecting RM restart. Arun - It isn't a duplicate, at least the way I see it: MAPREDUCE-4326 targets restart-recovery, while I opened this one to target proper HA (multiple RMs, failing over automatically, with client code covered too). It is what may come after restart-ability is achieved. Thanks, I've reopened it :) ZK-based High Availability (HA) for ResourceManager (RM) Key: MAPREDUCE-4345 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4345 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Harsh J Assignee: Bikas Saha One of the goals presented on MAPREDUCE-279 was to have high availability. One way that was discussed, per Mahadev/others on https://issues.apache.org/jira/browse/MAPREDUCE-2648 and other places, was ZK: {quote} Am not sure, if you already know about the MR-279 branch (the next version of MR framework). We've been trying to integrate ZK into the framework from the beginning. As for now, we are just doing restart with ZK but soon we should have a HA soln with ZK. {quote} There is now MAPREDUCE-4343 that tracks recoverability via ZK. This JIRA is meant to track HA via ZK. Currently there isn't an HA solution for RM, via ZK or otherwise.