[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401997#comment-13401997 ] Ahmed Radwan commented on MAPREDUCE-4346: - Thanks Arun and Tucu for the comments! Tucu, I have modified the semantics, so the retired flag doesn't override the status filter. Also updated the newly added tests to reflect that. I have replaced the inner loop by the HashSet per Arun's suggestion too. Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4346: Attachment: MAPREDUCE-4346_rev4.patch Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Radwan updated MAPREDUCE-4346: Status: Patch Available (was: Open) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402006#comment-13402006 ] Hadoop QA commented on MAPREDUCE-4346: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533607/MAPREDUCE-4346_rev4.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2519//console This message is automatically generated. Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402013#comment-13402013 ] Arun C Murthy commented on MAPREDUCE-4346: -- Asking again, what is the use case? I really don't like the api... particularly since it's a public api. Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily.
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402047#comment-13402047 ] Ahmed Radwan commented on MAPREDUCE-4346: - As I highlighted in the ticket description above: The JobClient only exposes a getAllJobs() which returns all submitted jobs in any state, the result also includes all retired jobs. This list is long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific states. One use case is a monitoring service that uses the JobClient and periodically calls getAllJobs() to keep track of submitted jobs. Just using the current getAllJobs() will represent a communication overhead because the returned list is unnecessarily long with redundant information (when called periodically). The new api provides a way for clients to selectively filter the long list which is normally returned by getAllJobs(). The Client can now specify as part of the call: the job statuses of interest and if including retired jobs is desired or not. What do you think Arun? Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily. 
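The filtering semantics discussed in this thread (a status filter backed by a HashSet, plus a retired flag that does not override the status filter) can be sketched roughly as follows. This is only an illustration under stated assumptions: `JobInfo`, `filterJobs` and the state strings are hypothetical names and are not taken from the MAPREDUCE-4346 patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the MAPREDUCE-4346 patch.
final class JobFilterSketch {

    // Stand-in for a job status record; the real JobStatus carries more fields.
    static final class JobInfo {
        final String id;
        final String state;    // illustrative values such as "RUNNING" or "SUCCEEDED"
        final boolean retired;

        JobInfo(String id, String state, boolean retired) {
            this.id = id;
            this.state = state;
            this.retired = retired;
        }
    }

    // Returns the ids of jobs whose state is in wantedStates. Retired jobs are
    // considered only when includeRetired is true, and even then they must
    // still match the status filter (the flag does not override it).
    static List<String> filterJobs(List<JobInfo> all, Set<String> wantedStates,
                                   boolean includeRetired) {
        List<String> result = new ArrayList<String>();
        for (JobInfo job : all) {
            if (job.retired && !includeRetired) {
                continue;
            }
            // Set lookup replaces an inner loop over the wanted statuses.
            if (wantedStates.contains(job.state)) {
                result.add(job.id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<JobInfo> jobs = Arrays.asList(
            new JobInfo("job_1", "RUNNING", false),
            new JobInfo("job_2", "SUCCEEDED", false),
            new JobInfo("job_3", "SUCCEEDED", true));
        Set<String> wanted = new HashSet<String>(Arrays.asList("SUCCEEDED"));
        System.out.println(filterJobs(jobs, wanted, false));
        System.out.println(filterJobs(jobs, wanted, true));
    }
}
```

A monitoring client of the kind described in the comment would call something like `filterJobs` with only the states it tracks, so the periodic response stays short.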
[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4372: - Attachment: MAPREDUCE-4372-1.patch Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager -- Key: MAPREDUCE-4372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out Please find the attached resource manager thread dump for the issue.
[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4372: - Status: Open (was: Patch Available) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager -- Key: MAPREDUCE-4372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out Please find the attached resource manager thread dump for the issue.
[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated MAPREDUCE-4372: - Status: Patch Available (was: Open) Thanks a lot Robert for looking into the patch. I have updated the patch as per your suggestion. Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager -- Key: MAPREDUCE-4372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out Please find the attached resource manager thread dump for the issue.
[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402126#comment-13402126 ] Hadoop QA commented on MAPREDUCE-4372: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533636/MAPREDUCE-4372-1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2520//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2520//console This message is automatically generated. Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager -- Key: MAPREDUCE-4372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out Please find the attached resource manager thread dump for the issue. 
[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402155#comment-13402155 ] Hudson commented on MAPREDUCE-4228: --- Integrated in Hadoop-Hdfs-trunk #1089 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1089/]) MAPREDUCE-4228. mapreduce.job.reduce.slowstart.completedmaps is not working properly (Jason Lowe via bobby) (Revision 1354181) Result = FAILURE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1354181 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks Key: MAPREDUCE-4228 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2 Affects Versions: 0.23.1 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 0.23.3, 2.0.1-alpha, 3.0.0 Attachments: MAPREDUCE-4228.patch, MAPREDUCE-4228.patch, MAPREDUCE-4228.patch If no more map tasks need to be scheduled but not all have completed, the ApplicationMaster will start scheduling reducers even if the number of completed maps has not met the mapreduce.job.reduce.slowstart.completedmaps threshold. For example, if the property is set to 1.0 all maps should complete before any reducers are scheduled. However the reducers are scheduled as soon as the last map task is assigned to a container. For a job with very long-running maps, a cluster with enough capacity to launch all map tasks could cause reducers to launch prematurely and waste cluster resources. 
Thanks to Phil Su for discovering this issue.
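The intended slowstart behaviour described in the issue can be illustrated with a small sketch: reducers become eligible only once the number of completed maps reaches the threshold, regardless of how many maps have merely been assigned to containers. This is not the RMContainerAllocator code; `SlowstartSketch` and `reducersMayStart` are hypothetical names used only for illustration.

```java
// Hypothetical sketch of the intended threshold check; not the actual
// RMContainerAllocator implementation.
final class SlowstartSketch {

    // slowstart plays the role of mapreduce.job.reduce.slowstart.completedmaps,
    // a fraction between 0.0 and 1.0.
    static boolean reducersMayStart(int completedMaps, int totalMaps, float slowstart) {
        if (totalMaps == 0) {
            return true; // nothing to wait for
        }
        // Number of maps that must have completed before reducers are scheduled.
        int requiredCompletedMaps = (int) Math.ceil(slowstart * totalMaps);
        // The decision depends on completed maps only, never on whether
        // every map has already been handed to a container.
        return completedMaps >= requiredCompletedMaps;
    }

    public static void main(String[] args) {
        // All 10 maps are assigned but only 3 have finished; with a threshold
        // of 1.0 the reducers should still be held back.
        System.out.println(reducersMayStart(3, 10, 1.0f));
        System.out.println(reducersMayStart(10, 10, 1.0f));
    }
}
```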
[jira] [Created] (MAPREDUCE-4378) hadoop-validate-setup.sh fails to execute kinit command in secure mode
Nishan Shetty created MAPREDUCE-4378: Summary: hadoop-validate-setup.sh fails to execute kinit command in secure mode Key: MAPREDUCE-4378 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4378 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.0.1-alpha, 3.0.0 Environment: SUSE Linux Enterprise Server 11 (x86_64) VERSION = 11 PATCHLEVEL = 1 Reporter: Nishan Shetty hadoop-validate-setup.sh is referring to an invalid kinit location.
[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402198#comment-13402198 ] Hudson commented on MAPREDUCE-4228: --- Integrated in Hadoop-Hdfs-0.23-Build #299 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/299/]) svn merge -c 1354181 FIXES: MAPREDUCE-4228. mapreduce.job.reduce.slowstart.completedmaps is not working properly (Jason Lowe via bobby) (Revision 1354185) Result = UNSTABLE bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1354185 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks Key: MAPREDUCE-4228 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mrv2 Affects Versions: 0.23.1 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 0.23.3, 2.0.1-alpha, 3.0.0 Attachments: MAPREDUCE-4228.patch, MAPREDUCE-4228.patch, MAPREDUCE-4228.patch If no more map tasks need to be scheduled but not all have completed, the ApplicationMaster will start scheduling reducers even if the number of completed maps has not met the mapreduce.job.reduce.slowstart.completedmaps threshold. For example, if the property is set to 1.0 all maps should complete before any reducers are scheduled. However the reducers are scheduled as soon as the last map task is assigned to a container. 
For a job with very long-running maps, a cluster with enough capacity to launch all map tasks could cause reducers to launch prematurely and waste cluster resources. Thanks to Phil Su for discovering this issue.
[jira] [Updated] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avner BenHanoch updated MAPREDUCE-4049: --- Attachment: HADOOP-1.x.y-review-oriented.patch This patch replaces all my previous patches. It is written in order to ease code review, by doing just the minimal changes in existing code. *I believe anyone can verify this patch at glance!* (my old patches included design enhancements by moving plugins' shared code out of ReduceCopier into plugins' base class, and by making ReduceCopier a standalone class instead of being inner class of ReduceTask). plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Attachments: HADOOP-1.0.2.patch, HADOOP-1.0.x.patch, HADOOP-1.1.patch, HADOOP-1.x.y-review-oriented.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. 
Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch)
[jira] [Commented] (MAPREDUCE-4377) TaskRunner javaopts parsing doesn't handle embedded spaces
[ https://issues.apache.org/jira/browse/MAPREDUCE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402230#comment-13402230 ] Robert Joseph Evans commented on MAPREDUCE-4377: John, that is very true, and if you can fix it I would be very happy to commit it for you. However, I don't think this is the only place in the code that has problems with embedded spaces. I'm not saying that we should not fix it, we should, just be aware that there be monsters here. Also be aware that there may be some Windows vs. POSIX (bash) issues that you may run into with trying to parse the arguments. Hopefully not too much though. TaskRunner javaopts parsing doesn't handle embedded spaces -- Key: MAPREDUCE-4377 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4377 Project: Hadoop Map/Reduce Issue Type: Bug Components: task-controller Affects Versions: trunk Environment: java options containing escaped or non-escaped embedded spaces. Reporter: John Gordon TaskRunner::GetVMArgs reads getChildJavaOpts as one space-delimited string, then splits it on ' ' and tries to reason on individual options from there. The problem with this approach is that java options may contain embedded spaces in many legitimate cases -- this means it is reasoning on incomplete option strings and cannot do appropriate preprocessing to do things like handle escape characters or matched quotation marks.
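For illustration, a quote-aware and escape-aware split of an options string might look like the sketch below. It is not the TaskRunner code and not a proposed patch; `JavaOptsTokenizerSketch` and `tokenize` are hypothetical names. It assumes POSIX-shell-like rules (backslash escapes the next character, single or double quotes group text), which is exactly the kind of assumption that would need revisiting on Windows, where backslash is a path separator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a quote- and escape-aware tokenizer, assuming
// POSIX-shell-like conventions. Not the TaskRunner implementation.
final class JavaOptsTokenizerSketch {

    static List<String> tokenize(String opts) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        char quote = 0;          // active quote character, or 0 when outside quotes
        boolean escaped = false; // true immediately after a backslash
        boolean inToken = false; // true once the current token has started

        for (int i = 0; i < opts.length(); i++) {
            char c = opts.charAt(i);
            if (escaped) {
                current.append(c);       // take the escaped character literally
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
                inToken = true;
            } else if (quote != 0) {
                if (c == quote) {
                    quote = 0;           // closing quote, not part of the token
                } else {
                    current.append(c);   // spaces inside quotes are preserved
                }
            } else if (c == '"' || c == '\'') {
                quote = c;
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {           // unquoted whitespace ends the token
                    tokens.add(current.toString());
                    current.setLength(0);
                    inToken = false;
                }
            } else {
                current.append(c);
                inToken = true;
            }
        }
        if (inToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-Xmx512m -Dfoo=\"a b\" -Dbar=c\\ d"));
    }
}
```

A plain split on ' ' would break the second and third options of the example in `main` into fragments, which is the failure mode the issue describes.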
[jira] [Created] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts
Devaraj K created MAPREDUCE-4379: Summary: Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts Key: MAPREDUCE-4379 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, nodemanager Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Priority: Critical
{code:xml}
Exception in thread "Container Monitor" java.lang.OutOfMemoryError: Java heap space
	at java.io.BufferedReader.<init>(BufferedReader.java:80)
	at java.io.BufferedReader.<init>(BufferedReader.java:91)
	at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:410)
	at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:171)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:389)
Exception in thread "LocalizerRunner for container_1340690914008_10890_01_03" java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:3209)
	at java.lang.String.<init>(String.java:215)
	at com.sun.org.apache.xerces.internal.xni.XMLString.toString(XMLString.java:185)
	at com.sun.org.apache.xerces.internal.parsers.AbstractDOMParser.characters(AbstractDOMParser.java:1188)
	at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.characters(XIncludeHandler.java:1084)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:464)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
	at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
	at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
	at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1738)
	at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1689)
	at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1635)
	at org.apache.hadoop.conf.Configuration.set(Configuration.java:722)
	at org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1300)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(ContainerLocalizer.java:375)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:127)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:103)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:862)
{code}
[jira] [Commented] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts
[ https://issues.apache.org/jira/browse/MAPREDUCE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402244#comment-13402244 ] Devaraj K commented on MAPREDUCE-4379: --
{code:title=ContainerLocalizer.java|borderStyle=solid}
this.appDirs = new LocalDirAllocator(String.format(APPCACHE_CTXT_FMT, appId));
this.userDirs = new LocalDirAllocator(String.format(USERCACHE_CTXT_FMT, appId));
this.pendingResources = new HashMap<LocalResource,Future<Path>>();
{code}
Here, for every application during localization, it creates two LocalDirAllocator instances.
{code:title=LocalDirAllocator.java|borderStyle=solid}
private AllocatorPerContext obtainContext(String contextCfgItemName) {
  synchronized (contexts) {
    AllocatorPerContext l = contexts.get(contextCfgItemName);
    if (l == null) {
      contexts.put(contextCfgItemName, (l = new AllocatorPerContext(contextCfgItemName)));
    }
    return l;
  }
}
{code}
Those two instances internally create AllocatorPerContext instances and add them to contexts while obtaining contexts. Entries keep being added for every application, and nowhere are they removed from the map. This leads to an OOM after running for some time.
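The growth pattern described in this comment can be reproduced in isolation with a small sketch. It only models the shape of the problem: a static map keyed by a per-application context name, with entries added on first use and never removed. `ContextLeakSketch`, the format strings, and the `removeContext` cleanup hook are hypothetical names chosen for illustration; they are not claimed to match the Hadoop source or any attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a static per-context cache that grows with every
// application id. Not the LocalDirAllocator implementation.
final class ContextLeakSketch {

    // Stand-in for the static contexts map; values are placeholders.
    private static final Map<String, Object> contexts = new HashMap<String, Object>();

    // Mirrors the get-or-create pattern quoted above.
    static Object obtainContext(String contextName) {
        synchronized (contexts) {
            Object ctx = contexts.get(contextName);
            if (ctx == null) {
                ctx = new Object();
                contexts.put(contextName, ctx);
            }
            return ctx;
        }
    }

    // Hypothetical cleanup hook: drop an entry once its application is done.
    static void removeContext(String contextName) {
        synchronized (contexts) {
            contexts.remove(contextName);
        }
    }

    static int size() {
        synchronized (contexts) {
            return contexts.size();
        }
    }

    public static void main(String[] args) {
        // Two context names per application, as in the ContainerLocalizer snippet.
        for (int app = 0; app < 1000; app++) {
            obtainContext(String.format("sketch.appcache.%d", app));
            obtainContext(String.format("sketch.usercache.%d", app));
        }
        System.out.println("entries after 1000 apps: " + size());

        // With a cleanup step per finished application the map shrinks again.
        for (int app = 0; app < 1000; app++) {
            removeContext(String.format("sketch.appcache.%d", app));
            removeContext(String.format("sketch.usercache.%d", app));
        }
        System.out.println("entries after cleanup: " + size());
    }
}
```

Because the map is static, its entries stay reachable for the lifetime of the process, so without some removal step the heap usage grows with the number of applications the node has ever localized.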
Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts
--
Key: MAPREDUCE-4379
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, nodemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical
{code:xml}
Exception in thread "Container Monitor" java.lang.OutOfMemoryError: Java heap space
  at java.io.BufferedReader.<init>(BufferedReader.java:80)
  at java.io.BufferedReader.<init>(BufferedReader.java:91)
  at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:410)
  at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:171)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:389)
Exception in thread "LocalizerRunner for container_1340690914008_10890_01_03" java.lang.OutOfMemoryError: Java heap space
  at java.util.Arrays.copyOfRange(Arrays.java:3209)
  at java.lang.String.<init>(String.java:215)
  at com.sun.org.apache.xerces.internal.xni.XMLString.toString(XMLString.java:185)
  at com.sun.org.apache.xerces.internal.parsers.AbstractDOMParser.characters(AbstractDOMParser.java:1188)
  at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.characters(XIncludeHandler.java:1084)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:464)
  at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
  at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
  at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
  at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
  at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
  at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
  at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1738)
  at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1689)
  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1635)
  at org.apache.hadoop.conf.Configuration.set(Configuration.java:722)
  at org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1300)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.initDirs(ContainerLocalizer.java:375)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:127)
  at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:103)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:862)
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators:
[jira] [Updated] (MAPREDUCE-4379) Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts
[ https://issues.apache.org/jira/browse/MAPREDUCE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Joseph Evans updated MAPREDUCE-4379:
---
Target Version/s: 0.23.3
Affects Version/s: 0.23.3
I really would like to see this go into 0.23 as well.
Node Manager throws java.lang.OutOfMemoryError: Java heap space due to org.apache.hadoop.fs.LocalDirAllocator.contexts
--
Key: MAPREDUCE-4379
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4379
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, nodemanager
Affects Versions: 0.23.3, 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4228) mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402252#comment-13402252 ]
Hudson commented on MAPREDUCE-4228:
---
Integrated in Hadoop-Mapreduce-trunk #1122 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1122/])
MAPREDUCE-4228. mapreduce.job.reduce.slowstart.completedmaps is not working properly (Jason Lowe via bobby) (Revision 1354181)
Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354181
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java
mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks
Key: MAPREDUCE-4228
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4228
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster, mrv2
Affects Versions: 0.23.1
Reporter: Jason Lowe
Assignee: Jason Lowe
Fix For: 0.23.3, 2.0.1-alpha, 3.0.0
Attachments: MAPREDUCE-4228.patch, MAPREDUCE-4228.patch, MAPREDUCE-4228.patch
If no more map tasks need to be scheduled but not all have completed, the ApplicationMaster will start scheduling reducers even if the number of completed maps has not met the mapreduce.job.reduce.slowstart.completedmaps threshold. For example, if the property is set to 1.0 all maps should complete before any reducers are scheduled. However the reducers are scheduled as soon as the last map task is assigned to a container. For a job with very long-running maps, a cluster with enough capacity to launch all map tasks could cause reducers to launch prematurely and waste cluster resources. Thanks to Phil Su for discovering this issue.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
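The intended slowstart behavior from the MAPREDUCE-4228 description above can be sketched as a small check on completed maps. This is an illustration only, not the code in RMContainerAllocator: the class and method names are hypothetical, and rounding the required count up is an assumption made for the sketch.

```java
/** Illustrative slowstart gate; names and rounding are assumptions, not the Hadoop code. */
final class SlowstartSketch {
  /**
   * Returns true once enough maps have COMPLETED (not merely been assigned)
   * to satisfy the slowstart fraction, e.g. 1.0 means all maps must finish.
   */
  static boolean canScheduleReduces(int completedMaps, int totalMaps, float slowstart) {
    if (totalMaps == 0) {
      return true; // nothing to wait for
    }
    int required = (int) Math.ceil(slowstart * totalMaps);
    return completedMaps >= required;
  }

  public static void main(String[] args) {
    // All 10 maps are running in containers, but only 9 have completed.
    System.out.println(canScheduleReduces(9, 10, 1.0f));
    System.out.println(canScheduleReduces(10, 10, 1.0f));
  }
}
```

The point of the sketch is that the gate depends on the completed count alone; the number of maps still waiting for a container does not appear in the condition.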
[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402255#comment-13402255 ] Robert Joseph Evans commented on MAPREDUCE-4372: Changes look good to me +1. Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager -- Key: MAPREDUCE-4372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out Please find the attached resource manager thread dump for the issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4372: --- Resolution: Fixed Fix Version/s: 3.0.0 2.0.1-alpha Status: Resolved (was: Patch Available) Thanks Devaraj, I put this into trunk and branch-2. Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager -- Key: MAPREDUCE-4372 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Devaraj K Assignee: Devaraj K Fix For: 2.0.1-alpha, 3.0.0 Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out Please find the attached resource manager thread dump for the issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402262#comment-13402262 ]
Hudson commented on MAPREDUCE-4372:
---
Integrated in Hadoop-Hdfs-trunk-Commit #2462 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2462/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) (Revision 1354531)
Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
--
Key: MAPREDUCE-4372
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Fix For: 2.0.1-alpha, 3.0.0
Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out
Please find the attached resource manager thread dump for the issue.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402264#comment-13402264 ]
Hudson commented on MAPREDUCE-4372:
---
Integrated in Hadoop-Common-trunk-Commit #2393 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2393/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) (Revision 1354531)
Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
--
Key: MAPREDUCE-4372
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Fix For: 2.0.1-alpha, 3.0.0
Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out
Please find the attached resource manager thread dump for the issue.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG
[ https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402268#comment-13402268 ]
Robert Joseph Evans commented on MAPREDUCE-4371:
Just a few comments about the patch.
# The new file needs an Apache license comment at the top.
# It would be nice to have a comment in the test about what the test class is intended to cover.
# The test looks like it is passing, but without any exceptions ever being caught in the test. The run method catches all exceptions and then kills all of the jobs. This is because run is intended to potentially be called on its own thread. Please instead verify that all of the jobs are marked as failed at the end.
# Inside the patch itself it looks like there are a few places where the formatting is off. We use 2 spaces for indentation and try to wrap the lines at under 80 characters.
Other than that it looks good.
Also, a bit of process: when you upload a patch, please mark the box indicating that it is intended for inclusion in Apache, and then hit the submit patch button. This will trigger Jenkins to try and test the patch against trunk. I am going to hit submit patch for you, but the checkbox you have to tick yourself because it is your code and your copyright.
Check for cyclic dependencies in Jobcontrol job DAG
---
Key: MAPREDUCE-4371
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Affects Versions: 3.0.0
Reporter: madhukara phatak
Attachments: MAPREDUCE-4371.patch
In current implementation of JobControl, whenever there is a cyclic dependency between the jobs it throws a Stack overflow exception. This jira adds a cyclic check to jobcontrol.
-- This message is automatically generated by JIRA.
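The kind of cyclic check that MAPREDUCE-4371 asks for can be illustrated with a depth-first search over the dependency map. The sketch below is not the attached patch: JobDagCycleCheck and the shape of its input (job name mapped to the names of the jobs it depends on) are assumptions chosen to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative cycle check for a job dependency graph; not the MAPREDUCE-4371 patch. */
final class JobDagCycleCheck {
  private enum Mark { VISITING, DONE }

  /** deps maps each job name to the names of the jobs it depends on. */
  static boolean hasCycle(Map<String, List<String>> deps) {
    Map<String, Mark> marks = new HashMap<>();
    for (String job : deps.keySet()) {
      if (visit(job, deps, marks)) {
        return true;
      }
    }
    return false;
  }

  private static boolean visit(String job, Map<String, List<String>> deps,
                               Map<String, Mark> marks) {
    Mark mark = marks.get(job);
    if (mark == Mark.VISITING) {
      return true;  // reached a job that is still on the current path: cycle
    }
    if (mark == Mark.DONE) {
      return false; // already fully explored, known to be cycle-free
    }
    marks.put(job, Mark.VISITING);
    List<String> next = deps.get(job);
    if (next != null) {
      for (String dependency : next) {
        if (visit(dependency, deps, marks)) {
          return true;
        }
      }
    }
    marks.put(job, Mark.DONE);
    return false;
  }
}
```

Because every job is marked before its dependencies are explored, the recursion depth is bounded by the number of jobs, so a cyclic graph is reported instead of recursing until the stack overflows.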
[jira] [Updated] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG
[ https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4371: --- Target Version/s: 3.0.0 Status: Patch Available (was: Open) Check for cyclic dependencies in Jobcontrol job DAG --- Key: MAPREDUCE-4371 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 3.0.0 Reporter: madhukara phatak Attachments: MAPREDUCE-4371.patch In current implementation of JobControl, whenever there is a cyclic dependency between the jobs it throws a Stack overflow exception. This jira adds a cyclic check to jobcontrol. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue
[ https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402271#comment-13402271 ] Jason Lowe commented on MAPREDUCE-4360: --- This JIRA indicates that trunk is affected, but I believe this has already been addressed in trunk (and branch-2 and branch-0.23) by MAPREDUCE-3683. Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue - Key: MAPREDUCE-4360 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.1, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: MAPREDUCE-4360-22.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG
[ https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402282#comment-13402282 ]
Hadoop QA commented on MAPREDUCE-4371:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533455/MAPREDUCE-4371.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 javadoc. The javadoc tool appears to have generated 2 warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2521//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2521//console
This message is automatically generated.
Check for cyclic dependencies in Jobcontrol job DAG
---
Key: MAPREDUCE-4371
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Affects Versions: 3.0.0
Reporter: madhukara phatak
Attachments: MAPREDUCE-4371.patch
In current implementation of JobControl, whenever there is a cyclic dependency between the jobs it throws a Stack overflow exception. This jira adds a cyclic check to jobcontrol.
-- This message is automatically generated by JIRA.
[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart
[ https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402297#comment-13402297 ] Tsuyoshi OZAWA commented on MAPREDUCE-4326: --- Sharad, MAPREDUCE-2713 is now marked as dup of this ticket(MAPREDUCE-4326). Resurrect RM Restart - Key: MAPREDUCE-4326 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha Reporter: Arun C Murthy Assignee: Bikas Saha Attachments: MR-4343.1.patch We should resurrect 'RM Restart' which we disabled sometime during the RM refactor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4372) Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
[ https://issues.apache.org/jira/browse/MAPREDUCE-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402304#comment-13402304 ]
Hudson commented on MAPREDUCE-4372:
---
Integrated in Hadoop-Mapreduce-trunk-Commit #2412 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2412/])
MAPREDUCE-4372. Deadlock in Resource Manager (Devaraj K via bobby) (Revision 1354531)
Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354531
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
Deadlock in Resource Manager between SchedulerEventDispatcher.EventProcessor and Shutdown hook manager
--
Key: MAPREDUCE-4372
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4372
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, resourcemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Assignee: Devaraj K
Fix For: 2.0.1-alpha, 3.0.0
Attachments: MAPREDUCE-4372-1.patch, MAPREDUCE-4372.patch, rm-threaddump.out
Please find the attached resource manager thread dump for the issue.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG
[ https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] madhukara phatak updated MAPREDUCE-4371: Attachment: MAPREDUCE-4371-1.patch Updated the patch to fix test case and style issues. Check for cyclic dependencies in Jobcontrol job DAG --- Key: MAPREDUCE-4371 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Affects Versions: 3.0.0 Reporter: madhukara phatak Attachments: MAPREDUCE-4371-1.patch, MAPREDUCE-4371.patch In current implementation of JobControl, whenever there is a cyclic dependency between the jobs it throws a Stack overflow exception. This jira adds a cyclic check to jobcontrol. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out
[ https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402319#comment-13402319 ]
Kihwal Lee commented on MAPREDUCE-4376:
---
It used to be
job 1, SUCCEEDED, SUCCEEDED
job 2, KILLED, KILLED
job 3, FAILED, FAILED
Now it's getting
job 1, SUCCEEDED, SUCCEEDED
job 2, ERROR, ERROR
The test hangs after job 2.
TestClusterMRNotification times out
---
Key: MAPREDUCE-4376
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, test
Affects Versions: 2.0.1-alpha
Reporter: Jason Lowe
The TestClusterMRNotification test is often timing out. git bisect tests narrowed it down to MAPREDUCE-3921, as the test consistently passes before that change and times out most of the time after picking up that change.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4326) Resurrect RM Restart
[ https://issues.apache.org/jira/browse/MAPREDUCE-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402323#comment-13402323 ] Tsuyoshi OZAWA commented on MAPREDUCE-4326: --- Bikas, What's going on? I can help you if you have a difficulty related to a preliminary design sketch. Resurrect RM Restart - Key: MAPREDUCE-4326 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4326 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager Affects Versions: 2.0.0-alpha Reporter: Arun C Murthy Assignee: Bikas Saha Attachments: MR-4343.1.patch We should resurrect 'RM Restart' which we disabled sometime during the RM refactor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4371) Check for cyclic dependencies in Jobcontrol job DAG
[ https://issues.apache.org/jira/browse/MAPREDUCE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402328#comment-13402328 ]
Hadoop QA commented on MAPREDUCE-4371:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533666/MAPREDUCE-4371-1.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 javadoc. The javadoc tool appears to have generated 2 warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2522//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2522//console
This message is automatically generated.
Check for cyclic dependencies in Jobcontrol job DAG
---
Key: MAPREDUCE-4371
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4371
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Affects Versions: 3.0.0
Reporter: madhukara phatak
Attachments: MAPREDUCE-4371-1.patch, MAPREDUCE-4371.patch
In current implementation of JobControl, whenever there is a cyclic dependency between the jobs it throws a Stack overflow exception. This jira adds a cyclic check to jobcontrol.
-- This message is automatically generated by JIRA.
[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out
[ https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402359#comment-13402359 ]
Kihwal Lee commented on MAPREDUCE-4376:
---
Relevant log entries:
{noformat}
2012-06-27 08:48:55,331 INFO [IPC Server handler 0 on 57856] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill Job received from client job_1340812108963_0002
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job Transitioned from RUNNING to KILL_WAIT
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1340812108963_0002_m_00 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,332 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1340812108963_0002_m_01 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,333 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1340812108963_0002_r_00 Task Transitioned from SCHEDULED to KILL_WAIT
2012-06-27 08:48:55,334 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1340812108963_0002_m_00_0 TaskAttempt Transitioned from UNASSIGNED to KILLED
2012-06-27 08:48:55,334 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1340812108963_0002_m_01_0 TaskAttempt Transitioned from UNASSIGNED to KILLED
2012-06-27 08:48:55,335 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1340812108963_0002_r_00_0 TaskAttempt Transitioned from UNASSIGNED to KILLED
2012-06-27 08:48:55,335 INFO [Thread-45] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,338 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1340812108963_0002_m_00 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,338 INFO [Thread-45] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,338 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1340812108963_0002_m_01 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,338 INFO [Thread-45] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the event EventType: CONTAINER_DEALLOCATE
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1340812108963_0002_r_00 Task Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2012-06-27 08:48:55,339 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 2
2012-06-27 08:48:55,340 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 3
2012-06-27 08:48:55,341 ERROR [Thread-45] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error in handling event type CONTAINER_DEALLOCATE to the ContainreAllocator
java.lang.NullPointerException
  at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$AssignedRequests.get(RMContainerAllocator.java:1103)
  at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.handleEvent(RMContainerAllocator.java:339)
  at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$1.run(RMContainerAllocator.java:191)
2012-06-27 08:48:55,348 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job Transitioned from KILL_WAIT to KILLED
2012-06-27 08:48:55,348 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1340812108963_0002Job Transitioned from KILLED to ERROR
{noformat}
The code assumes that if the attempt ID is not found in scheduledRequests, it will be in assignedRequests. But in this case, it was still in UNASSIGNED.
TestClusterMRNotification times out
---
Key: MAPREDUCE-4376
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, test
Affects Versions: 2.0.1-alpha
Reporter: Jason Lowe
The TestClusterMRNotification test is often timing out. git bisect tests narrowed it down to MAPREDUCE-3921, as the test consistently passes before that change and times out most of the time after
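The assumption Kihwal Lee points out in the MAPREDUCE-4376 comment (an attempt missing from scheduledRequests must be in assignedRequests) can be made explicit with a small model. This is only a sketch of a defensive lookup, not the RMContainerAllocator code or the eventual fix: DeallocateSketch, its two maps, and the string results are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a deallocate handler that tolerates attempts known to neither map. */
final class DeallocateSketch {
  private final Map<String, String> scheduledRequests = new HashMap<>();
  private final Map<String, String> assignedRequests = new HashMap<>();

  /** Records a request that is waiting for a container. */
  void schedule(String attemptId) {
    scheduledRequests.put(attemptId, "pending");
  }

  /** Moves a request from scheduled to assigned once a container is granted. */
  void assign(String attemptId, String containerId) {
    scheduledRequests.remove(attemptId);
    assignedRequests.put(attemptId, containerId);
  }

  /** Releases whatever is held for the attempt and reports where it was found. */
  String deallocate(String attemptId) {
    if (scheduledRequests.remove(attemptId) != null) {
      return "scheduled";
    }
    String containerId = assignedRequests.remove(attemptId);
    if (containerId != null) {
      return "assigned:" + containerId;
    }
    // Neither map knows the attempt (e.g. it was killed while still unassigned):
    // report it instead of dereferencing a missing entry.
    return "unknown";
  }
}
```

In this model an attempt that was killed before it ever entered either map yields "unknown" rather than a NullPointerException, which is the case the log above runs into.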
[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402419#comment-13402419 ] Bikas Saha commented on MAPREDUCE-4322: --- 3. My main concern is that we are not differentiating that the first failure is due to a bad setup string while the second one is due to a bad cmd string. Since the code is adding the exact failed command into the exception we could look for setup in the first case and command in the second case in addition to sb.toString(). I should have been more clear. I didn't literally mean setup.toString() because it's a list :) Fix command-line length abort issues on Windows --- Key: MAPREDUCE-4322 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Environment: Windows, downstream applications with long aggregate classpaths Reporter: John Gordon Assignee: Ivan Mitic Attachments: MAPREDUCE-4322-branch-1-win(2).patch, MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, MAPREDUCE-4322-branch-1-win.patch Original Estimate: 12h Remaining Estimate: 12h When a task is started on the tasktracker, it creates a small batch file to invoke java and runs that batch. Within the batch file, the invocation of Java currently has -classpath ${CLASSPATH} inline to the command. That line often exceeds 8000 characters. This is ok for most Linux distributions because the line limit env variable is often set much higher than this. However, for Windows this causes cmd to abort execution. This surfaces in Hadoop as an unknown failure mode for the task. I think the easiest and most natural way to fix this is to push the -classpath option into a config file to take the longest variable part of the line and put it somewhere that scales better.
[jira] [Commented] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue
[ https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402426#comment-13402426 ] Mayank Bansal commented on MAPREDUCE-4360: -- Jason, I did not realize that it is already fixed in trunk; I will update the JIRA. Thanks for pointing this out. Konst, that's already done when tasks are assigned to any queue. Thanks, Mayank Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue - Key: MAPREDUCE-4360 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.1, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: MAPREDUCE-4360-22.patch
[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402444#comment-13402444 ] Ivan Mitic commented on MAPREDUCE-4322: --- 3. Oh, thanks for clarifying. My thinking was, from the user's perspective, we are outputting the actual command that exceeded the limit. Whether it is setup or command, it is not as relevant. In unit tests, since I know the code, I want to cover all cases, so I'm testing both. I am leaning toward keeping the code as is, given that I wouldn't want to have a hardcoded dependency on what is in the exception message. Let me know if you feel strongly about this. Fix command-line length abort issues on Windows --- Key: MAPREDUCE-4322 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Environment: Windows, downstream applications with long aggregate classpaths Reporter: John Gordon Assignee: Ivan Mitic Attachments: MAPREDUCE-4322-branch-1-win(2).patch, MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, MAPREDUCE-4322-branch-1-win.patch Original Estimate: 12h Remaining Estimate: 12h When a task is started on the tasktracker, it creates a small batch file to invoke java and runs that batch. Within the batch file, the invocation of Java currently has -classpath ${CLASSPATH} inline to the command. That line often exceeds 8000 characters. This is ok for most Linux distributions because the line limit env variable is often set much higher than this. However, for Windows this causes cmd to abort execution. This surfaces in Hadoop as an unknown failure mode for the task. I think the easiest and most natural way to fix this is to push the -classpath option into a config file to take the longest variable part of the line and put it somewhere that scales better.
[jira] [Commented] (MAPREDUCE-4355) Add startTime to RunningJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402450#comment-13402450 ] Alejandro Abdelnur commented on MAPREDUCE-4355: --- reverted from trunk and branch-2 Add startTime to RunningJob --- Key: MAPREDUCE-4355 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv1, mrv2 Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Fix For: 1.1.0, 2.0.1-alpha Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch To read the start-time of a particular job, one should not need to getAllJobs() and iterate through them. getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Hence, we need to either add getJobStatus(JobID) to the API or add startTime to RunningJob. Doing the latter.
[jira] [Commented] (MAPREDUCE-4377) TaskRunner javaopts parsing doesn't handle embedded spaces
[ https://issues.apache.org/jira/browse/MAPREDUCE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402452#comment-13402452 ] John Gordon commented on MAPREDUCE-4377: Thanks Robert! I agree it won't be an easy fix and may need some rearchitecture and significant test additions. TaskRunner javaopts parsing doesn't handle embedded spaces -- Key: MAPREDUCE-4377 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4377 Project: Hadoop Map/Reduce Issue Type: Bug Components: task-controller Affects Versions: trunk Environment: java options containing escaped or non-escaped embedded spaces. Reporter: John Gordon TaskRunner::GetVMArgs reads getChildJavaOpts as one space-delimited string, then splits it on ' ' and tries to reason on individual options from there. The problem with this approach is that java options may contain embedded spaces in many legitimate cases -- this means it is reasoning on incomplete option strings and cannot do appropriate preprocessing to do things like handle escape characters or matched quotation marks.
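The splitting problem described in MAPREDUCE-4377 can be illustrated with a small quote-aware tokenizer. This is only a sketch under assumed shell-like quoting rules (single quotes, double quotes, and backslash escapes); the class and method names are hypothetical, it is not code from TaskRunner, and it is not the fix discussed on the issue, which the reporter expects to need deeper rework.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: split a java-opts string without breaking quoted or escaped spaces. */
final class JavaOptsTokenizer {
  private JavaOptsTokenizer() {}

  static List<String> tokenize(String opts) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inToken = false; // true once the current token has started
    char quote = 0;          // 0 means "not inside quotes"

    for (int i = 0; i < opts.length(); i++) {
      char c = opts.charAt(i);
      if (quote != 0) {
        // Inside quotes: everything is literal until the matching quote.
        if (c == quote) {
          quote = 0;
        } else {
          current.append(c);
        }
      } else if (c == '"' || c == '\'') {
        quote = c;
        inToken = true;
      } else if (c == '\\' && i + 1 < opts.length()) {
        // Backslash escapes the next character (assumed rule).
        current.append(opts.charAt(++i));
        inToken = true;
      } else if (Character.isWhitespace(c)) {
        // Unquoted whitespace ends the current token, if any.
        if (inToken) {
          tokens.add(current.toString());
          current.setLength(0);
          inToken = false;
        }
      } else {
        current.append(c);
        inToken = true;
      }
    }
    if (inToken) {
      tokens.add(current.toString());
    }
    return tokens;
  }
}
```

With this sketch, an option such as -Dfoo="a b" stays one token, whereas a plain split on ' ' would cut it in two. Quoting conventions differ between platforms, so real handling would have to decide which rules to honor.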
[jira] [Comment Edited] (MAPREDUCE-4355) Add startTime to RunningJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402450#comment-13402450 ] Alejandro Abdelnur edited comment on MAPREDUCE-4355 at 6/27/12 6:45 PM: reverted from trunk, branch-2 and branch-1. was (Author: tucu00): reverted from trunk and branch-2 Add startTime to RunningJob --- Key: MAPREDUCE-4355 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv1, mrv2 Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Fix For: 1.1.0, 2.0.1-alpha Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch To read the start-time of a particular job, one should not need to getAllJobs() and iterate through them. getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Hence, we need to either add getJobStatus(JobID) to the API or add startTime to RunningJob. Doing the latter.
[jira] [Updated] (MAPREDUCE-4342) Distributed Cache gives inconsistent result if cache files get deleted from task tracker
[ https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated MAPREDUCE-4342: - Attachment: MAPREDUCE-4342-22-3.patch Hi Konst, Thanks for the comments, updated all the comments. Thanks, Mayank Distributed Cache gives inconsistent result if cache files get deleted from task tracker - Key: MAPREDUCE-4342 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0, 1.0.3, trunk Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22-2.patch, MAPREDUCE-4342-22-3.patch, MAPREDUCE-4342-22.patch
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402464#comment-13402464 ] Alejandro Abdelnur commented on MAPREDUCE-4346: --- Ahmed, LGTM. The only thing is that status is an int and you are using a set to do the filtering, which means that for each comparison an Integer will be created. Instead I'd just iterate over the received filter using a helper method *boolean filter(int filter[], int status)*. Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily.
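The helper suggested in the comment above might look roughly like the following. Only the signature boolean filter(int filter[], int status) comes from the comment; the enclosing class name and the treatment of a null filter are assumptions made for illustration, and this is not the code from any of the attached patches.

```java
/** Hypothetical holder class; only the method signature comes from the review comment. */
final class JobStatusFilter {
  private JobStatusFilter() {}

  /**
   * Returns true if status is one of the values in filter.
   * The comparison is on primitive ints, so no boxing to Integer is involved.
   */
  static boolean filter(int[] filter, int status) {
    if (filter == null) {
      return false; // assumption: a missing filter matches nothing
    }
    for (int accepted : filter) {
      if (accepted == status) {
        return true;
      }
    }
    return false;
  }
}
```

A linear scan is a reasonable trade-off here because the set of job states is small, so the filter array is expected to hold only a handful of entries.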
[jira] [Commented] (MAPREDUCE-4355) Add startTime to RunningJob
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402465#comment-13402465 ] Arun C Murthy commented on MAPREDUCE-4355: -- bq. Arun, it might be cleaner to add RunningJob.getJobStatus() instead of adding startTime, endTime fields to RunningJob and redundantly maintaining them. +1, good point! Add startTime to RunningJob --- Key: MAPREDUCE-4355 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv1, mrv2 Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Fix For: 1.1.0, 2.0.1-alpha Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch To read the start-time of a particular job, one should not need to getAllJobs() and iterate through them. getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Hence, we need to either add getJobStatus(JobID) to the API or add startTime to RunningJob. Doing the latter.
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402469#comment-13402469 ] Hudson commented on MAPREDUCE-4346: --- Integrated in Hadoop-Hdfs-trunk-Commit #2465 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2465/]) Reverting MAPREDUCE-4346 r1353757 (Revision 1354656) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354656 Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java
Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily.
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402477#comment-13402477 ] Arun C Murthy commented on MAPREDUCE-4346: -- bq. As I highlighted in the ticket description above: The JobClient only exposes a getAllJobs() which returns all submitted jobs in any state, the result also includes all retired jobs. This list is long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific states. Ahmed, I'm not convinced. Yes, it's a bit more overhead, but I don't see how adding a new public api is going to make a significant difference. IAC, if you set completed jobs to 0, you'll not get retired jobs. Unless I hear a more compelling argument I'm -1 on this. Also, please remember that this API is fairly hard to support with YARN, so that is another problem. Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily.
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402478#comment-13402478 ] Arun C Murthy commented on MAPREDUCE-4346: -- To be clear: we should refrain from adding public apis without a *compelling* use-case to MRv1, particularly when they are going to be hard to support in MRv2. Thanks. Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily.
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402479#comment-13402479 ] Hudson commented on MAPREDUCE-4346: --- Integrated in Hadoop-Common-trunk-Commit #2396 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2396/]) Reverting MAPREDUCE-4346 r1353757 (Revision 1354656) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354656 Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java
Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient -- Key: MAPREDUCE-4346 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily.
[jira] [Updated] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue
[ https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated MAPREDUCE-4360: - Affects Version/s: (was: trunk) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue - Key: MAPREDUCE-4360 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.1 Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: MAPREDUCE-4360-22.patch
[jira] [Updated] (MAPREDUCE-4360) Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue
[ https://issues.apache.org/jira/browse/MAPREDUCE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated MAPREDUCE-4360: - Attachment: MAPREDUCE-4360-22-1.patch Thanks Konst for your comments. Updated the patch to fix the formatting issues. Thanks, Mayank Capacity Scheduler Hierarchical leaf queue does not honur the max capacity of container queue - Key: MAPREDUCE-4360 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4360 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.1 Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: MAPREDUCE-4360-22-1.patch, MAPREDUCE-4360-22.patch
[jira] [Assigned] (MAPREDUCE-4376) TestClusterMRNotification times out
[ https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee reassigned MAPREDUCE-4376: - Assignee: Kihwal Lee TestClusterMRNotification times out --- Key: MAPREDUCE-4376 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 2.0.1-alpha Reporter: Jason Lowe Assignee: Kihwal Lee The TestClusterMRNotification test is often timing out. git bisect tests narrowed it down to MAPREDUCE-3921, as the test consistently passes before that change and times out most of the time after picking up that change.
[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402489#comment-13402489 ] Bikas Saha commented on MAPREDUCE-4322: --- That's exactly what I am saying too :) The test is trying to cover both cases, but the result is kind of implicit right now because we know both paths are being covered. However, in the test itself by checking for only sb.toString() we are not making that explicit. There is nothing to hardcode. Unless I am reading the test code incorrectly, we have already defined List<String> setup and List<String> cmd. In the exception message, along with checking for sb.toString(), we could also check for setup[0] and cmd[0]. That way it's explicit that 2 different paths are being covered. Fix command-line length abort issues on Windows --- Key: MAPREDUCE-4322 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Environment: Windows, downstream applications with long aggregate classpaths Reporter: John Gordon Assignee: Ivan Mitic Attachments: MAPREDUCE-4322-branch-1-win(2).patch, MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, MAPREDUCE-4322-branch-1-win.patch Original Estimate: 12h Remaining Estimate: 12h When a task is started on the tasktracker, it creates a small batch file to invoke java and runs that batch. Within the batch file, the invocation of Java currently has -classpath ${CLASSPATH} inline to the command. That line often exceeds 8000 characters. This is ok for most Linux distributions because the line limit env variable is often set much higher than this. However, for Windows this causes cmd to abort execution. This surfaces in Hadoop as an unknown failure mode for the task. I think the easiest and most natural way to fix this is to push the -classpath option into a config file to take the longest variable part of the line and put it somewhere that scales better.
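The testing point in this exchange can be sketched as follows: if the length check puts the offending line into the exception message, a test can assert on that line and so show which path (setup or command) failed. Everything here is hypothetical and is not taken from the attached patches. The class, the method names, and the 8191-character limit (the documented cmd.exe maximum) are assumptions made for illustration.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch of a per-line length check whose error names the offending line. */
final class CommandLengthCheckSketch {
  private CommandLengthCheckSketch() {}

  // Assumed limit; the actual patch may use a different constant.
  static final int MAX_LINE_LENGTH = 8191;

  /** Throws if any line is too long; the message carries the failing line itself. */
  static void checkLength(List<String> lines) throws IOException {
    for (String line : lines) {
      if (line.length() > MAX_LINE_LENGTH) {
        throw new IOException("Command line too long (" + line.length()
            + " > " + MAX_LINE_LENGTH + "): " + line);
      }
    }
  }

  /** Test convenience: returns the failure message, or null if every line fits. */
  static String failureMessage(List<String> lines) {
    try {
      checkLength(lines);
      return null;
    } catch (IOException e) {
      return e.getMessage();
    }
  }

  /** Test convenience: builds a line of exactly length n that starts with prefix. */
  static String pad(String prefix, int n) {
    StringBuilder sb = new StringBuilder(prefix);
    while (sb.length() < n) {
      sb.append('x');
    }
    return sb.toString();
  }
}
```

A test can then feed an over-long setup line and an over-long command line separately and check that each failure message contains the corresponding line, which makes the two paths explicit without hardcoding the rest of the message text.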
[jira] [Commented] (MAPREDUCE-4376) TestClusterMRNotification times out
[ https://issues.apache.org/jira/browse/MAPREDUCE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402499#comment-13402499 ] Kihwal Lee commented on MAPREDUCE-4376: --- There is a check for null to handle transitions from UNASSIGNED state, but the check doesn't work anymore because assignedRequest.get() throws NPE after the following change from MAPREDUCE-3921.
{noformat}
 ContainerId get(TaskAttemptId tId) {
   if (tId.getTaskId().getTaskType().equals(TaskType.MAP)) {
-    return maps.get(tId);
+    return maps.get(tId).getId();
   } else {
-    return reduces.get(tId);
+    return reduces.get(tId).getId();
   }
 }
{noformat}
Jason has also suggested we put a time limit in these jobs so that they don't hang even if something goes wrong. TestClusterMRNotification times out --- Key: MAPREDUCE-4376 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4376 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, test Affects Versions: 2.0.1-alpha Reporter: Jason Lowe Assignee: Kihwal Lee The TestClusterMRNotification test is often timing out. git bisect tests narrowed it down to MAPREDUCE-3921, as the test consistently passes before that change and times out most of the time after picking up that change.
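The failure mode described above is a chained call on a map lookup that can return null. One way to restore the old "null means not assigned" contract is to check the lookup result before dereferencing it. The sketch below uses simplified stand-in types (plain strings for attempt and container IDs, a hypothetical Container class) instead of the real Hadoop classes, and it is not the fix that was committed for this issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the allocator's assigned-requests bookkeeping. */
final class AssignedRequestsSketch {

  /** Minimal stand-in for a container that exposes its ID. */
  static final class Container {
    private final String id;

    Container(String id) {
      this.id = id;
    }

    String getId() {
      return id;
    }
  }

  private final Map<String, Container> maps = new HashMap<>();
  private final Map<String, Container> reduces = new HashMap<>();

  void add(String attemptId, boolean isMap, Container container) {
    (isMap ? maps : reduces).put(attemptId, container);
  }

  /**
   * Returns the container ID for the attempt, or null if the attempt was never
   * assigned (for example, it was still UNASSIGNED when it was killed).
   */
  String get(String attemptId, boolean isMap) {
    Container container = isMap ? maps.get(attemptId) : reduces.get(attemptId);
    return container == null ? null : container.getId();
  }
}
```

With the null check in place, a caller's existing "if the result is null" branch can run again instead of being pre-empted by a NullPointerException inside get().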
[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402503#comment-13402503 ] Tom White commented on MAPREDUCE-3837: -- Mayank - thanks for the changes. Here's my feedback:
* If there is no need for restart count anymore - since jobs are re-run from the beginning each time - then would it be cleaner to remove it entirely?
* In JobTracker you changed shouldRecover = false; to shouldRecover = true; without updating the comment on the line before. (This might be related to the previous point about not having restart counts.)
* Remove the @Ignore annotation from TestRecoveryManager and the comment about MAPREDUCE-873.
* The new test testJobresubmission (should be testJobResubmission) should test that the job succeeded after the restart. Also, there's no reason to run it as a high-priority job.
* There's a comment saying it is a faulty job - which it isn't.
* Have setUp and tearDown methods to start and stop the cluster. At the moment there is code duplication, and clusters won't be shut down cleanly on failure.
* testJobTracker would be better named testJobTrackerRestartsWithMissingJobFile
* testRecoveryManager would be better named testJobTrackerRestartWithBadJobs
* There are multiple typos and formatting errors (including indentation, which should be 2 spaces) in the new code. See Konstantin's comment above.
* TestJobTrackerRestartWithLostTracker still fails, as does TestJobTrackerSafeMode. These should be fixed as a part of this work.
Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job. Key: MAPREDUCE-3837 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Mayank Bansal Assignee: Mayank Bansal Fix For: 0.24.0, 0.22.1, 0.23.2 Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837-3.patch, PATCH-HADOOP-1-MAPREDUCE-3837.patch, PATCH-MAPREDUCE-3837.patch, PATCH-TRUNK-MAPREDUCE-3837.patch If the job tracker crashes while jobs are running and the job tracker's property mapreduce.jobtracker.restart.recover is true, it should recover the jobs. However, the current behavior is as follows: the jobtracker tries to restore the jobs but cannot. After that, the jobtracker closes its handle to HDFS and nobody else can submit jobs. Thanks, Mayank
[jira] [Updated] (MAPREDUCE-4322) Fix command-line length abort issues on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Mitic updated MAPREDUCE-4322:

Attachment: MAPREDUCE-4322-branch-1-win(5).patch

Attaching updated patch. Adding explicit checks that the correct exception string is returned. Also removing some of the "if WINDOWS" forks in the test code.

Fix command-line length abort issues on Windows

Key: MAPREDUCE-4322
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Environment: Windows, downstream applications with long aggregate classpaths
Reporter: John Gordon
Assignee: Ivan Mitic
Attachments: MAPREDUCE-4322-branch-1-win(2).patch, MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, MAPREDUCE-4322-branch-1-win(5).patch, MAPREDUCE-4322-branch-1-win.patch
Original Estimate: 12h
Remaining Estimate: 12h

When a task is started on the tasktracker, it creates a small batch file to invoke java and runs that batch. Within the batch file, the invocation of Java currently has -classpath ${CLASSPATH} inline to the command. That line often exceeds 8000 characters. This is ok for most Linux distributions because the line limit env variable is often set much higher than this. However, on Windows this causes cmd to abort execution. This surfaces in Hadoop as an unknown failure mode for the task. I think the easiest and most natural way to fix this is to push the -classpath option into a config file to take the longest variable part of the line and put it somewhere that scales better.
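To illustrate the idea of moving the long classpath off the command line, here is a minimal sketch of one general JVM-level technique: an otherwise empty jar whose manifest carries the entries in its Class-Path attribute, so the launch line only needs to name that one jar. This is an assumption for illustration and not necessarily what the attached patches do; the class and method names (ClasspathJarSketch, writeClasspathJar, readClassPath) are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ClasspathJarSketch {

  /**
   * Writes a jar that contains only a manifest. The manifest's Class-Path
   * attribute lists the given entries as space-separated file: URLs.
   * Pass null as dir to use the default temporary directory.
   */
  static File writeClasspathJar(List<File> entries, File dir) throws IOException {
    StringBuilder classPath = new StringBuilder();
    for (File entry : entries) {
      if (classPath.length() > 0) {
        classPath.append(' ');
      }
      // toURI() escapes characters such as spaces that Class-Path cannot hold raw.
      classPath.append(entry.toURI().toString());
    }

    Manifest manifest = new Manifest();
    Attributes main = manifest.getMainAttributes();
    // Manifest-Version must be set, otherwise the attributes are not written.
    main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    main.put(Attributes.Name.CLASS_PATH, classPath.toString());

    File jar = File.createTempFile("classpath-", ".jar", dir);
    JarOutputStream out = new JarOutputStream(new FileOutputStream(jar), manifest);
    out.close(); // the manifest is the only entry
    return jar;
  }

  /** Reads the Class-Path attribute back, mainly to check the round trip. */
  static String readClassPath(File jar) throws IOException {
    JarFile jarFile = new JarFile(jar);
    try {
      return jarFile.getManifest().getMainAttributes()
          .getValue(Attributes.Name.CLASS_PATH);
    } finally {
      jarFile.close();
    }
  }
}
```

With something like this, the batch file would only need -classpath followed by the single generated jar path, which stays short no matter how many entries the manifest lists.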
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402516#comment-13402516 ]

Hudson commented on MAPREDUCE-4346:

Integrated in Hadoop-Mapreduce-trunk-Commit #2415 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2415/])
Reverting MAPREDUCE-4346 r1353757 (Revision 1354656)

Result = FAILURE
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1354656
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Cluster.java

Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

Key: MAPREDUCE-4346
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch

The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily.
[jira] [Commented] (MAPREDUCE-4346) Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient
[ https://issues.apache.org/jira/browse/MAPREDUCE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402522#comment-13402522 ]

Alejandro Abdelnur commented on MAPREDUCE-4346:

@Arun, I'm working with Ahmed on this one. The use case we have is large clusters running 1000+ concurrent jobs, where monitoring agents query the cluster for jobs in different statuses; most of the time these agents focus on running or just-finished jobs. With the current API we are forced to query ALL jobs, including retired jobs (which significantly increases the number of jobs being returned), and do the filtering on the client side. This creates unnecessary load on the JT (serializing all jobs) and on the client (deserializing all jobs). Thus adding this new API, which does not break backwards compatibility, will definitely help reduce this load.

Regarding the support in MRv2, we currently have the getAllJobs() method there as well; we can address it on the client side for sure (the fallback implementation Ahmed did in the client for MRv1). We could add a PB call to support the filtering on the RM side. While looking at the MRv2 code I've noticed we are only querying the RM, which means that completed jobs will never be returned by this call. If I'm correct here, a solution would be for the client to call the HS to ask for jobs younger than X; this would be the equivalent of 'retired' jobs, and the filtering would definitely be useful there as well, for the same reasons explained above.

Adding a refined version of JobTracker.getAllJobs() and exposing through the JobClient

Key: MAPREDUCE-4346
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4346
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: mrv1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
Attachments: MAPREDUCE-4346.patch, MAPREDUCE-4346_rev2.patch, MAPREDUCE-4346_rev3.patch, MAPREDUCE-4346_rev4.patch

The current implementation for JobTracker.getAllJobs() returns all submitted jobs in any state, in addition to retired jobs. This list can be long and represents an unneeded overhead especially in the case of clients only interested in jobs in specific state(s). It is beneficial to include a refined version where only jobs having specific statuses are returned and retired jobs are optional to include. I'll be uploading an initial patch momentarily.
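The filtering discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: the State enum, the JobInfo class, and the filterJobs method are hypothetical stand-ins. It follows the semantics described earlier in the thread, where the requested statuses are held in a HashSet and the retired flag does not override the status filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JobFilterSketch {

  /** Hypothetical job states, for illustration only. */
  enum State { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

  /** Hypothetical stand-in for a job status record. */
  static final class JobInfo {
    final String id;
    final State state;
    final boolean retired;

    JobInfo(String id, State state, boolean retired) {
      this.id = id;
      this.state = state;
      this.retired = retired;
    }
  }

  /**
   * Returns the jobs whose state is one of the wanted states. Retired jobs
   * are considered only when includeRetired is true, and even then they
   * must still match the status filter.
   */
  static List<JobInfo> filterJobs(List<JobInfo> all, State[] wanted,
      boolean includeRetired) {
    // A set gives constant-time membership checks instead of an inner loop.
    Set<State> wantedStates = new HashSet<State>(Arrays.asList(wanted));
    List<JobInfo> result = new ArrayList<JobInfo>();
    for (JobInfo job : all) {
      if (job.retired && !includeRetired) {
        continue;
      }
      if (wantedStates.contains(job.state)) {
        result.add(job);
      }
    }
    return result;
  }
}
```

Doing this on the server side means only the matching records are serialized and sent, which is the load reduction argued for in the comment above.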
[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4355:

Description: Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.

was: To read the start-time of a particular job, one should not need to getAllJobs() and iterate through them. getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Hence, we need to either add getJobStatus(JobID) to the API or add startTime to RunningJob. Doing the latter.

Summary: Add RunningJob.getJobStatus() (was: Add startTime to RunningJob)

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4355:

Attachment: MR-4355_mr1.patch

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Commented] (MAPREDUCE-4322) Fix command-line length abort issues on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402576#comment-13402576 ]

Bikas Saha commented on MAPREDUCE-4322:

Thanks for including all comments! +1. lgtm.

Fix command-line length abort issues on Windows

Key: MAPREDUCE-4322
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tasktracker
Environment: Windows, downstream applications with long aggregate classpaths
Reporter: John Gordon
Assignee: Ivan Mitic
Attachments: MAPREDUCE-4322-branch-1-win(2).patch, MAPREDUCE-4322-branch-1-win(3).patch, MAPREDUCE-4322-branch-1-win(4).patch, MAPREDUCE-4322-branch-1-win(5).patch, MAPREDUCE-4322-branch-1-win.patch
Original Estimate: 12h
Remaining Estimate: 12h

When a task is started on the tasktracker, it creates a small batch file to invoke java and runs that batch. Within the batch file, the invocation of Java currently has -classpath ${CLASSPATH} inline to the command. That line often exceeds 8000 characters. This is ok for most Linux distributions because the line limit env variable is often set much higher than this. However, on Windows this causes cmd to abort execution. This surfaces in Hadoop as an unknown failure mode for the task. I think the easiest and most natural way to fix this is to push the -classpath option into a config file to take the longest variable part of the line and put it somewhere that scales better.
[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4355:

Attachment: (was: MR-4355_mr1.patch)

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4355:

Attachment: MR-4355_mr2.patch

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4355:

Attachment: (was: MR-4355_mr2.patch)

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4355:

Status: Patch Available (was: Reopened)

Submitting the MR1 and MR2 patches.
- No tests for MR2 - just added a wrapper call to Job.getStatus()

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 2.0.0-alpha, 1.0.3
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive
[ https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402607#comment-13402607 ]

Alejandro Abdelnur commented on MAPREDUCE-4317:

Karthik, why 'job == null'?
{code}
+if (!jt.areACLsEnabled() || job == null) {
+  return myJob;
+}
{code}
If job == null then myJob is also null (or the call may even fail). Shouldn't we check for job == null before trying to obtain myJob?

Job view ACL checks are too permissive

Key: MAPREDUCE-4317
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
Attachments: MR-4317.patch, MR-4317.patch

The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has the following internal member:
{code}private boolean isViewAllowed = true;{code}
Note that it's true. Now, the method that sets proper view-allowed rights has:
{code}
if (user != null && job != null && jt.areACLsEnabled()) {
  final UserGroupInformation ugi =
      UserGroupInformation.createRemoteUser(user);
  try {
    ugi.doAs(new PrivilegedExceptionAction<Void>() {
      public Void run() throws IOException, ServletException {
        // checks job view permission
        jt.getACLsManager().checkAccess(job, ugi,
            Operation.VIEW_JOB_DETAILS);
        return null;
      }
    });
  } catch (AccessControlException e) {
    String errMsg = "User " + ugi.getShortUserName() +
        " failed to view " + jobid + "!<br><br>" + e.getMessage() +
        "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
    JSPUtil.setErrorAndForward(errMsg, request, response);
    myJob.setViewAccess(false);
  } catch (InterruptedException e) {
    String errMsg = "Interrupted while trying to access " + jobid +
        "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
    JSPUtil.setErrorAndForward(errMsg, request, response);
    myJob.setViewAccess(false);
  }
}
return myJob;
{code}
In the above snippet, you can notice that user == null, which can happen if the user is not http-authenticated (as it's obtained via request.getRemoteUser()), can lead to the view being visible, since the default is true and we didn't toggle the view to false for the user == null case. Ideally the default of the view job ACL must be false, or we need an else clause that sets the view rights to false in case of a failure to find the user ID.
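The "default must be false" suggestion in the description above can be sketched as follows. This is a simplified illustration and not the JSPUtil code or the attached patch: AclChecker, ViewDecision, and check are hypothetical names, and the real ACL lookup is replaced by a plain interface.

```java
public class ViewAccessSketch {

  /** Hypothetical stand-in for the ACL lookup; not the Hadoop ACLsManager API. */
  interface AclChecker {
    boolean mayView(String user, String jobId);
  }

  /** Result holder whose flag starts out as "denied". */
  static final class ViewDecision {
    private boolean viewAllowed = false; // deny unless explicitly granted

    boolean isViewAllowed() {
      return viewAllowed;
    }

    void setViewAllowed(boolean allowed) {
      this.viewAllowed = allowed;
    }
  }

  /**
   * Grants view access when ACLs are disabled, or when ACLs are enabled and
   * a known user passes the check. A null user or a null job id leaves the
   * flag at its default of false.
   */
  static ViewDecision check(String user, String jobId, boolean aclsEnabled,
      AclChecker acls) {
    ViewDecision decision = new ViewDecision();
    if (!aclsEnabled) {
      decision.setViewAllowed(true);
    } else if (user != null && jobId != null) {
      decision.setViewAllowed(acls.mayView(user, jobId));
    }
    return decision;
  }
}
```

With a default of false, forgetting a branch (such as the unauthenticated user case) fails closed instead of exposing the job page.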
[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402622#comment-13402622 ]

Alejandro Abdelnur commented on MAPREDUCE-4355:

The mr1 patch has a few false changes in the test class; please revert those. Please add a simple testcase for the mr2 case. Also, in the mr1 patch you are using 'updateStatus()' to update the job status before returning the object, while the method above uses 'ensureFreshStatus()'. Why the difference?

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402623#comment-13402623 ]

Hadoop QA commented on MAPREDUCE-4355:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12533712/MR-4355_mr2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2523//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2523//console

This message is automatically generated.

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Resolved] (MAPREDUCE-4373) Fix Javadoc warnings in JobClient.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla resolved MAPREDUCE-4373.

Resolution: Won't Fix
Release Note: The changes from MAPREDUCE-4355 have been reverted, and it doesn't suffer from the warnings anymore.

Fix Javadoc warnings in JobClient.

Key: MAPREDUCE-4373
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4373
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Karthik Kambatla

It looks like MAPREDUCE-4355 added two new javadoc warnings.
{code}
[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java:651: warning - @param argument jobid is not a parameter name.
[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java:669: warning - @param argument jobid is not a parameter name.
{code}
[jira] [Updated] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-4355:

Attachment: MR-4355_mr1.patch

Updated patch for MR1. ensureFreshStatus() calls updateStatus() only after a particular amount of time has passed since the previous updateStatus(). For getJobStatus(), to get the latest status, we need to call updateStatus(). Do you suggest calling ensureFreshStatus() instead for consistency?

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
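The distinction drawn in this comment, a time-throttled refresh versus an unconditional one, can be sketched like this. It is a simplified model under stated assumptions, not the JobClient code: StatusCacheSketch, its fetcher, the injectable clock, and getCachedStatus are hypothetical, and only the method names updateStatus(), ensureFreshStatus(), and getJobStatus() echo the discussion.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class StatusCacheSketch {
  private final Supplier<String> fetcher; // hypothetical remote status lookup
  private final LongSupplier clock;       // injectable clock, in milliseconds
  private final long maxAgeMillis;        // how long a cached status counts as fresh

  private String status;
  private long lastFetched;
  private boolean fetchedOnce;

  StatusCacheSketch(Supplier<String> fetcher, LongSupplier clock, long maxAgeMillis) {
    this.fetcher = fetcher;
    this.clock = clock;
    this.maxAgeMillis = maxAgeMillis;
  }

  /** Always goes back to the source for a new status. */
  synchronized void updateStatus() {
    status = fetcher.get();
    lastFetched = clock.getAsLong();
    fetchedOnce = true;
  }

  /** Refreshes only when nothing is cached or the cached value is too old. */
  synchronized void ensureFreshStatus() {
    if (!fetchedOnce || clock.getAsLong() - lastFetched > maxAgeMillis) {
      updateStatus();
    }
  }

  /** Returns the latest status, forcing a refresh on every call. */
  synchronized String getJobStatus() {
    updateStatus();
    return status;
  }

  /** Returns a status that may be up to maxAgeMillis old. */
  synchronized String getCachedStatus() {
    ensureFreshStatus();
    return status;
  }
}
```

The trade-off is the usual one: the forced refresh costs a round trip per call but never returns stale data, while the throttled variant bounds the call rate at the price of bounded staleness.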
[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402640#comment-13402640 ]

Hadoop QA commented on MAPREDUCE-4355:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12533723/MR-4355_mr1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2524//console

This message is automatically generated.

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Commented] (MAPREDUCE-4317) Job view ACL checks are too permissive
[ https://issues.apache.org/jira/browse/MAPREDUCE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402641#comment-13402641 ]

Karthik Kambatla commented on MAPREDUCE-4317:

Alejandro, the API (Javadoc below) mentions that the job will be null if there doesn't exist a job with that JobID. The old API has the same behavior.
{code}
/**
 * Validates if current user can view the job.
 * If user is not authorized to view the job, this method will modify the
 * response and forwards to an error page and returns Job with
 * viewJobAccess flag set to false.
 * @return JobWithViewAccessCheck object(contains JobInProgress object and
 *         viewJobAccess flag). Callers of this method will check the flag
 *         and decide if view should be allowed or not. Job will be null if
 *         the job with given jobid doesnot exist at the JobTracker.
 */
{code}

Job view ACL checks are too permissive

Key: MAPREDUCE-4317
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4317
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: 1.0.3
Reporter: Harsh J
Assignee: Karthik Kambatla
Attachments: MR-4317.patch, MR-4317.patch

The class that does view-based checks, JSPUtil.JobWithViewAccessCheck, has the following internal member:
{code}private boolean isViewAllowed = true;{code}
Note that it's true. Now, the method that sets proper view-allowed rights has:
{code}
if (user != null && job != null && jt.areACLsEnabled()) {
  final UserGroupInformation ugi =
      UserGroupInformation.createRemoteUser(user);
  try {
    ugi.doAs(new PrivilegedExceptionAction<Void>() {
      public Void run() throws IOException, ServletException {
        // checks job view permission
        jt.getACLsManager().checkAccess(job, ugi,
            Operation.VIEW_JOB_DETAILS);
        return null;
      }
    });
  } catch (AccessControlException e) {
    String errMsg = "User " + ugi.getShortUserName() +
        " failed to view " + jobid + "!<br><br>" + e.getMessage() +
        "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
    JSPUtil.setErrorAndForward(errMsg, request, response);
    myJob.setViewAccess(false);
  } catch (InterruptedException e) {
    String errMsg = "Interrupted while trying to access " + jobid +
        "<hr><a href=\"jobtracker.jsp\">Go back to JobTracker</a><br>";
    JSPUtil.setErrorAndForward(errMsg, request, response);
    myJob.setViewAccess(false);
  }
}
return myJob;
{code}
In the above snippet, you can notice that user == null, which can happen if the user is not http-authenticated (as it's obtained via request.getRemoteUser()), can lead to the view being visible, since the default is true and we didn't toggle the view to false for the user == null case. Ideally the default of the view job ACL must be false, or we need an else clause that sets the view rights to false in case of a failure to find the user ID.
[jira] [Commented] (MAPREDUCE-4355) Add RunningJob.getJobStatus()
[ https://issues.apache.org/jira/browse/MAPREDUCE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402691#comment-13402691 ]

Alejandro Abdelnur commented on MAPREDUCE-4355:

Regarding changing updateStatus() to ensureFreshStatus(): no, I think updateStatus() is more appropriate.

Add RunningJob.getJobStatus()

Key: MAPREDUCE-4355
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4355
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: mrv1, mrv2
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Fix For: 1.1.0, 2.0.1-alpha
Attachments: MR-4355_mr1.patch, MR-4355_mr1.patch, MR-4355_mr2.patch

Use case: Read the start/end time of a particular job. Currently, one has to call JobClient.getAllJobStatuses() and iterate through the results. JobClient.getJob(JobID) returns RunningJob, which doesn't hold the job's start time. Adding RunningJob.getJobStatus() solves the issue.
[jira] [Resolved] (MAPREDUCE-4365) Shipping Profiler Libraries by DistributedCache
[ https://issues.apache.org/jira/browse/MAPREDUCE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Li resolved MAPREDUCE-4365.
-------------------------------

Resolution: Fixed
Target Version/s: (was: 1.1.0)

One way is to include the profiler library in the job jar and use a relative path like ../../foo.library to locate it. Thanks Devaraj, Sid, Vinod and everyone!

Shipping Profiler Libraries by DistributedCache
-----------------------------------------------

Key: MAPREDUCE-4365
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4365
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 1.0.3
Reporter: Jie Li

Hadoop profiling is great for performance tuning and debugging, but currently we can only use Java built-in profilers such as HProf. Other profilers have to be installed on all slave nodes first, which is inconvenient for large clusters and sometimes impossible for production clusters. Support for shipping profiler libraries using DistributedCache will solve this problem. For example, in mapred.task.profile.params, we specify a profiler library from the DistributedCache using special placeholders such as foo.jar, and Hadoop can look at the DistributedCache to replace foo.jar with the localized path before launching the child JVM.
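The placeholder substitution proposed in the description could be sketched as below. This is only an illustration under stated assumptions: the issue does not specify the placeholder syntax, so the sketch treats a bare file name such as foo.jar as the placeholder, and ProfilerParamsSketch, resolve and the example localized path are hypothetical names, not part of Hadoop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: swap cache file names in profiler params for local paths. */
final class ProfilerParamsSketch {

    /**
     * Replaces every occurrence of each placeholder (assumed here to be the
     * bare file name) with the path the file was localized to.
     * Naive by design: a placeholder that also occurs inside an earlier
     * replacement path would be substituted again.
     */
    static String resolve(String params, Map<String, String> localizedPaths) {
        String result = params;
        for (Map.Entry<String, String> e : localizedPaths.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> localized = new LinkedHashMap<>();
        localized.put("foo.jar", "/cache/1/foo.jar");  // made-up localized path
        System.out.println(resolve("-javaagent:foo.jar=file=%s", localized));
    }
}
```

A real implementation would likely need an unambiguous placeholder marker so that ordinary parameter text is never mistaken for a cache file name.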
[jira] [Commented] (MAPREDUCE-4374) Fix child task environment variable config and add support for Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402782#comment-13402782 ]

Ivan Mitic commented on MAPREDUCE-4374:
---------------------------------------

+1, change looks good to me. Agree on your points for using '%' and ';' on Windows.

Fix child task environment variable config and add support for Windows
----------------------------------------------------------------------

Key: MAPREDUCE-4374
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4374
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1-win
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
Attachments: MAPREDUCE-4374-branch-1-win.patch

In HADOOP-2838, a new feature was introduced to set environment variables via the Hadoop config 'mapred.child.env' for child tasks. There have been some further fixes and improvements around this feature, e.g. HADOOP-5981 was a bug fix; MAPREDUCE-478 broke the config into 'mapred.map.child.env' and 'mapred.reduce.child.env'. However, the current implementation is still not complete. It does not match its documentation or, I believe, its original intent. Also, by using ‘:’ (colon) and ‘;’ (semicolon) in the configuration syntax, we will have problems using them on Windows because ‘:’ appears very often in Windows paths, as in “C:\”, and environment variables are used very often to hold path names. This Jira is created to fix the problem and provide support on Windows.
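To make the '%' and ';' point concrete, here is a small sketch of platform-dependent expansion of a child environment setting. It is not the patch attached to this issue: ChildEnvSketch and parse are hypothetical names, and the reference syntax ($VAR on Unix-like systems, %VAR% on Windows) and the comma-separated NAME=value layout are assumptions made for the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of platform-aware expansion of NAME=value env settings. */
final class ChildEnvSketch {
    // Assumed reference syntax: $VAR on Unix-like systems, %VAR% on Windows.
    private static final Pattern UNIX_REF = Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");
    private static final Pattern WINDOWS_REF = Pattern.compile("%([A-Za-z_][A-Za-z0-9_]*)%");

    /** Parses "A=foo,B=..." and expands references against the inherited environment. */
    static Map<String, String> parse(String spec, Map<String, String> inherited, boolean windows) {
        Pattern ref = windows ? WINDOWS_REF : UNIX_REF;
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : spec.split(",")) {
            int eq = entry.indexOf('=');
            if (eq <= 0) {
                continue;  // skip empty or malformed entries
            }
            String name = entry.substring(0, eq).trim();
            String value = entry.substring(eq + 1).trim();
            Matcher m = ref.matcher(value);
            StringBuilder expanded = new StringBuilder();  // Matcher overload needs Java 9+
            while (m.find()) {
                // Unknown variables expand to the empty string in this sketch.
                String replacement = inherited.getOrDefault(m.group(1), "");
                m.appendReplacement(expanded, Matcher.quoteReplacement(replacement));
            }
            m.appendTail(expanded);
            result.put(name, expanded.toString());
        }
        return result;
    }
}
```

With windows set to true, a value such as PATH=%PATH%;C:\tools keeps its drive-letter colon intact because ':' plays no role in the syntax; with windows set to false, LD=$LD:/opt/lib expands the same way using '$' and ':'.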
[jira] [Commented] (MAPREDUCE-4369) Fix streaming job failures with WindowsResourceCalculatorPlugin
[ https://issues.apache.org/jira/browse/MAPREDUCE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402792#comment-13402792 ]

Ivan Mitic commented on MAPREDUCE-4369:
---------------------------------------

Thanks for the change Bikas! A few questions/suggestions:
1. In {{WindowsResourceCalculatorPlugin#getProcResourceValues()}} you mention that some tests use JVM_PID. Do you happen to have a list of these tests?
2. Can you please refactor {{ResourceCalculatorPlugin#getResourceCalculatorPlugin()}} to accept processPid, and update the call sites to pass the appropriate value (I see only 3 call sites)? The cause of this bug in the first place is that not all call sites set processPid. Then, if the passed-in processPid is null, you can fall back to {{System.getenv().get(JVM_PID)}}. Make sense? If I'm seeing things correctly, this way you might be able to clean up some of the newly introduced code.

Fix streaming job failures with WindowsResourceCalculatorPlugin
---------------------------------------------------------------

Key: MAPREDUCE-4369
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4369
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
Attachments: MAPREDUCE-4369.branch-1-win.1.patch

Some streaming jobs use local mode job runs that do not start task trackers. In these cases, the JVM context is not set up, and hence local mode execution causes the code to crash. The fix is to either not use ResourceCalculatorPlugin in such cases or make the local job run create dummy JVM contexts. Choosing the first option because that's the current implicit behavior on Linux. The ProcfsBasedProcessTree (used inside the LinuxResourceCalculatorPlugin) does no real work when the process pid is not set up correctly. This is what happens when local job mode runs.
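The fallback suggested in point 2 of the comment could be sketched like this. It is an illustration only, not the refactoring itself: ProcessPidSketch and resolvePid are hypothetical names, and the environment is passed in as a map (callers would supply System.getenv()) purely so that the logic can be exercised in isolation.

```java
import java.util.Map;

/** Hypothetical sketch of "explicit pid first, JVM_PID environment variable second". */
final class ProcessPidSketch {
    static final String JVM_PID = "JVM_PID";

    /**
     * Returns the explicitly supplied pid when present; otherwise falls back
     * to the JVM_PID entry of the given environment. The result may be null
     * when neither source provides a value.
     */
    static String resolvePid(String processPid, Map<String, String> env) {
        if (processPid != null) {
            return processPid;
        }
        return env.get(JVM_PID);
    }

    public static void main(String[] args) {
        // In real use the environment would be System.getenv().
        System.out.println(resolvePid(null, System.getenv()));
    }
}
```

Having every call site pass processPid explicitly, with the environment lookup as the single fallback, keeps the null handling in one place.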
[jira] [Created] (MAPREDUCE-4380) Empty Userlogs directory is getting created under logs directory
Devaraj K created MAPREDUCE-4380:
---------------------------------

Summary: Empty Userlogs directory is getting created under logs directory
Key: MAPREDUCE-4380
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4380
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Devaraj K

Empty Userlogs directory is getting created under logs directory.
[jira] [Updated] (MAPREDUCE-4380) Empty Userlogs directory is getting created under logs directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-4380:
--------------------------------

Component/s: nodemanager, mrv2
Priority: Minor (was: Major)
Affects Version/s: 3.0.0, 2.0.0-alpha

Empty Userlogs directory is getting created under logs directory
----------------------------------------------------------------

Key: MAPREDUCE-4380
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4380
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, nodemanager
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Devaraj K
Priority: Minor

Empty Userlogs directory is getting created under logs directory.