[jira] [Commented] (MAPREDUCE-4639) CombineFileInputFormat#getSplits should throw IOException when input paths contain a directory
[ https://issues.apache.org/jira/browse/MAPREDUCE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452764#comment-13452764 ]

Jim Donofrio commented on MAPREDUCE-4639:

OK, I have a couple of questions:

1. Only the mapred FIF has recursive support; the mapreduce FIF does not. I can add recursive support to FIF, and CFIF would get it for free from listStatus. I can do this, but it sounds like a new JIRA?
2. The mapreduce FIF is subject to this same bug with directories; only the mapred FIF checks whether terminal paths are directories.
3. What do you think about checking for directories as terminal paths in listStatus instead of having FIF do the check in getSplits? This is cleaner, but I guess it could break compatibility for custom IFs that extend FIF and use listStatus but have their own getSplits. Although I don't know why a custom FIF would want a directory.

CombineFileInputFormat#getSplits should throw IOException when input paths contain a directory

Key: MAPREDUCE-4639
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4639
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client
Reporter: Jim Donofrio
Priority: Minor
Attachments: MAPREDUCE-4639.patch

FileInputFormat#getSplits throws an IOException when the input paths contain a directory. CombineFileInputFormat should do the same; otherwise the job will not fail until the record reader is initialized, when FileSystem#open will say that the directory does not exist.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
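The directory check discussed in the comment above (rejecting input paths that are directories before any split is built) can be sketched as follows. This is only an illustration under stated assumptions: java.nio.file stands in for Hadoop's FileSystem and FileStatus so the example runs without Hadoop on the classpath, and the class and method names (TerminalPathCheck, listFilesOrFail) are hypothetical, not part of FileInputFormat or of the attached patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in sketch: java.nio.file replaces Hadoop's FileSystem/FileStatus here.
// listFilesOrFail is a hypothetical name, not a method of FileInputFormat.
class TerminalPathCheck {

    // Returns the input paths, or throws if any of them is a directory, so the
    // failure surfaces at listing/split time instead of when a reader opens the path.
    static List<Path> listFilesOrFail(List<Path> inputs) throws IOException {
        List<Path> files = new ArrayList<>();
        for (Path p : inputs) {
            if (Files.isDirectory(p)) {
                throw new IOException("Not a file: " + p);
            }
            files.add(p);
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cfif-demo");
        Path file = Files.createTempFile(dir, "part-", ".txt");

        // A plain file passes the check.
        System.out.println(listFilesOrFail(Arrays.asList(file)).size());

        // A directory among the inputs is rejected up front.
        try {
            listFilesOrFail(Arrays.asList(file, dir));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether such a check belongs in listStatus or in getSplits is exactly the compatibility question raised in the comment; the sketch takes no position on it.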
[jira] [Created] (MAPREDUCE-4650) Provide an ability to show task counters for a given task type in Web UI
Hemanth Yamijala created MAPREDUCE-4650:

Summary: Provide an ability to show task counters for a given task type in Web UI
Key: MAPREDUCE-4650
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4650
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: webapps
Reporter: Hemanth Yamijala
Assignee: Hemanth Yamijala

In the Web UI for Hadoop (post YARN), there is a view that lists counter values per task for a given counter (singlejobcounter page). I found this very useful for getting a consolidated view of a specific counter across all tasks, e.g. when debugging slow tasks. We can navigate to this view by selecting the specific counter from the Job Counters page.

This JIRA is to allow the singlejobcounter page to be prepopulated with only map task values or only reduce task values (if one is interested in only a particular type of task).

I understand there is a 'search' option on the singlejobcounter page that can be used to accomplish the same purpose. However, it seems a little more usable to provide a direct filter before getting to the page.
[jira] [Commented] (MAPREDUCE-4650) Provide an ability to show task counters for a given task type in Web UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452867#comment-13452867 ]

Hemanth Yamijala commented on MAPREDUCE-4650:

I am thinking along the lines of hyperlinking the counter values in the map or reduce columns of the job counters page. Selecting any of these would query singlejobcounter with an additional task type parameter in the URL, and the rest of the filtering can happen in the view / model.

With this change, the URL for singlejobcounter looks like this: singlejobcounter/$JOBID/$COUNTERGROUP/$COUNTER/$TASKTYPE. If no TaskType is passed, it behaves as it does today. Otherwise the filter is applied.
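The URL scheme described in the comment above, with an optional trailing task type, could be handled roughly as in the sketch below. Everything here is hypothetical illustration rather than the attached patch: the class name SingleCounterRoute, the Row type, and the choice of "map" / "reduce" as the literal values of the $TASKTYPE segment are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the proposed routing; none of these names come from the patch.
class SingleCounterRoute {
    enum TaskType { MAP, REDUCE }

    // One row of the per-task counter view.
    static final class Row {
        final String taskId;
        final TaskType type;
        final long value;

        Row(String taskId, TaskType type, long value) {
            this.taskId = taskId;
            this.type = type;
            this.value = value;
        }
    }

    // Parses singlejobcounter/$JOBID/$COUNTERGROUP/$COUNTER[/$TASKTYPE].
    // Returns null when the task type segment is absent, meaning "no filter".
    static TaskType taskTypeOf(String path) {
        String[] parts = path.split("/");
        if (parts.length < 4 || !"singlejobcounter".equals(parts[0])) {
            throw new IllegalArgumentException("Unexpected path: " + path);
        }
        return parts.length > 4
                ? TaskType.valueOf(parts[4].toUpperCase(Locale.ROOT))
                : null;
    }

    // With no task type the rows are returned unfiltered, as the page behaves today.
    static List<Row> filter(List<Row> rows, TaskType type) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            if (type == null || r.type == type) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("task_m_000000", TaskType.MAP, 10L),
                new Row("task_r_000000", TaskType.REDUCE, 20L));
        TaskType type = taskTypeOf("singlejobcounter/job_1/group/COUNTER/map");
        System.out.println(filter(rows, type).size());
    }
}
```

Keeping the task type as an optional final path segment means existing links without it keep working unchanged.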
[jira] [Updated] (MAPREDUCE-4650) Provide an ability to show task counters for a given task type in Web UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated MAPREDUCE-4650:

Attachment: MAPREDUCE-4650.patch

Attaching a patch that roughly demonstrates how this will work. It doesn't have tests etc. and so is not complete. If it feels useful, I'll complete it.
[jira] [Commented] (MAPREDUCE-4517) Too many INFO messages written out during AM to RM heartbeat
[ https://issues.apache.org/jira/browse/MAPREDUCE-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452932#comment-13452932 ]

James Kinley commented on MAPREDUCE-4517:

No new tests required. Only changed the logging level from INFO to DEBUG.

Too many INFO messages written out during AM to RM heartbeat

Key: MAPREDUCE-4517
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4517
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster
Reporter: James Kinley
Priority: Minor
Labels: patch
Fix For: trunk
Attachments: MAPREDUCE-4517.patch

Too many INFO log messages are written out during the AM to RM heartbeat. Based on the default frequency of 1000ms (scheduler.heartbeat.interval-ms), either 2 or 4 INFO messages are written out per second:

    LOG.info("Before Scheduling: " + getStat());
    List<Container> allocatedContainers = getResources();
    LOG.info("After Scheduling: " + getStat());
    if (allocatedContainers.size() > 0) {
      LOG.info("Before Assign: " + getStat());
      scheduledRequests.assign(allocatedContainers);
      LOG.info("After Assign: " + getStat());
    }

These should probably be changed to DEBUG messages to keep the log from growing too quickly.
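The effect of moving the heartbeat messages from INFO to DEBUG can be illustrated with a small self-contained sketch. It uses java.util.logging as a stand-in for the logging API used by the AM, with Level.FINE playing the role of DEBUG; heartbeat(), getStat() and the counting handler are simplified placeholders, not the code in the patch.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Sketch only: java.util.logging stands in for the AM's logging API,
// and Level.FINE plays the role of DEBUG.
class HeartbeatLogging {
    static final Logger LOG = Logger.getLogger("HeartbeatLogging");

    // Counts the records that actually reach the log.
    static final class CountingHandler extends Handler {
        int published;

        @Override public void publish(LogRecord record) {
            if (isLoggable(record)) {
                published++;
            }
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Placeholder for the allocator statistics string.
    static String getStat() {
        return "PendingReduces:0 ScheduledMaps:0";
    }

    // One heartbeat: 2 messages always, 4 when containers were allocated,
    // all at the debug-equivalent level. The guard avoids building the
    // message string when the level is disabled.
    static void heartbeat(int allocatedContainers) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Before Scheduling: " + getStat());
            LOG.fine("After Scheduling: " + getStat());
        }
        if (allocatedContainers > 0 && LOG.isLoggable(Level.FINE)) {
            LOG.fine("Before Assign: " + getStat());
            LOG.fine("After Assign: " + getStat());
        }
    }

    // Runs some heartbeats at the given logger level and reports how many
    // records were written.
    static int recordsFor(Level level, int heartbeats) {
        CountingHandler handler = new CountingHandler();
        LOG.setUseParentHandlers(false);
        LOG.addHandler(handler);
        LOG.setLevel(level);
        try {
            for (int i = 0; i < heartbeats; i++) {
                heartbeat(1);
            }
        } finally {
            LOG.removeHandler(handler);
        }
        return handler.published;
    }

    public static void main(String[] args) {
        System.out.println(recordsFor(Level.INFO, 3));
        System.out.println(recordsFor(Level.FINE, 3));
    }
}
```

At the default INFO level the heartbeat then writes nothing, while the per-heartbeat statistics remain available when debug logging is switched on.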
[jira] [Commented] (MAPREDUCE-2993) Hamlet HTML elements are not closed properly. Every element should have proper end tag.
[ https://issues.apache.org/jira/browse/MAPREDUCE-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452967#comment-13452967 ]

Senthil V Kumar commented on MAPREDUCE-2993:

In HamletSpec.java, I see that elements including td, tfoot, li, and th have the @Element(endTag=false) annotation. Is this done on purpose?

Hamlet HTML elements are not closed properly. Every element should have proper end tag.

Key: MAPREDUCE-2993
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2993
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0, 0.24.0
Reporter: Abhijit Suresh Shingate
Original Estimate: 72h
Remaining Estimate: 72h

Execute org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebApp.testNodesPage() and verify the output on the console.

{code:xml}
<table id="layout" class="ui-widget-content">
  <thead>
    <tr>
      <td colspan="2">
        <div id="header" class="ui-widget">
          <div id="user">
            Logged in as: null
          </div>
          <div id="logo">
            <img src="/static/hadoop-st.png">
          </div>
          <h1>
            Nodes of the cluster
          </h1>
        </div>
  <tfoot>
    <tr>
      <td colspan="2">
        <div id="footer" class="ui-widget">
          <a href="http://hadoop.apache.org/">About Apache Hadoop</a>
        </div>
  <tbody>
    <tr>
      <td id="navcell">
        <div id="nav">
          <h3>
            Cluster
          </h3>
          <ul>
            <li>
              <a href="/null/cluster">About</a>
            <li>
              <a href="/null/nodes">Nodes</a>
            <li>
              <a href="/null/apps">Applications</a>
            <li>
              <a href="/null/scheduler">Scheduler</a>
          </ul>
          <h3>
            Tools
          </h3>
          <ul>
            <li>
              <a href="/conf">Configuration</a>
            <li>
              <a href="/logs">Local logs</a>
            <li>
              <a href="/stacks">Server stacks</a>
            <li>
              <a href="/metrics">Server metrics</a>
          </ul>
        </div>
        <div id="themeswitcher">
        </div>
      <td class="content">
        <table id="nodes">
          <thead>
            <tr>
              <th class="rack">
                Rack
              <th class="nodeaddress">
                Node Address
              <th class="nodehttpaddress">
                Node HTTP Address
              <th class="healthStatus">
                Health-status
              <th class="lastHealthUpdate">
                Last health-update
              <th class="healthReport">
                Health-report
              <th class="containers">
                Containers
          <tbody>
            <tr>
              <td>
                rack0
              <td>
                host0:123
              <td>
                <a href="http://localhost:0">localhost:0</a>
              <td>
                Unhealthy
              <td>
                N/A
              <td>
                null
            <tr>
              <td>
                rack0
              <td>
                host1:123
              <td>
                <a href="http://localhost:0">localhost:0</a>
              <td>
                Unhealthy
              <td>
                N/A
              <td>
                null
          </tbody>
        </table>
  </tbody>
</table>
</html>
{code}

Many HTML elements do not have an end tag.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452997#comment-13452997 ]

Avner BenHanoch commented on MAPREDUCE-4049:

_Asokan, Arun, and the other folks,_

My patch is ready in the system with +1 overall in Hadoop QA.
*I'll be happy if someone can review and approve my patch, or give me appropriate comments.*
I promise the patch is very short and straightforward.

I can’t progress with the comments I got so far, because these comments seemed to be relevant to the trunk itself regardless of my patch.

Kindly please let me know how to proceed. This is the first time I am submitting a patch.

Thanks,
Avner

plugin for generic shuffle service

Key: MAPREDUCE-4049
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: performance, task, tasktracker
Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
Reporter: Avner BenHanoch
Labels: merge, plugin, rdma, shuffle
Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml, mapreduce-4049.patch, mapreduce-4049.patch

Support a generic shuffle service as a set of two plugins: ShuffleProvider and ShuffleConsumer.
This will satisfy the following needs:
# Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges, and hence get much better performance.
# Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0).

References:
# Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
# I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch)
[jira] [Updated] (MAPREDUCE-4607) Race condition in ReduceTask completion can result in Task being incorrectly failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated MAPREDUCE-4607:

Resolution: Fixed
Fix Version/s: 2.0.3-alpha
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

I just committed this. Thanks Bikas!

Race condition in ReduceTask completion can result in Task being incorrectly failed

Key: MAPREDUCE-4607
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4607
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Bikas Saha
Assignee: Bikas Saha
Fix For: 2.0.3-alpha
Attachments: MAPREDUCE-4607.1.patch, MAPREDUCE-4607.2.patch, MAPREDUCE-4607.3.patch, MAPREDUCE-4607.4.patch, MAPREDUCE-4607.patch

Problem reported by chackaravarthy in MAPREDUCE-4252:

This problem has been handled for the case where a speculative attempt is launched for a map task and the other attempt fails (not killed). Can a similar scenario happen for a reduce task?

Consider the following scenario for a reduce task with speculation (one attempt gets killed):

1. A task attempt is started.
2. A speculative task attempt for the same task is started.
3. The first task attempt completes and causes the task to transition to SUCCEEDED.
4. The speculative task attempt is then killed because the first attempt has completed.

As a result, an internal error is thrown from this attempt (TaskImpl.MapRetroactiveKilledTransition), and the task attempt failure leads to job failure.

TaskImpl.MapRetroactiveKilledTransition:

    if (!TaskType.MAP.equals(task.getType())) {
      LOG.error("Unexpected event for REDUCE task " + event.getType());
      task.internalError(event.getType());
    }

So, do we need the following code in MapRetroactiveKilledTransition as well, just as in MapRetroactiveFailureTransition?

    if (event instanceof TaskTAttemptEvent) {
      TaskTAttemptEvent castEvent = (TaskTAttemptEvent) event;
      if (task.getState() == TaskState.SUCCEEDED &&
          !castEvent.getTaskAttemptID().equals(task.successfulAttempt)) {
        // don't allow a different task attempt to override a previous
        // succeeded state
        return TaskState.SUCCEEDED;
      }
    }

Please check whether this is a valid case and give your suggestion.
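The guard proposed in the MAPREDUCE-4607 description above can be tried out in isolation with a much-reduced model of the task state machine. This is a sketch under stated assumptions, not the committed fix: the Task class, its string attempt ids, and the rule that any other kill sends the task back to SCHEDULED are simplifications invented for the example, and only the SUCCEEDED guard mirrors the snippet quoted in the issue.

```java
// Simplified stand-in for the TaskImpl state machine; the types here are
// hypothetical and only mirror the names quoted in the issue.
class RetroactiveKillSketch {
    enum TaskState { RUNNING, SCHEDULED, SUCCEEDED }

    static final class Task {
        TaskState state = TaskState.RUNNING;
        String successfulAttempt; // id of the attempt that made the task succeed

        // The first attempt to finish decides the outcome of the task.
        void attemptSucceeded(String attemptId) {
            if (state != TaskState.SUCCEEDED) {
                state = TaskState.SUCCEEDED;
                successfulAttempt = attemptId;
            }
        }

        // Handles a kill event for one attempt and returns the resulting task state.
        TaskState attemptKilled(String attemptId) {
            if (state == TaskState.SUCCEEDED && !attemptId.equals(successfulAttempt)) {
                // Don't allow a different task attempt to override a previous
                // succeeded state: killing the losing speculative attempt is harmless.
                return state;
            }
            // Simplification for this sketch: any other kill means the task
            // has to be scheduled again.
            state = TaskState.SCHEDULED;
            return state;
        }
    }

    public static void main(String[] args) {
        Task task = new Task();
        task.attemptSucceeded("attempt_0"); // step 3: first attempt completes
        System.out.println(task.attemptKilled("attempt_1")); // step 4: speculative attempt killed
    }
}
```

The point of the guard is that step 4 of the scenario leaves the task in SUCCEEDED instead of raising an internal error.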
[jira] [Commented] (MAPREDUCE-4607) Race condition in ReduceTask completion can result in Task being incorrectly failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453046#comment-13453046 ]

Hudson commented on MAPREDUCE-4607:

Integrated in Hadoop-Common-trunk-Commit #2716 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2716/])
Update CHANGES.txt for MAPREDUCE-4607 commit. (Revision 1383423)
MAPREDUCE-4607. Race condition in ReduceTask completion can result in Task being incorrectly failed. Contributed by Bikas Saha. (Revision 1383422)

Result = SUCCESS

tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383423
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt

tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383422
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
[jira] [Commented] (MAPREDUCE-4607) Race condition in ReduceTask completion can result in Task being incorrectly failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453048#comment-13453048 ]

Hudson commented on MAPREDUCE-4607:

Integrated in Hadoop-Hdfs-trunk-Commit #2779 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2779/])
Update CHANGES.txt for MAPREDUCE-4607 commit. (Revision 1383423)
MAPREDUCE-4607. Race condition in ReduceTask completion can result in Task being incorrectly failed. Contributed by Bikas Saha. (Revision 1383422)

Result = SUCCESS

tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383423
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt

tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383422
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
[jira] [Commented] (MAPREDUCE-4576) Large dist cache can block tasktracker heartbeat
[ https://issues.apache.org/jira/browse/MAPREDUCE-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453054#comment-13453054 ]

Thomas Graves commented on MAPREDUCE-4576:

+1. Thanks Bobby. Note that the findbugs warnings exist without this patch.

[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no tests are needed for this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] -1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.
[exec]
[exec]

Large dist cache can block tasktracker heartbeat

Key: MAPREDUCE-4576
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4576
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.205.0, 1.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Attachments: MR-4576.txt
[jira] [Updated] (MAPREDUCE-4576) Large dist cache can block tasktracker heartbeat
[ https://issues.apache.org/jira/browse/MAPREDUCE-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Graves updated MAPREDUCE-4576:

Resolution: Fixed
Fix Version/s: 1.2.0
Status: Resolved (was: Patch Available)
[jira] [Commented] (MAPREDUCE-4607) Race condition in ReduceTask completion can result in Task being incorrectly failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453087#comment-13453087 ]

Hudson commented on MAPREDUCE-4607:

Integrated in Hadoop-Mapreduce-trunk-Commit #2740 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2740/])
Update CHANGES.txt for MAPREDUCE-4607 commit. (Revision 1383423)
MAPREDUCE-4607. Race condition in ReduceTask completion can result in Task being incorrectly failed. Contributed by Bikas Saha. (Revision 1383422)

Result = FAILURE

tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383423
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt

tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383422
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
[jira] [Updated] (MAPREDUCE-4616) Improvement to MultipleOutputs javadocs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tony Burton updated MAPREDUCE-4616:

Attachment: MAPREDUCE-4616.patch

patch file for Jira issue MAPREDUCE-4616

Improvement to MultipleOutputs javadocs

Key: MAPREDUCE-4616
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4616
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: documentation
Affects Versions: 1.0.3
Reporter: Tony Burton
Priority: Minor
Labels: hadoop, mapreduce
Fix For: trunk
Attachments: MAPREDUCE-4616.patch

In the new API, and using MultipleOutputs, it is possible to segment output into directories by using MultipleOutputs.write(KEYOUT key, VALUEOUT value, String baseOutputPath) in the Reducer to determine the output directory, and by using LazyOutputFormat at the job-level config to suppress normal output [e.g. use LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class); instead of job.setOutputFormatClass(TextOutputFormat.class);]

This recreates the functionality previously provided in the old API by using MultipleTextOutputFormat (etc).
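The interaction described above, a per-record base output path combined with lazily created default output, can be modelled without Hadoop on the classpath. The sketch below is an in-memory illustration only: SegmentedOutputSketch and its members are hypothetical names, the map stands in for the job's output directory, and the file-name suffix is merely modelled on the usual part-r-00000 pattern. In a real job the corresponding calls are the ones quoted in the issue, MultipleOutputs.write(key, value, baseOutputPath) and LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model of one reducer's output; not Hadoop code.
class SegmentedOutputSketch {
    // Stand-in for the output directory: relative file name -> records written to it.
    final Map<String, List<String>> files = new TreeMap<>();
    final int partition;

    // lazy = true models lazy output: no default file exists until something
    // is written to it. lazy = false models the normal behaviour, where the
    // default part file is created even if it stays empty.
    SegmentedOutputSketch(int partition, boolean lazy) {
        this.partition = partition;
        if (!lazy) {
            files.put(fileName("part"), new ArrayList<String>());
        }
    }

    // Suffix modelled on the usual reducer naming (for example part-r-00000).
    String fileName(String base) {
        return String.format("%s-r-%05d", base, partition);
    }

    // Models write(key, value, baseOutputPath): a base path containing '/'
    // places the record under a subdirectory of the output directory.
    void write(String key, String value, String baseOutputPath) {
        String name = fileName(baseOutputPath);
        List<String> records = files.get(name);
        if (records == null) {
            records = new ArrayList<>();
            files.put(name, records);
        }
        records.add(key + "\t" + value);
    }

    public static void main(String[] args) {
        SegmentedOutputSketch out = new SegmentedOutputSketch(0, true);
        out.write("apple", "3", "fruit/apple");
        out.write("carrot", "5", "vegetable/carrot");
        System.out.println(out.files.keySet());
    }
}
```

With lazy output the only files that appear are the ones the reducer actually wrote to, which is what makes the per-directory layout clean.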
[jira] [Updated] (MAPREDUCE-4616) Improvement to MultipleOutputs javadocs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tony Burton updated MAPREDUCE-4616:

Status: Patch Available (was: Open)

Documentation changes to describe how to use MultipleOutputs and LazyOutputFormat to mimic behaviour in the now-deprecated MultipleTextOutputFormat (and similar)
[jira] [Commented] (MAPREDUCE-4616) Improvement to MultipleOutputs javadocs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453130#comment-13453130 ] Hadoop QA commented on MAPREDUCE-4616: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544650/MAPREDUCE-4616.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2842//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2842//console This message is automatically generated. 
Improvement to MultipleOutputs javadocs --- Key: MAPREDUCE-4616 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4616 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation Affects Versions: 1.0.3 Reporter: Tony Burton Priority: Minor Labels: hadoop, mapreduce Fix For: trunk Attachments: MAPREDUCE-4616.patch In the new API, and using MultipleOutputs it is possible to segment output into directories by using MultipleOutputs.write(KEYOUT key, VALUEOUT value, String baseOutputPath) in the Reducer to determine the output directory, and by using LazyOutputFormat at the job-level config to suppress normal output [eg use LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class); instead of job.setOutputFormatClass(TextOutputFormat.class);] This recreates the functionality previously provided in the old API by using MultipleTextOutputFormat (etc) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4607) Race condition in ReduceTask completion can result in Task being incorrectly failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453170#comment-13453170 ] Bikas Saha commented on MAPREDUCE-4607: --- Thanks! Race condition in ReduceTask completion can result in Task being incorrectly failed --- Key: MAPREDUCE-4607 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4607 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Bikas Saha Assignee: Bikas Saha Fix For: 2.0.3-alpha Attachments: MAPREDUCE-4607.1.patch, MAPREDUCE-4607.2.patch, MAPREDUCE-4607.3.patch, MAPREDUCE-4607.4.patch, MAPREDUCE-4607.patch Problem reported by chackaravarthy in MAPREDUCE-4252. This problem has already been handled for the case where a speculative attempt is launched for a map task and the other attempt fails (rather than being killed). Can a similar scenario happen for a reduce task? Consider the following scenario for a reduce task under speculation, where one attempt gets killed: 1. A task attempt is started. 2. A speculative task attempt for the same task is started. 3. The first task attempt completes and causes the task to transition to SUCCEEDED. 4. The speculative task attempt is then killed because the first attempt completed. As a result, an internal error is raised for this attempt (TaskImpl.MapRetroactiveKilledTransition), and the task attempt failure leads to job failure. TaskImpl.MapRetroactiveKilledTransition: if (!TaskType.MAP.equals(task.getType())) { LOG.error("Unexpected event for REDUCE task " + event.getType()); task.internalError(event.getType()); } So, do we need the following code in MapRetroactiveKilledTransition as well, just as in MapRetroactiveFailureTransition?
if (event instanceof TaskTAttemptEvent) { TaskTAttemptEvent castEvent = (TaskTAttemptEvent) event; if (task.getState() == TaskState.SUCCEEDED && !castEvent.getTaskAttemptID().equals(task.successfulAttempt)) { // don't allow a different task attempt to override a previous // succeeded state return TaskState.SUCCEEDED; } } Please check whether this is a valid case and give your suggestion.
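The guard proposed above can be illustrated outside Hadoop with a minimal sketch. The types below (TaskState, SketchTask, and the string attempt ids) are hypothetical stand-ins, not the real TaskImpl classes; the only point shown is that a kill event from an attempt other than the successful one leaves the task in SUCCEEDED.

```java
import java.util.Objects;

// Hypothetical, simplified stand-ins for the MapReduce AM types named in the report.
enum TaskState { RUNNING, SUCCEEDED, KILLED }

final class SketchTask {
    private TaskState state = TaskState.RUNNING;
    private String successfulAttempt; // id of the attempt that succeeded, if any

    TaskState getState() {
        return state;
    }

    // Records the attempt that completed first and moves the task to SUCCEEDED.
    void attemptSucceeded(String attemptId) {
        successfulAttempt = attemptId;
        state = TaskState.SUCCEEDED;
    }

    // Mirrors the proposed guard: a kill event from a different attempt
    // must not override a previously succeeded state.
    TaskState attemptKilled(String attemptId) {
        if (state == TaskState.SUCCEEDED
                && !Objects.equals(attemptId, successfulAttempt)) {
            return state;
        }
        state = TaskState.KILLED;
        return state;
    }
}
```

In this sketch the speculative attempt's kill event is simply absorbed; whether the real transition should also log or count the event is left open.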
[jira] [Updated] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4647: --- Attachment: MR-4647.txt This patch should add the missing functionality. I have updated a few unit tests, and I have also tested this manually on a small cluster with security off. I verified that it does what is expected. We should only unjar jobjar if there is a lib directory in it. -- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
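As a rough illustration of the idea in this issue, the check "does the jar contain anything under lib/" can be done by listing entries without extracting the archive. This is a sketch using only the JDK; the class and method names (JobJarInspector, hasLibEntries) are hypothetical, and it assumes that only the lib/ prefix matters, which may differ from what the attached patch actually does.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Hadoop.
final class JobJarInspector {
    private JobJarInspector() {}

    // Returns true if the jar holds at least one non-directory entry under lib/.
    // Only the jar's entry listing is read; nothing is unpacked.
    static boolean hasLibEntries(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .anyMatch(e -> !e.isDirectory() && e.getName().startsWith("lib/"));
        }
    }
}
```

A caller could skip the unjar step entirely when hasLibEntries returns false, which is where the startup saving for large jars would come from.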
[jira] [Updated] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4647: --- Status: Patch Available (was: Open) This change is close to backwards compatible with MAPREDUCE-967. The big difference between the two is that the pieces of job.jar that are extracted will be under a directory named job.jar instead of in the main working directory of the application. We should only unjar jobjar if there is a lib directory in it. -- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
[jira] [Updated] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4647: --- Status: Open (was: Patch Available) Oops, I need to rebase it on trunk. It seems the patch does not apply cleanly. We should only unjar jobjar if there is a lib directory in it. -- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
[jira] [Updated] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4647: --- Attachment: MR-4647.txt Patch rebased on trunk. We should only unjar jobjar if there is a lib directory in it. -- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt, MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
[jira] [Updated] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4647: --- Status: Patch Available (was: Open) We should only unjar jobjar if there is a lib directory in it. -- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt, MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
[jira] [Commented] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453409#comment-13453409 ] Hadoop QA commented on MAPREDUCE-4647: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544695/MR-4647.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2843//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2843//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2843//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2843//console This message is automatically generated. 
We should only unjar jobjar if there is a lib directory in it. -- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt, MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
[jira] [Commented] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453414#comment-13453414 ] Robert Joseph Evans commented on MAPREDUCE-4647: Findbugs has some issues with how I was checking for null when comparing two strings. I will make the check more explicit, at the cost of more code. I will also add a check to be sure the mkdir worked. We should only unjar jobjar if there is a lib directory in it. -- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt, MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
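The two fixes mentioned in this comment could look roughly like the following. This is a guess at the general shape only, since the patch itself is not quoted here: the names ExplicitChecks, sameString, and ensureDirectory are hypothetical, and the sketch uses plain java.io.File rather than any Hadoop filesystem API.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helpers, not taken from the attached patch.
final class ExplicitChecks {
    private ExplicitChecks() {}

    // Null-safe string comparison with the null cases spelled out,
    // which is longer than a compact idiom but easy for static analysis to follow.
    static boolean sameString(String a, String b) {
        if (a == null) {
            return b == null;
        }
        return a.equals(b);
    }

    // Creates the directory (and parents) and fails loudly if it still does not
    // exist as a directory afterwards. mkdirs() returns false when the directory
    // already exists, so that case is checked separately.
    static void ensureDirectory(File dir) throws IOException {
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Could not create directory " + dir);
        }
    }
}
```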
[jira] [Updated] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4647: --- Attachment: MR-4647.txt This version should have the Findbugs warnings fixed. We should only unjar jobjar if there is a lib directory in it. -- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt, MR-4647.txt, MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
[jira] [Commented] (MAPREDUCE-4649) mr-jobhistory-daemon.sh needs to be updated post YARN-1
[ https://issues.apache.org/jira/browse/MAPREDUCE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453493#comment-13453493 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-4649: "On this topic, what is the purpose of mr-jobhistory-daemon? Is it just for starting the history server? If yes, then should we really pass in the command to start, like" Yes, today it is only for starting the history server. But the script is generic enough, like the rest of the Hadoop scripts, to start an arbitrary command. "Can't it just be: mr-jobhistory-daemon.sh start and 'historyserver' is passed internally?" Instead, I think we should rename it back to a generic mapred-daemon.sh that starts arbitrary MapReduce-specific daemons. Can you file a ticket? mr-jobhistory-daemon.sh needs to be updated post YARN-1 --- Key: MAPREDUCE-4649 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4649 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 2.0.2-alpha Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Even today, JHS is assuming that YARN_HOME will be same as HADOOP_MAPRED_HOME besides other such assumptions. We need to fix it.
[jira] [Updated] (MAPREDUCE-4649) mr-jobhistory-daemon.sh needs to be updated post YARN-1
[ https://issues.apache.org/jira/browse/MAPREDUCE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-4649: --- Attachment: MAPREDUCE-4649-20120911.txt Patch to fix this, was a bit of work. mr-jobhistory-daemon.sh needs to be updated post YARN-1 --- Key: MAPREDUCE-4649 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4649 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 2.0.2-alpha Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: MAPREDUCE-4649-20120911.txt Even today, JHS is assuming that YARN_HOME will be same as HADOOP_MAPRED_HOME besides other such assumptions. We need to fix it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4649) mr-jobhistory-daemon.sh needs to be updated post YARN-1
[ https://issues.apache.org/jira/browse/MAPREDUCE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-4649: --- Status: Patch Available (was: Open) mr-jobhistory-daemon.sh needs to be updated post YARN-1 --- Key: MAPREDUCE-4649 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4649 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 2.0.2-alpha Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: MAPREDUCE-4649-20120911.txt Even today, JHS is assuming that YARN_HOME will be same as HADOOP_MAPRED_HOME besides other such assumptions. We need to fix it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4649) mr-jobhistory-daemon.sh needs to be updated post YARN-1
[ https://issues.apache.org/jira/browse/MAPREDUCE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453513#comment-13453513 ] Hadoop QA commented on MAPREDUCE-4649: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544725/MAPREDUCE-4649-20120911.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2845//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2845//console This message is automatically generated. mr-jobhistory-daemon.sh needs to be updated post YARN-1 --- Key: MAPREDUCE-4649 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4649 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 2.0.2-alpha Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: MAPREDUCE-4649-20120911.txt Even today, JHS is assuming that YARN_HOME will be same as HADOOP_MAPRED_HOME besides other such assumptions. We need to fix it. -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-4647) We should only unjar jobjar if there is a lib directory in it.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453522#comment-13453522 ] Hadoop QA commented on MAPREDUCE-4647: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544720/MR-4647.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2844//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2844//console This message is automatically generated. We should only unjar jobjar if there is a lib directory in it. 
-- Key: MAPREDUCE-4647 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4647 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.23.3 Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Attachments: MR-4647.txt, MR-4647.txt, MR-4647.txt For backwards compatibility we recently made it so that we unjar the job.jar and add anything in the lib directory of that jar to the classpath. But this also slows job startup down a lot if the jar is large. We should only unjar it if doing so would actually add something new to the classpath.
[jira] [Commented] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453597#comment-13453597 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-4646: +1. This looks good. Pushing this. client does not receive job diagnostics for failed jobs --- Key: MAPREDUCE-4646 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0, 2.0.1-alpha Reporter: Jason Lowe Assignee: Jason Lowe Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch, MAPREDUCE-4646.patch When a job fails the client is not showing any diagnostics. For example, running a fail job results in this not-so-helpful message from the client: {noformat} 2012-09-07 21:12:00,649 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1308)) - Job job_1347052207658_0001 failed with state FAILED due to: {noformat} ...and nothing else to go with it indicating what went wrong. The job diagnostics are apparently not making it back to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-4646: --- Resolution: Fixed Fix Version/s: 2.0.2-alpha Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I just committed this to trunk and branch-2. Will soon merge it into 2.0.2 and 0.23 when the branch story gets a bit clear to me. Thanks Jason! client does not receive job diagnostics for failed jobs --- Key: MAPREDUCE-4646 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0, 2.0.1-alpha Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 2.0.2-alpha Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch, MAPREDUCE-4646.patch When a job fails the client is not showing any diagnostics. For example, running a fail job results in this not-so-helpful message from the client: {noformat} 2012-09-07 21:12:00,649 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1308)) - Job job_1347052207658_0001 failed with state FAILED due to: {noformat} ...and nothing else to go with it indicating what went wrong. The job diagnostics are apparently not making it back to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4367) mapred job -kill tries to connect to history server
[ https://issues.apache.org/jira/browse/MAPREDUCE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated MAPREDUCE-4367: --- Fix Version/s: (was: trunk) Status: Open (was: Patch Available) The issue is valid, I see it too. But the patch has a number of problems and can be simplified greatly. I have the following comments regarding the patch: - The patch unnecessarily creates two connections for every client, sidestepping the ClientCache etc. Overall, you can limit the code changes to {{ClientServiceDelegate.getProxy()}}. This method can take an additional parameter {{redirectToJHSIfNeeded}}, which can be set to false for job-kill. If this parameter is set to false, {{getProxy()}} can simply log a message and return a {{NonRunningJob}}. - We also need to make the same change for {{killTask()}}. - Also, {{MiniMRYarnCluster}} doesn't need extra APIs like stopHistoryServer(); one can simply get a handle to the JHS and stop it. mapred job -kill tries to connect to history server --- Key: MAPREDUCE-4367 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4367 Project: Hadoop Map/Reduce Issue Type: Improvement Components: client, mrv2 Affects Versions: 0.23.3 Reporter: Jason Lowe Assignee: Mayank Bansal Priority: Minor Attachments: MAPREDUCE-4367-trunk-v1.patch, MAPREDUCE-4367-trunk-v2.patch The {{mapred job -kill}} command attempts to connect to the history server, even though it is unrelated to the process of killing a job.
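The review suggestion can be sketched in isolation as a single decision point. Everything below is hypothetical (JobProxy, ProxySelector, and the three constants are illustrative stand-ins, not Hadoop classes), and it assumes the intended reading is that a false redirectToJHSIfNeeded, as passed for job-kill, skips the history server and yields a placeholder for a non-running job.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for whatever proxy type the real delegate returns.
interface JobProxy {
    String describe();
}

final class ProxySelector {
    private static final Logger LOG = Logger.getLogger(ProxySelector.class.getName());

    // Illustrative placeholders for the three possible targets.
    static final JobProxy APP_MASTER = () -> "app-master";
    static final JobProxy HISTORY_SERVER = () -> "history-server";
    static final JobProxy NON_RUNNING_JOB = () -> "non-running-job";

    private ProxySelector() {}

    // Chooses where client calls go. Callers such as job-kill would pass
    // redirectToJHSIfNeeded = false so that no history server connection is made.
    static JobProxy getProxy(boolean jobRunning, boolean redirectToJHSIfNeeded) {
        if (jobRunning) {
            return APP_MASTER;
        }
        if (redirectToJHSIfNeeded) {
            return HISTORY_SERVER;
        }
        LOG.info("Job is not running; not redirecting to the history server");
        return NON_RUNNING_JOB;
    }
}
```

Keeping the decision inside one method is what lets the change stay local, as the review asks; the same flag would then be threaded through the kill-task path.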
[jira] [Commented] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13453633#comment-13453633 ] Hudson commented on MAPREDUCE-4646: --- Integrated in Hadoop-Common-trunk-Commit #2722 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2722/]) MAPREDUCE-4646. Fixed MR framework to send diagnostic information correctly to clients in case of failed jobs also. Contributed by Jason Lowe. (Revision 1383709) Result = SUCCESS vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1383709 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRBuilderUtils.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientServiceDelegate.java client does not receive job diagnostics for failed jobs --- Key: MAPREDUCE-4646 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0, 2.0.1-alpha Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 2.0.2-alpha Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch, MAPREDUCE-4646.patch When a job fails the client is not showing any diagnostics. 
For example, running a fail job results in this not-so-helpful message from the client: {noformat} 2012-09-07 21:12:00,649 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1308)) - Job job_1347052207658_0001 failed with state FAILED due to: {noformat} ...and nothing else to go with it indicating what went wrong. The job diagnostics are apparently not making it back to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453635#comment-13453635 ]

Hudson commented on MAPREDUCE-4646:
---
Integrated in Hadoop-Hdfs-trunk-Commit #2785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2785/])
MAPREDUCE-4646. Fixed MR framework to send diagnostic information correctly to clients in case of failed jobs also. Contributed by Jason Lowe. (Revision 1383709)
Result = SUCCESS
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383709
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRBuilderUtils.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientServiceDelegate.java

client does not receive job diagnostics for failed jobs
---
Key: MAPREDUCE-4646
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
Fix For: 2.0.2-alpha
Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch, MAPREDUCE-4646.patch

When a job fails the client is not showing any diagnostics.
For example, running a fail job results in this not-so-helpful message from the client: {noformat} 2012-09-07 21:12:00,649 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1308)) - Job job_1347052207658_0001 failed with state FAILED due to: {noformat} ...and nothing else to go with it indicating what went wrong. The job diagnostics are apparently not making it back to the client.
[jira] [Updated] (MAPREDUCE-2785) MiniMR cluster thread crashes if no hadoop log dir set
[ https://issues.apache.org/jira/browse/MAPREDUCE-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-2785:
---
Target Version/s: 1.1.0
Fix Version/s: (was: 1.1.0)

MiniMR cluster thread crashes if no hadoop log dir set
--
Key: MAPREDUCE-2785
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2785
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.20.203.0
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
Attachments: MAPREDUCE-2785.patch

I'm marking this as minor as it is most obvious in the MiniMRCluster, but the root cause is in the JT. If you instantiate a MiniMRCluster without setting {{hadoop.job.history.location}} in the configuration and with the system property {{hadoop.log.dir}} unset, then the JobHistory throws an NPE. In production, that would be picked up as a failure to start the JT. In the MiniMRCluster, all it does is crash the JT thread, which isn't noticed by the MiniMR cluster. You see the logged error, but the tests will just time out waiting for things to come up:

2011/08/08 17:46:26:427 CEST [ERROR][Thread-44] org.apache.hadoop.mapred.MiniMRCluster - Job tracker crashed java.lang.NullPointerException
java.lang.NullPointerException
    at java.io.File.init(File.java:222)
    at org.apache.hadoop.mapred.JobHistory.initLogDir(JobHistory.java:531)
    at org.apache.hadoop.mapred.JobHistory.init(JobHistory.java:499)
    at org.apache.hadoop.mapred.JobTracker$2.run(JobTracker.java:2316)
    at org.apache.hadoop.mapred.JobTracker$2.run(JobTracker.java:2313)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:2313)
    at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:2171)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:300)
    at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner$1.run(MiniMRCluster.java:114)
    at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner$1.run(MiniMRCluster.java:112)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner.run(MiniMRCluster.java:112)
    at java.lang.Thread.run(Thread.java:662)
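The failure mode described in MAPREDUCE-2785 (a null path reaching the java.io.File constructor) could be turned into an explicit, descriptive error. The following is only a minimal sketch, not the attached patch: the class LogDirCheck and the method resolveHistoryDir are hypothetical names, and the "history" subdirectory fallback is an assumption about the default layout.

```java
import java.io.File;
import java.io.IOException;

public class LogDirCheck {
    /**
     * Hypothetical helper (not the actual JobHistory code): picks the history
     * directory from the configured location or, failing that, from the log
     * dir system property. Fails with a clear message instead of letting
     * new File(null) throw a NullPointerException.
     */
    static File resolveHistoryDir(String configuredLocation, String logDirProperty)
            throws IOException {
        if (configuredLocation != null) {
            return new File(configuredLocation);
        }
        if (logDirProperty == null) {
            throw new IOException(
                "Neither hadoop.job.history.location nor hadoop.log.dir is set");
        }
        // Assumed default layout: a "history" directory under the log dir.
        return new File(logDirProperty, "history");
    }

    public static void main(String[] args) throws IOException {
        String logDir = System.getProperty("hadoop.log.dir", "/tmp/hadoop-logs");
        System.out.println(resolveHistoryDir(null, logDir));
    }
}
```

An IOException with a readable message would at least make the cause visible in the log; whether the MiniMRCluster should also notice the crashed JT thread is a separate question.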
[jira] [Updated] (MAPREDUCE-3076) TestSleepJob fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-3076:
---
Fix Version/s: (was: 1.1.0)

TestSleepJob fails
---
Key: MAPREDUCE-3076
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3076
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Affects Versions: 0.20.205.0
Reporter: Arun C Murthy
Assignee: Arun C Murthy
Priority: Blocker
Fix For: 0.20.205.0
Attachments: MAPREDUCE-3076.patch

TestSleepJob fails; it was intended to be used in other tests for MAPREDUCE-2981.
[jira] [Updated] (MAPREDUCE-3992) Reduce fetcher doesn't verify HTTP status code of response
[ https://issues.apache.org/jira/browse/MAPREDUCE-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-3992:
---
Fix Version/s: 1.1.0

Reduce fetcher doesn't verify HTTP status code of response
--
Key: MAPREDUCE-3992
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3992
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: 0.23.1, 0.24.0, 1.0.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 1.1.0, 0.23.3, 2.0.2-alpha
Attachments: mr-3992-branch-1.txt, mr-3992.txt

Currently, the reduce fetch code doesn't check the HTTP status code of the response. This can lead to the following situation:
- the map output servlet gets an IOException after setting the headers but before the first call to flush()
- this causes it to send a response with a non-OK result code, including the exception text as the response body (response.sendError() does this if the response isn't committed)
- it will still include the response headers indicating it's a valid response

In the case of a merge-to-memory, the compression codec might then try to interpret the HTML response as compressed data, resulting in either a huge allocation (OOME) or some other nasty error. This bug seems to be present in MR1, but I haven't checked trunk/MR2 yet.
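The check that MAPREDUCE-3992 asks for can be sketched roughly as follows. This is an illustration using the standard java.net.HttpURLConnection API, not the code from the attached patches; the class FetchStatusCheck and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;

public class FetchStatusCheck {
    /**
     * Hypothetical helper: rejects any response whose status is not 200 OK,
     * so an error page is never handed to the decompressor as map output.
     */
    static void verifyStatus(int responseCode, String url) throws IOException {
        if (responseCode != HttpURLConnection.HTTP_OK) {
            throw new IOException(
                "Got invalid response code " + responseCode + " from " + url);
        }
    }

    /** Checks the status first, and only then exposes the response body. */
    static InputStream openMapOutput(HttpURLConnection connection) throws IOException {
        verifyStatus(connection.getResponseCode(), connection.getURL().toString());
        return connection.getInputStream();
    }
}
```

With a check like this in place, the error body produced by response.sendError() would surface as a fetch failure rather than being interpreted as compressed data.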
[jira] [Updated] (MAPREDUCE-4195) With invalid queueName request param, jobqueue_details.jsp shows NPE
[ https://issues.apache.org/jira/browse/MAPREDUCE-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-4195:
---
Fix Version/s: (was: 1.1.0) 1.2.0

With invalid queueName request param, jobqueue_details.jsp shows NPE

Key: MAPREDUCE-4195
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4195
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 1.0.0
Reporter: Gera Shegalov
Priority: Critical
Fix For: 1.2.0
Attachments: MR-4195, MR-4195

When you access /jobqueue_details.jsp manually, instead of via a link, it has queueName set to null internally and this goes for a lookup into the scheduling info maps as well. As a result, if using FairScheduler, a Pool with String name = null gets created and this brings the scheduler down. I have not tested what happens to the CapacityScheduler, but ideally if no queueName is set in that jsp, it should fall back to 'default'. Otherwise, this brings down the JobTracker completely. FairScheduler must also add a check to not create a pool with 'null' name.

The following is the stack trace that ensues:
{code}
ERROR org.mortbay.log: /jobqueue_details.jsp
java.lang.NullPointerException
    at org.apache.hadoop.mapred.jobqueue_005fdetails_jsp._jspService(jobqueue_005fdetails_jsp.java:71)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:829)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9001, call heartbeat from XYZ:MNOP: error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.mapred.SchedulingAlgorithms$FairShareComparator.compare(SchedulingAlgorithms.java:95)
    at org.apache.hadoop.mapred.SchedulingAlgorithms$FairShareComparator.compare(SchedulingAlgorithms.java:68)
    at java.util.Arrays.mergeSort(Unknown Source)
    at java.util.Arrays.sort(Unknown Source)
    at java.util.Collections.sort(Unknown Source)
    at org.apache.hadoop.mapred.FairScheduler.assignTasks(FairScheduler.java:435)
    at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3226)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1430)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1428)
{code}
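The two guards suggested in MAPREDUCE-4195 (fall back to 'default' when no queueName is given, and refuse to create a pool with a null name) might look something like this. It is a sketch only: QueueNameGuard and both of its methods are hypothetical and are not taken from the JSP, from FairScheduler, or from the attached patches.

```java
public class QueueNameGuard {
    static final String DEFAULT_QUEUE = "default";

    /**
     * Hypothetical JSP-side guard: a missing or blank queueName request
     * parameter falls back to the default queue instead of staying null.
     */
    static String effectiveQueueName(String requested) {
        if (requested == null || requested.trim().isEmpty()) {
            return DEFAULT_QUEUE;
        }
        return requested;
    }

    /**
     * Hypothetical scheduler-side guard: rejects a null pool name up front
     * rather than letting a nameless pool break later comparisons.
     */
    static String requirePoolName(String name) {
        if (name == null) {
            throw new IllegalArgumentException("Pool name must not be null");
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(effectiveQueueName(null));
        System.out.println(effectiveQueueName("research"));
    }
}
```

Doing the check in both places is deliberate in this sketch: the JSP guard handles the manual-URL case, while the scheduler guard protects against any other caller that passes a null name.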
[jira] [Commented] (MAPREDUCE-4646) client does not receive job diagnostics for failed jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453654#comment-13453654 ]

Hudson commented on MAPREDUCE-4646:
---
Integrated in Hadoop-Mapreduce-trunk-Commit #2746 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2746/])
MAPREDUCE-4646. Fixed MR framework to send diagnostic information correctly to clients in case of failed jobs also. Contributed by Jason Lowe. (Revision 1383709)
Result = FAILURE
vinodkv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1383709
Files :
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRBuilderUtils.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientServiceDelegate.java

client does not receive job diagnostics for failed jobs
---
Key: MAPREDUCE-4646
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4646
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0, 2.0.1-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
Fix For: 2.0.2-alpha
Attachments: MAPREDUCE-4646.patch, MAPREDUCE-4646.patch, MAPREDUCE-4646.patch

When a job fails the client is not showing any diagnostics.
For example, running a fail job results in this not-so-helpful message from the client: {noformat} 2012-09-07 21:12:00,649 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1308)) - Job job_1347052207658_0001 failed with state FAILED due to: {noformat} ...and nothing else to go with it indicating what went wrong. The job diagnostics are apparently not making it back to the client.
[jira] [Updated] (MAPREDUCE-4308) Remove excessive split log messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-4308:
---
Fix Version/s: (was: 1.1.0)

Remove excessive split log messages
---
Key: MAPREDUCE-4308
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4308
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: jobtracker
Affects Versions: 1.0.3
Reporter: Kihwal Lee
Attachments: mapreduce-4308-branch-1.patch

Job tracker currently prints out information on every split.
{noformat}
2012-05-20 00:06:01,985 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201205100740_1745_m_00 has split on node:/192.168.0.1 /my.totally.madeup.host.com
{noformat}
I looked at one cluster and these messages were taking up more than 30% of the JT log. If jobs have a large number of maps, it can be worse. I think it is reasonable to lower the log level of the statement from INFO to DEBUG.
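The proposed change in MAPREDUCE-4308 amounts to moving a per-split message below the default log level and guarding the string concatenation. The sketch below uses java.util.logging only to stay self-contained, with Level.FINE standing in for DEBUG; in the Hadoop code base the Commons Logging equivalents would be LOG.isDebugEnabled() and LOG.debug(...). The class SplitLogLevel and the method logSplit are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class SplitLogLevel {
    private static final Logger LOG = Logger.getLogger("SplitLogLevel");

    /**
     * Hypothetical stand-in for the per-split log statement. The message is
     * only built and emitted when the debug-equivalent level (FINE) is on.
     * Returns true if the message was logged, false if it was suppressed.
     */
    static boolean logSplit(String tip, String node) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("tip:" + tip + " has split on node:" + node);
            return true;
        }
        return false;
    }
}
```

The guard matters for jobs with many maps: when the level is off, no message string is built at all.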
[jira] [Updated] (MAPREDUCE-4325) Rename ProcessTree.isSetsidAvailable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-4325:
---
Fix Version/s: (was: 1.1.0)

Rename ProcessTree.isSetsidAvailable

Key: MAPREDUCE-4325
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4325
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Bikas Saha
Assignee: Bikas Saha

The logical use of this member is to find out whether processes can be grouped into a unit for process manipulation, e.g. killing process groups. setsid is the Linux implementation and it leaks into the name. I suggest renaming it to isProcessGroupAvailable.
[jira] [Updated] (MAPREDUCE-4363) Hadoop 1.X, 2.X and trunk do not build on Fedora 17
[ https://issues.apache.org/jira/browse/MAPREDUCE-4363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-4363:
---
Fix Version/s: (was: 2.0.2-alpha) (was: 1.1.0)

Hadoop 1.X, 2.X and trunk do not build on Fedora 17
---
Key: MAPREDUCE-4363
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4363
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: build, pipes
Affects Versions: 1.0.3, trunk
Reporter: Bruno Mahé
Assignee: Bruno Mahé
Labels: bigtop
Attachments: MAPREDUCE-4363.patch, MAPREDUCE-4363-trunk.patch

I upgraded my machine to the latest Fedora 17 and now Apache Hadoop is failing to build. This seems related to the bump in version of gcc to 4.7.0.
[jira] [Updated] (MAPREDUCE-4451) fairscheduler fail to init job with kerberos authentication configured
[ https://issues.apache.org/jira/browse/MAPREDUCE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-4451:
---
Fix Version/s: (was: 1.1.0)

fairscheduler fail to init job with kerberos authentication configured
--
Key: MAPREDUCE-4451
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4451
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/fair-share
Affects Versions: 1.0.3
Reporter: Erik.fang
Attachments: MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch

Using FairScheduler in Hadoop 1.0.3 with kerberos authentication configured. Job initialization fails:
{code}
2012-07-17 15:15:09,220 ERROR org.apache.hadoop.mapred.JobTracker: Job initialization failed:
java.io.IOException: Call to /192.168.7.80:8020 failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1129)
    at org.apache.hadoop.ipc.Client.call(Client.java:1097)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at $Proxy7.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:125)
    at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:329)
    at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:294)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
    at org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3558)
    at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
    at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3911)
    at org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:301)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:543)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:488)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:590)
    at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:187)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1228)
    at org.apache.hadoop.ipc.Client.call(Client.java:1072)
    ... 20 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:134)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:385)
    at org.apache.hadoop.ipc.Client$Connection.access$1200(Client.java:187)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:583)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:580)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:579)
    ... 23 more
[jira] [Updated] (MAPREDUCE-4473) tasktracker rank on machines.jsp?type=active
[ https://issues.apache.org/jira/browse/MAPREDUCE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated MAPREDUCE-4473:
---
Target Version/s: 1.0.3, 1.0.2, 1.0.1, 1.0.0 (was: 1.0.0, 1.0.1, 1.0.2, 1.0.3)
Fix Version/s: (was: 1.1.1) (was: 1.1.0) (was: 0.24.0)

tasktracker rank on machines.jsp?type=active

Key: MAPREDUCE-4473
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4473
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: tasktracker
Affects Versions: 0.20.2, 0.21.0, 0.22.0, 0.23.0, 0.23.1, 1.0.0, 1.0.1, 1.0.2, 1.0.3
Reporter: jian fan
Priority: Minor
Labels: tasktracker
Attachments: MAPREDUCE-4473.patch

Sometimes we need a simple way to judge which tasktracker is down from the machines.jsp?type=active page.