[jira] [Updated] (MAPREDUCE-3789) CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments
[ https://issues.apache.org/jira/browse/MAPREDUCE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-3789:
---
Attachment: MAPREDUCE-3789.patch

Alejandro - Yes, I ran the steps to reproduce on a live cluster as well. With the fix in place the job with the low slot requirement runs, while without it the slots are soaked up illogically by the other high-memory one. Updated patch with your suggested refactors.

CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments
---
Key: MAPREDUCE-3789
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3789
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/capacity-sched, scheduler
Affects Versions: 1.1.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Critical
Attachments: MAPREDUCE-3789.patch, MAPREDUCE-3789.patch, MAPREDUCE-3789.patch

Briefly, to reproduce:
* Run JT with CapacityTaskScheduler [Say, Cluster max map = 8G, Cluster map = 2G]
* Run two TTs but with varied capacity, say, one with 4 map slots, another with 3 map slots.
* Run a job with two tasks, each demanding mem worth 4 slots at least (Map mem = 7G or so).
* Job will begin running on TT #1, but will also end up reserving the 3 slots on TT #2 because it does not check for the maximum limit of slots when reserving (as it goes greedy, and hopes to gain more slots in future).
* Other jobs that could've run on the TT #2 over 3 slots are thereby blocked out due to this illogical reservation.

I've not yet tested MR2 for this so feel free to weigh in if it affects MR2 as well. For MR1, I've attached a test case initially to indicate this. A fix that checks reservations vs. max slots, to follow.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
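The reproduction steps above come down to slot arithmetic: a 7G map task on 2G slots needs 4 slots, which a 3-slot tracker can never provide. The following is a minimal sketch of that check, not the attached patch; the class and method names are hypothetical and do not come from the CapacityTaskScheduler source.

```java
// Illustrative sketch only: names are hypothetical and are not taken from the
// CapacityTaskScheduler source or from MAPREDUCE-3789.patch.
final class ReservationSketch {
    private ReservationSketch() {}

    /** Slots one task occupies: its memory demand over the per-slot memory, rounded up. */
    static int slotsPerTask(long taskMemoryMb, long slotMemoryMb) {
        if (taskMemoryMb <= 0 || slotMemoryMb <= 0) {
            throw new IllegalArgumentException("memory sizes must be positive");
        }
        return (int) ((taskMemoryMb + slotMemoryMb - 1) / slotMemoryMb);
    }

    /** A reservation only makes sense if the tracker could ever hold the task. */
    static boolean worthReserving(int slotsNeeded, int trackerMaxSlots) {
        return slotsNeeded <= trackerMaxSlots;
    }

    public static void main(String[] args) {
        int needed = slotsPerTask(7 * 1024, 2 * 1024); // 7G task, 2G per slot
        System.out.println("slots per task: " + needed);
        System.out.println("reserve on TT #1 (4 slots): " + worthReserving(needed, 4));
        System.out.println("reserve on TT #2 (3 slots): " + worthReserving(needed, 3));
    }
}
```

With the numbers from the report, the task needs 4 slots, so a reservation on the 4-slot tracker is reasonable while one on the 3-slot tracker can never be satisfied and only blocks other jobs.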
[jira] [Commented] (MAPREDUCE-3789) CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments
[ https://issues.apache.org/jira/browse/MAPREDUCE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205301#comment-13205301 ] Hadoop QA commented on MAPREDUCE-3789: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514086/MAPREDUCE-3789.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1833//console This message is automatically generated. CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments --- Key: MAPREDUCE-3789 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3789 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, scheduler Affects Versions: 1.1.0 Reporter: Harsh J Assignee: Harsh J Priority: Critical Attachments: MAPREDUCE-3789.patch, MAPREDUCE-3789.patch, MAPREDUCE-3789.patch Briefly, to reproduce: * Run JT with CapacityTaskScheduler [Say, Cluster max map = 8G, Cluster map = 2G] * Run two TTs but with varied capacity, say, one with 4 map slots, another with 3 map slots. * Run a job with two tasks, each demanding mem worth 4 slots at least (Map mem = 7G or so). * Job will begin running on TT #1, but will also end up reserving the 3 slots on TT #2 because it does not check for the maximum limit of slots when reserving (as it goes greedy, and hopes to gain more slots in future). * Other jobs that could've run on the TT #2 over 3 slots are thereby blocked out due to this illogical reservation. I've not yet tested MR2 for this so feel free to weigh in if it affects MR2 as well. For MR1, I've attached a test case initially to indicate this. A fix that checks reservations vs. max slots, to follow. -- This message is automatically generated by JIRA.
[jira] [Resolved] (MAPREDUCE-3620) GrimdMix Stats at the end of GridMix are not reported correctly
[ https://issues.apache.org/jira/browse/MAPREDUCE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi resolved MAPREDUCE-3620. - Resolution: Duplicate Amar's patch for MR3787 incorporates the fix for this issue also. GrimdMix Stats at the end of GridMix are not reported correctly --- Key: MAPREDUCE-3620 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3620 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/gridmix Affects Versions: 0.23.0 Reporter: Ravi Prakash Assignee: Amar Kamat Courtesy [~vinaythota] {quote} Job trace contains 1205 jobs and Gridmix start processing 1200 jobs after processing. However, after completion of gridmix run, execution summary details, it showed 1196 jobs are processed and remaining 4 jobs are missing. One log shows 1196 jobs processed and another log shows 1200 jobs are processed. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205325#comment-13205325 ] Sharad Agarwal commented on MAPREDUCE-3846: --- Should this be marked as a duplicate of MAPREDUCE-3802? It is exactly the same behaviour: the AM hanging/failing in the third generation. Restarted+Recovered AM hangs in some corner cases - Key: MAPREDUCE-3846 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli [~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-3848) RM to issue slots on a specific machine to users with admin rights
RM to issue slots on a specific machine to users with admin rights -- Key: MAPREDUCE-3848 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3848 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv2 Affects Versions: 0.24.0 Reporter: Steve Loughran Priority: Minor The RM offers slots closest to the hosts that the AM asks for, based on which machine nearby has space. If you are using YARN to deploy admin-like applications (e.g. connectivity tests), you really do need to deploy on a specific machine, even if that machine has no free slots. It would be useful to have an option to say "always allocate on this machine if it is live", and give access to that machine to admin users, even if there are no free slots on the server for normal jobs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205465#comment-13205465 ] Karam Singh commented on MAPREDUCE-3846: I faced this issue consistently whenever I killed the AM after all maps had finished and only reduces were running. Restarted+Recovered AM hangs in some corner cases - Key: MAPREDUCE-3846 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli [~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3680) FifoScheduler web service rest API can print out invalid JSON
[ https://issues.apache.org/jira/browse/MAPREDUCE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205477#comment-13205477 ] Hudson commented on MAPREDUCE-3680: --- Integrated in Hadoop-Hdfs-trunk-Commit #1783 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1783/]) MAPREDUCE-3680. FifoScheduler web service rest API can print out invalid JSON. (B Anil Kumar via tgraves) tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242790 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java FifoScheduler web service rest API can print out invalid JSON - Key: MAPREDUCE-3680 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3680 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Attachments: MAPREDUCE-3680-1.patch, MAPREDUCE-3680.patch running a GET on the scheduler web services rest api (RM:port/ws/cluster/scheduler) with the FifoScheduler configured with no nodemanagers up yet and it prints out invalid json of NaN for the used Capacity: {"scheduler":{"schedulerInfo":{"type":"fifoScheduler","capacity":1.0,"usedCapacity":NaN,"qstate":"RUNNING","minQueueMemoryCapacity":1024,"maxQueueMemoryCapacity":10240,"numNodes":0,"usedNodeCapacity":0,"availNodeCapacity":0,"totalNodeCapacity":0,"numContainers":0}}} -- This message is automatically generated by
JIRA.
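A NaN in the usedCapacity field is what floating-point division yields for 0/0, which is a plausible outcome when no NodeManagers have registered and total capacity is zero. That cause is an assumption here, and the sketch below is not the committed FifoScheduler change; the class and method names are hypothetical. It only shows the general guard that keeps such a ratio serializable as valid JSON.

```java
// Illustrative sketch only: names are hypothetical, and whether FifoScheduler
// computed usedCapacity this way is an assumption, not taken from the patch.
final class UsedCapacitySketch {
    private UsedCapacitySketch() {}

    /** Unguarded ratio: evaluates to NaN when both values are zero. */
    static float naive(long usedMemoryMb, long clusterMemoryMb) {
        return (float) usedMemoryMb / clusterMemoryMb;
    }

    /** Guarded ratio: reports 0 while the cluster has no capacity at all. */
    static float guarded(long usedMemoryMb, long clusterMemoryMb) {
        if (clusterMemoryMb <= 0) {
            return 0.0f;
        }
        return (float) usedMemoryMb / clusterMemoryMb;
    }

    public static void main(String[] args) {
        System.out.println("naive(0, 0) is NaN: " + Float.isNaN(naive(0, 0)));
        System.out.println("guarded(0, 0) = " + guarded(0, 0));
        System.out.println("guarded(512, 1024) = " + guarded(512, 1024));
    }
}
```

Returning 0 for an empty cluster is one possible convention; omitting the field or reporting null would also produce valid JSON.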
[jira] [Commented] (MAPREDUCE-3680) FifoScheduler web service rest API can print out invalid JSON
[ https://issues.apache.org/jira/browse/MAPREDUCE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205479#comment-13205479 ] Hudson commented on MAPREDUCE-3680: --- Integrated in Hadoop-Common-trunk-Commit #1708 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1708/]) MAPREDUCE-3680. FifoScheduler web service rest API can print out invalid JSON. (B Anil Kumar via tgraves) tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242790 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java FifoScheduler web service rest API can print out invalid JSON - Key: MAPREDUCE-3680 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3680 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Fix For: 0.23.2 Attachments: MAPREDUCE-3680-1.patch, MAPREDUCE-3680.patch running a GET on the scheduler web services rest api (RM:port/ws/cluster/scheduler) with the FifoScheduler configured with no nodemanagers up yet and it prints out invalid json of NaN for the used Capacity: {"scheduler":{"schedulerInfo":{"type":"fifoScheduler","capacity":1.0,"usedCapacity":NaN,"qstate":"RUNNING","minQueueMemoryCapacity":1024,"maxQueueMemoryCapacity":10240,"numNodes":0,"usedNodeCapacity":0,"availNodeCapacity":0,"totalNodeCapacity":0,"numContainers":0}}} -- This message is
automatically generated by JIRA.
[jira] [Updated] (MAPREDUCE-3680) FifoScheduler web service rest API can print out invalid JSON
[ https://issues.apache.org/jira/browse/MAPREDUCE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3680: - Resolution: Fixed Fix Version/s: 0.23.2 Status: Resolved (was: Patch Available) Thanks B Anil Kumar! I've committed this to trunk and branch 0.23. FifoScheduler web service rest API can print out invalid JSON - Key: MAPREDUCE-3680 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3680 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Fix For: 0.23.2 Attachments: MAPREDUCE-3680-1.patch, MAPREDUCE-3680.patch running a GET on the scheduler web services rest api (RM:port/ws/cluster/scheduler) with the FifoScheduler configured with no nodemanagers up yet and it prints out invalid json of NaN for the used Capacity: {"scheduler":{"schedulerInfo":{"type":"fifoScheduler","capacity":1.0,"usedCapacity":NaN,"qstate":"RUNNING","minQueueMemoryCapacity":1024,"maxQueueMemoryCapacity":10240,"numNodes":0,"usedNodeCapacity":0,"availNodeCapacity":0,"totalNodeCapacity":0,"numContainers":0}}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3680) FifoScheduler web service rest API can print out invalid JSON
[ https://issues.apache.org/jira/browse/MAPREDUCE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205481#comment-13205481 ] Hudson commented on MAPREDUCE-3680: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1719 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1719/]) MAPREDUCE-3680. FifoScheduler web service rest API can print out invalid JSON. (B Anil Kumar via tgraves) tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242790 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java FifoScheduler web service rest API can print out invalid JSON - Key: MAPREDUCE-3680 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3680 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Fix For: 0.23.2 Attachments: MAPREDUCE-3680-1.patch, MAPREDUCE-3680.patch running a GET on the scheduler web services rest api (RM:port/ws/cluster/scheduler) with the FifoScheduler configured with no nodemanagers up yet and it prints out invalid json of NaN for the used Capacity: {"scheduler":{"schedulerInfo":{"type":"fifoScheduler","capacity":1.0,"usedCapacity":NaN,"qstate":"RUNNING","minQueueMemoryCapacity":1024,"maxQueueMemoryCapacity":10240,"numNodes":0,"usedNodeCapacity":0,"availNodeCapacity":0,"totalNodeCapacity":0,"numContainers":0}}} -- This message is
automatically generated by JIRA.
[jira] [Commented] (MAPREDUCE-3680) FifoScheduler web service rest API can print out invalid JSON
[ https://issues.apache.org/jira/browse/MAPREDUCE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205483#comment-13205483 ] Hudson commented on MAPREDUCE-3680: --- Integrated in Hadoop-Hdfs-0.23-Commit #523 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/523/]) merge -r 1242789:1242790 from trunk. FIXES: MAPREDUCE-3680 tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242792 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java FifoScheduler web service rest API can print out invalid JSON - Key: MAPREDUCE-3680 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3680 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Fix For: 0.23.2 Attachments: MAPREDUCE-3680-1.patch, MAPREDUCE-3680.patch running a GET on the scheduler web services rest api (RM:port/ws/cluster/scheduler) with the FifoScheduler configured with no nodemanagers up yet and it prints out invalid json of NaN for the used Capacity: {"scheduler":{"schedulerInfo":{"type":"fifoScheduler","capacity":1.0,"usedCapacity":NaN,"qstate":"RUNNING","minQueueMemoryCapacity":1024,"maxQueueMemoryCapacity":10240,"numNodes":0,"usedNodeCapacity":0,"availNodeCapacity":0,"totalNodeCapacity":0,"numContainers":0}}} -- This message is
automatically generated by JIRA.
[jira] [Commented] (MAPREDUCE-3680) FifoScheduler web service rest API can print out invalid JSON
[ https://issues.apache.org/jira/browse/MAPREDUCE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205485#comment-13205485 ] Hudson commented on MAPREDUCE-3680: --- Integrated in Hadoop-Common-0.23-Commit #534 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/534/]) merge -r 1242789:1242790 from trunk. FIXES: MAPREDUCE-3680 tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242792 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java FifoScheduler web service rest API can print out invalid JSON - Key: MAPREDUCE-3680 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3680 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Fix For: 0.23.2 Attachments: MAPREDUCE-3680-1.patch, MAPREDUCE-3680.patch running a GET on the scheduler web services rest api (RM:port/ws/cluster/scheduler) with the FifoScheduler configured with no nodemanagers up yet and it prints out invalid json of NaN for the used Capacity: {"scheduler":{"schedulerInfo":{"type":"fifoScheduler","capacity":1.0,"usedCapacity":NaN,"qstate":"RUNNING","minQueueMemoryCapacity":1024,"maxQueueMemoryCapacity":10240,"numNodes":0,"usedNodeCapacity":0,"availNodeCapacity":0,"totalNodeCapacity":0,"numContainers":0}}} -- This message
is automatically generated by JIRA.
[jira] [Commented] (MAPREDUCE-3680) FifoScheduler web service rest API can print out invalid JSON
[ https://issues.apache.org/jira/browse/MAPREDUCE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205486#comment-13205486 ] Hudson commented on MAPREDUCE-3680: --- Integrated in Hadoop-Mapreduce-0.23-Commit #538 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/538/]) merge -r 1242789:1242790 from trunk. FIXES: MAPREDUCE-3680 tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242792 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java FifoScheduler web service rest API can print out invalid JSON - Key: MAPREDUCE-3680 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3680 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0 Reporter: Thomas Graves Fix For: 0.23.2 Attachments: MAPREDUCE-3680-1.patch, MAPREDUCE-3680.patch running a GET on the scheduler web services rest api (RM:port/ws/cluster/scheduler) with the FifoScheduler configured with no nodemanagers up yet and it prints out invalid json of NaN for the used Capacity: {"scheduler":{"schedulerInfo":{"type":"fifoScheduler","capacity":1.0,"usedCapacity":NaN,"qstate":"RUNNING","minQueueMemoryCapacity":1024,"maxQueueMemoryCapacity":10240,"numNodes":0,"usedNodeCapacity":0,"availNodeCapacity":0,"totalNodeCapacity":0,"numContainers":0}}} -- This
message is automatically generated by JIRA.
[jira] [Updated] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karam Singh updated MAPREDUCE-3846: --- Priority: Critical (was: Major) Restarted+Recovered AM hangs in some corner cases - Key: MAPREDUCE-3846 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Priority: Critical [~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205499#comment-13205499 ] Karam Singh commented on MAPREDUCE-3846: It also appeared for me when I killed the AM at 150 secs, when more than 13000 out of 16800 maps had run. Marking it critical. Restarted+Recovered AM hangs in some corner cases - Key: MAPREDUCE-3846 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli [~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205518#comment-13205518 ]

Daryn Sharp commented on MAPREDUCE-3825:

I'm open to alternatives, but performing the elimination of dups is actually pretty simple:

{code}
// requires java.util.Set and java.util.HashSet
static void obtainTokensForNamenodesInternal(Credentials credentials,
    Path[] ps, Configuration conf) throws IOException {
  // --- start new code ---
  // use 2 passes to avoid redundant calls to the same filesystems
  // start by getting unique set of filesystems for all paths
  Set<FileSystem> pathFsSet = new HashSet<FileSystem>();
  for (Path p : ps) {
    pathFsSet.add(p.getFileSystem(conf));
  }
  // get the unique set of leaf filesystems
  Set<FileSystem> tokenFsSet = new HashSet<FileSystem>();
  for (FileSystem fs : pathFsSet) {
    tokenFsSet.addAll(fs.getFileSystems());
  }
  // --- end new code ---
  // get all the tokens from the now flattened list of leaf filesystems
  for (FileSystem fs : tokenFsSet) {
    obtainTokensForNamenodesPrivate(fs, credentials, conf);
  }
}
{code}

If many files are in the same filesystem, then a lot of unnecessary processing occurs, esp. in the case of viewfs.

I may be misunderstanding this variation, but the acquisition of tokens via recursive calls will require more changes that may break non-hadoop distributed filesystems. I think it will require code duplication of the default {{getDelegationTokens(renewer, creds)}}, or a new api that overrides of this method can use to avoid getting dups. The proposed default implementation of {{FileSystem#getDelegations(renewer, creds)}} simply iterates {{this.getFileSystems()}} too. I'll write something up and then we can discuss a little more.
Need generalized multi-token filesystem support --- Key: MAPREDUCE-3825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3825.patch This is the counterpart to HADOOP-7967. The token cache currently tries to assume a filesystem's token service key. The assumption generally worked while there was a one to one mapping of filesystem to token. With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (ie. for viewfs) that will never exist (because it really gets the mounted fs tokens). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (MAPREDUCE-2257) distcp can copy blocks in parallel
[ https://issues.apache.org/jira/browse/MAPREDUCE-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan reassigned MAPREDUCE-2257: --- Assignee: Mithun Radhakrishnan distcp can copy blocks in parallel -- Key: MAPREDUCE-2257 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2257 Project: Hadoop Map/Reduce Issue Type: Improvement Components: distcp Affects Versions: 0.21.0 Reporter: dhruba borthakur Assignee: Mithun Radhakrishnan Attachments: MAPREDUCE-2257.patch The minimum unit of work for a distcp task is a file. We have files that are greater than 1 TB with a block size of 1 GB. If we use distcp to copy these files, the tasks either take a very long time or eventually fail. A better way for distcp would be to copy all the source blocks in parallel, and then stitch the blocks back into files at the destination via the HDFS Concat API (HDFS-222). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
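The proposal in MAPREDUCE-2257 implies dividing each large file into block-sized byte ranges that separate tasks can copy independently. A minimal sketch of that planning step follows; it is not the attached patch, the class and method names are hypothetical, and the actual copying and the HDFS concat call are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names are hypothetical and are not taken from
// distcp or from MAPREDUCE-2257.patch. It plans the ranges, nothing more.
final class BlockRangeSketch {
    private BlockRangeSketch() {}

    /** Returns {offset, length} pairs covering [0, fileLength) in blockSize chunks. */
    static List<long[]> split(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and blockSize > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // The last range may be shorter than a full block.
            ranges.add(new long[] {offset, Math.min(blockSize, fileLength - offset)});
        }
        return ranges;
    }

    public static void main(String[] args) {
        long oneTb = 1L << 40;
        long oneGb = 1L << 30;
        System.out.println("ranges for a 1 TB file with 1 GB blocks: " + split(oneTb, oneGb).size());
    }
}
```

Each range could then become one unit of work, so a 1 TB file with 1 GB blocks yields 1024 independent copies instead of a single long-running task.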
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anupam Seth updated MAPREDUCE-3843: --- Attachment: MAPREDUCE-3843.patch Job summary log file found missing on the RM host - Key: MAPREDUCE-3843 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Anupam Seth Assignee: Anupam Seth Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3843.patch This bug was found by Phil Su as part of our testing. After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g. 12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0 1) On the RM with older hadoop version where the job summary log does not exist mapredqa 10903 0.0 1.2 1424404 210240 ?
Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/ home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop .root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/nat ive -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 2) On the RM with older hadoop version where the job summary log exists mapredqa 24851 0.0 0.5 1463280 90516 ? Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA 
-Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -classpath /home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/*:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/java/jdk64/current/lib/tools.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/* org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 1) On the RM with older hadoop version where the job summary log does not exist jobhistory ps shows using the option: -Dmapred.jobsummary.logger=INFO,console 2) On the RM with older hadoop version where the job summary log exists jobhistory ps shows using the option: -Dmapred.jobsummary.logger=INFO,JSA -Dmapred.jobsummary.logger=INFO,JSA -- This message is automatically
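The comparison made in the description (which `-Dmapred.jobsummary.logger` value each JobHistoryServer process was started with) can be automated along these lines. This is a minimal sketch, not part of the attached patch: the helper names `effective_property` and `summary_appender` are hypothetical, and it assumes that when the same `-D` property is repeated on a JVM command line the last occurrence takes effect.

```python
import shlex


def effective_property(cmdline, name):
    """Return the value of the last -D<name>=<value> argument, or None if absent."""
    prefix = "-D" + name + "="
    value = None
    for arg in shlex.split(cmdline):
        if arg.startswith(prefix):
            # Assumption: a later occurrence overrides an earlier one.
            value = arg[len(prefix):]
    return value


def summary_appender(cmdline):
    """Return the appender part of mapred.jobsummary.logger, e.g. 'JSA' or 'console'."""
    setting = effective_property(cmdline, "mapred.jobsummary.logger")
    if setting is None:
        return None
    parts = setting.split(",")
    return parts[1] if len(parts) > 1 else None
```

Fed the command column of the two `ps` entries above, `summary_appender` would return `console` for the host where the summary file is missing and `JSA` for the host where it exists, which is the difference the report points at.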
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anupam Seth updated MAPREDUCE-3843:
    Status: Patch Available (was: Open)

Unit tests not applicable as simple script change. Have manually tested on a single node cluster.

Job summary log file found missing on the RM host
    Key: MAPREDUCE-3843
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: jobhistoryserver, mrv2
    Affects Versions: 0.23.0
    Reporter: Anupam Seth
    Assignee: Anupam Seth
    Priority: Critical
    Fix For: 0.23.1
    Attachments: MAPREDUCE-3843.patch
[jira] [Commented] (MAPREDUCE-2257) distcp can copy blocks in parallel
[ https://issues.apache.org/jira/browse/MAPREDUCE-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205551#comment-13205551 ]

Mahadev konar commented on MAPREDUCE-2257:

Thanks for taking this up Mithun!

distcp can copy blocks in parallel
    Key: MAPREDUCE-2257
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-2257
    Project: Hadoop Map/Reduce
    Issue Type: Improvement
    Components: distcp
    Affects Versions: 0.21.0
    Reporter: dhruba borthakur
    Assignee: Mithun Radhakrishnan
    Attachments: MAPREDUCE-2257.patch
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205557#comment-13205557 ]

Hadoop QA commented on MAPREDUCE-3843:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12514118/MAPREDUCE-3843.patch
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1834//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1834//console

This message is automatically generated.

Job summary log file found missing on the RM host
    Key: MAPREDUCE-3843
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: jobhistoryserver, mrv2
    Affects Versions: 0.23.0
    Reporter: Anupam Seth
    Assignee: Anupam Seth
    Priority: Critical
    Fix For: 0.23.1
    Attachments: MAPREDUCE-3843.patch
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anupam Seth updated MAPREDUCE-3843:
    Status: Open (was: Patch Available)

Job summary log file found missing on the RM host
    Key: MAPREDUCE-3843
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: jobhistoryserver, mrv2
    Affects Versions: 0.23.0
    Reporter: Anupam Seth
    Assignee: Anupam Seth
    Priority: Critical
    Fix For: 0.23.1
    Attachments: MAPREDUCE-3843.patch
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anupam Seth updated MAPREDUCE-3843:
    Attachment: MAPREDUCE-3843.patch

Job summary log file found missing on the RM host
    Key: MAPREDUCE-3843
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: jobhistoryserver, mrv2
    Affects Versions: 0.23.0
    Reporter: Anupam Seth
    Assignee: Anupam Seth
    Priority: Critical
    Fix For: 0.23.1
    Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anupam Seth updated MAPREDUCE-3843:
    Attachment: MAPREDUCE-3843.patch

Fixing doc stuff

Job summary log file found missing on the RM host
    Key: MAPREDUCE-3843
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: jobhistoryserver, mrv2
    Affects Versions: 0.23.0
    Reporter: Anupam Seth
    Assignee: Anupam Seth
    Priority: Critical
    Fix For: 0.23.1
    Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anupam Seth updated MAPREDUCE-3843:
    Status: Patch Available (was: Open)

Job summary log file found missing on the RM host
    Key: MAPREDUCE-3843
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: jobhistoryserver, mrv2
    Affects Versions: 0.23.0
    Reporter: Anupam Seth
    Assignee: Anupam Seth
    Priority: Critical
    Fix For: 0.23.1
    Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205587#comment-13205587 ]

Hadoop QA commented on MAPREDUCE-3843:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514126/MAPREDUCE-3843.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1835//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1835//console

This message is automatically generated.

Job summary log file found missing on the RM host
-
Key: MAPREDUCE-3843
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver, mrv2
Affects Versions: 0.23.0
Reporter: Anupam Seth
Assignee: Anupam Seth
Priority: Critical
Fix For: 0.23.1
Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch

This bug was found by Phil Su as part of our testing. After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host.
The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g.
12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0
1) On the RM with older hadoop version where the job summary log does not exist
mapredqa 10903 0.0 1.2 1424404 210240 ? Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
2) On the RM with older hadoop version where the job summary log exists
mapredqa 24851 0.0 0.5 1463280 90516 ?
Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA
[jira] [Updated] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated MAPREDUCE-3825: --- Attachment: TokenCache.pdf Attach proposed design doc. Need generalized multi-token filesystem support --- Key: MAPREDUCE-3825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3825.patch, TokenCache.pdf This is the counterpart to HADOOP-7967. The token cache currently tries to assume a filesystem's token service key. The assumption generally worked while there was a one to one mapping of filesystem to token. With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (ie. for viewfs) that will never exist (because it really gets the mounted fs tokens). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205623#comment-13205623 ]

Hadoop QA commented on MAPREDUCE-3825:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514135/TokenCache.pdf against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1836//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-3844) Problem in setting the childTmpDir in MapReduceChildJVM
[ https://issues.apache.org/jira/browse/MAPREDUCE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205628#comment-13205628 ]

Eli Collins commented on MAPREDUCE-3844:

Looks like this is a dupe of MAPREDUCE-3716?

Problem in setting the childTmpDir in MapReduceChildJVM
---
Key: MAPREDUCE-3844
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3844
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0, 0.23.1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
Priority: Blocker
Attachments: MAPREDUCE-3844.patch, MAPREDUCE-3844_rev2.patch

We have seen this issue during a Hive test, where Hive tries to create a temp file using File.createTempFile(..) and it throws:
{code}
Exception in thread "main" java.io.IOException: No such file or directory
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.checkAndCreate(File.java:1704)
    at java.io.File.createTempFile(File.java:1792)
    at java.io.File.createTempFile(File.java:1828)
    at Test.main(Test.java:13)
{code}
This happens because it literally sees $PWD/tmp as the temp directory path. $PWD needs to be evaluated before being used in setting the property java.io.tmpdir in MapReduceChildJVM.java.
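The failure mode in the description can be reproduced outside Hadoop. The following is only a sketch under stated assumptions: the class TmpDirDemo and its helpers (expandPwd, canCreateTempFile, makeWorkDirWithTmp) are hypothetical names invented for illustration and are not taken from MapReduceChildJVM.java or from either attached patch. It shows that File.createTempFile fails when the target directory is the unexpanded string $PWD/tmp and succeeds once $PWD has been replaced with a real directory.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TmpDirDemo {
    // Hypothetical helper: substitutes a literal "$PWD" with a concrete working directory.
    static String expandPwd(String value, String workDir) {
        return value.replace("$PWD", workDir);
    }

    // Returns true if a temp file can be created in dir, false if createTempFile throws IOException.
    static boolean canCreateTempFile(File dir) {
        try {
            File f = File.createTempFile("demo", ".tmp", dir);
            f.delete();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Creates a scratch working directory containing a "tmp" subdirectory and returns its path.
    static String makeWorkDirWithTmp() {
        try {
            Path work = Files.createTempDirectory("work");
            Files.createDirectory(work.resolve("tmp"));
            return work.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "$PWD/tmp"; // the value as it would look if $PWD is never evaluated
        String work = makeWorkDirWithTmp();
        System.out.println("literal:  " + canCreateTempFile(new File(raw)));
        System.out.println("expanded: " + canCreateTempFile(new File(expandPwd(raw, work))));
    }
}
```

Whether the expansion should happen in the shell that launches the child JVM or in Java code is a design choice this sketch does not settle.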
[jira] [Created] (MAPREDUCE-3849) Change TokenCache's reading of the binary token file
Change TokenCache's reading of the binary token file

Key: MAPREDUCE-3849
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3849
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: security
Affects Versions: 0.23.1, 0.24.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp

When obtaining the tokens for a {{FileSystem}}, the {{TokenCache}} will read the binary token file if a token is not already in the {{Credentials}}. However, it will overwrite any existing tokens in the {{Credentials}} with the contents of the binary token file if a single token is missing. This may cause new tokens to be replaced with invalid/cancelled tokens from the binary file. The new tokens will not be canceled, and thus leak in the namenode until they expire. The binary tokens should be merged with, but not replace, existing tokens in the {{Credentials}}.

The code that reads the binary token file is prefaced with:
{code}
//TODO: Need to come up with a better place to put
//this block of code to do with reading the file
{code}
Also, the loading of the binary token file is the only reason that the {{TokenCache}} has to use {{getCanonicalService}}. If this linkage can be broken, then the 1-to-1 filesystem to token service coupling may be removed, and use of {{getCanonicalService}} can be removed in a subsequent jira.
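The difference between overwriting and merging can be sketched with plain maps. This is a minimal illustration, not the TokenCache code: the class TokenMergeDemo, its methods, and the use of Map<String, String> as a stand-in for Credentials (service name to token) are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class TokenMergeDemo {
    // Overwrite semantics: every entry read from the binary file replaces what is already held.
    static Map<String, String> overwrite(Map<String, String> creds, Map<String, String> binary) {
        Map<String, String> out = new HashMap<>(creds);
        out.putAll(binary);
        return out;
    }

    // Merge semantics: only tokens that are missing are taken from the binary file.
    static Map<String, String> merge(Map<String, String> creds, Map<String, String> binary) {
        Map<String, String> out = new HashMap<>(creds);
        for (Map.Entry<String, String> e : binary.entrySet()) {
            out.putIfAbsent(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> creds = new HashMap<>();
        creds.put("nn1", "fresh-token");          // newly obtained token
        Map<String, String> binary = new HashMap<>();
        binary.put("nn1", "stale-token");         // possibly cancelled token from the file
        binary.put("nn2", "file-token");          // token that was genuinely missing
        System.out.println("overwrite: " + overwrite(creds, binary).get("nn1"));
        System.out.println("merge:     " + merge(creds, binary).get("nn1"));
    }
}
```

With merge semantics the freshly obtained token for nn1 is kept and only the missing nn2 token comes from the file, which is the behaviour the description asks for.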
[jira] [Updated] (MAPREDUCE-3849) Change TokenCache's reading of the binary token file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated MAPREDUCE-3849:
---
Target Version/s: 0.23.0, 0.24.0 (was: 0.24.0, 0.23.0)
Issue Type: Improvement (was: Bug)
[jira] [Updated] (MAPREDUCE-3849) Change TokenCache's reading of the binary token file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated MAPREDUCE-3849:
---
Attachment: MAPREDUCE-3849.patch
[jira] [Created] (MAPREDUCE-3850) Avoid redundant calls for tokens in TokenCache
Avoid redundant calls for tokens in TokenCache -- Key: MAPREDUCE-3850 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3850 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The {{TokenCache}} will repeatedly call the same filesystem for tokens. This is inefficient and can easily be changed to only call each filesystem once. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
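One way to call each filesystem only once is to collapse the job's paths to their distinct filesystems before asking for tokens. The sketch below assumes, purely for illustration, that a filesystem can be identified by the scheme and authority of a path URI; the class TokenCallDedupDemo and its methods are hypothetical and do not reflect the attached patch or the actual TokenCache API.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TokenCallDedupDemo {
    // Assumption for this sketch: scheme + authority identifies the filesystem behind a path.
    static String fileSystemOf(String path) {
        URI u = URI.create(path);
        return u.getScheme() + "://" + u.getAuthority();
    }

    // Returns each distinct filesystem once, in first-seen order.
    static List<String> fileSystemsToCall(List<String> paths) {
        Set<String> seen = new LinkedHashSet<>();
        for (String p : paths) {
            seen.add(fileSystemOf(p));
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
            "hdfs://nn1:8020/user/a", "hdfs://nn1:8020/user/b", "hdfs://nn2:8020/data");
        // Three paths, but only two filesystems would be asked for tokens.
        System.out.println(fileSystemsToCall(paths));
    }
}
```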
[jira] [Updated] (MAPREDUCE-3849) Change TokenCache's reading of the binary token file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated MAPREDUCE-3849:
---
Target Version/s: 0.23.0, 0.24.0 (was: 0.24.0, 0.23.0)
Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-3844) Problem in setting the childTmpDir in MapReduceChildJVM
[ https://issues.apache.org/jira/browse/MAPREDUCE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205680#comment-13205680 ]

Ahmed Radwan commented on MAPREDUCE-3844:
-
@Eli, yes it seems like that. We are picking MAPREDUCE-3716 patch to see if it resolves this Hive issue. I'll update the ticket after testing.
[jira] [Commented] (MAPREDUCE-3770) [Rumen] Zombie.getJobConf() results into NPE
[ https://issues.apache.org/jira/browse/MAPREDUCE-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205684#comment-13205684 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3770:

This was meant to be merged into 0.23.1 but wasn't. It was only on trunk and branch-0.23. I just merged it into 0.23.1.

[Rumen] Zombie.getJobConf() results into NPE

Key: MAPREDUCE-3770
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3770
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: tools/rumen
Affects Versions: 0.23.0, 0.24.0
Reporter: Amar Kamat
Assignee: Amar Kamat
Priority: Critical
Labels: job-name, rumen
Fix For: 0.23.1
Attachments: GridmixJobNameBug-v1.0.patch

The error trace is as follows
{code}
java.lang.NullPointerException
    at java.util.Hashtable.put(Hashtable.java:394)
    at java.util.Properties.setProperty(Properties.java:143)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:623)
    at org.apache.hadoop.mapred.JobConf.setJobName(JobConf.java:1322)
    at org.apache.hadoop.tools.rumen.ZombieJob.getJobConf(ZombieJob.java:139)
    at org.apache.hadoop.mapred.gridmix.DistributedCacheEmulator.updateHDFSDistCacheFilesList(DistributedCacheEmulator.java:315)
    at org.apache.hadoop.mapred.gridmix.DistributedCacheEmulator.buildDistCacheFilesList(DistributedCacheEmulator.java:280)
    at org.apache.hadoop.mapred.gridmix.DistributedCacheEmulator.setupGenerateDistCacheData(DistributedCacheEmulator.java:253)
    at org.apache.hadoop.mapred.gridmix.Gridmix.setupDistCacheEmulation(Gridmix.java:528)
    at org.apache.hadoop.mapred.gridmix.Gridmix.setupEmulation(Gridmix.java:501)
    at org.apache.hadoop.mapred.gridmix.Gridmix.start(Gridmix.java:433)
    at org.apache.hadoop.mapred.gridmix.Gridmix.runJob(Gridmix.java:380)
    at org.apache.hadoop.mapred.gridmix.Gridmix.access$000(Gridmix.java:56)
    at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:313)
    at org.apache.hadoop.mapred.gridmix.Gridmix$1.run(Gridmix.java:311)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.gridmix.Gridmix.run(Gridmix.java:311)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
    at org.apache.hadoop.mapred.gridmix.Gridmix.main(Gridmix.java:606)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:200)
{code}
The bug seems to be in {{ZombieJob#getName()}} where a not-null check for jobName.getValue() is missing.
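The top two frames of the trace show the root cause: java.util.Properties.setProperty rejects a null value. The sketch below reproduces that with the JDK alone and shows one possible guard. It is not the attached patch: the class JobNameGuardDemo, the property key "job.name", and the fallback string are hypothetical choices made for the example.

```java
import java.util.Properties;

class JobNameGuardDemo {
    // Hypothetical fallback used when the trace carries no job name.
    static final String DEFAULT_NAME = "(unknown)";

    // Returns false if storing the name throws NullPointerException, true otherwise.
    static boolean setNameUnguarded(Properties conf, String name) {
        try {
            conf.setProperty("job.name", name); // a null value is rejected by Properties
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    // The not-null check: substitute a default before the value reaches setProperty.
    static String safeName(String name) {
        return name == null ? DEFAULT_NAME : name;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        String nameFromTrace = null; // stands in for a missing job name
        System.out.println("unguarded ok: " + setNameUnguarded(conf, nameFromTrace));
        System.out.println("guarded ok:   " + setNameUnguarded(conf, safeName(nameFromTrace)));
    }
}
```

Whether a missing name should map to a default string or be skipped altogether is left open here.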
[jira] [Commented] (MAPREDUCE-3849) Change TokenCache's reading of the binary token file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205683#comment-13205683 ]

Hadoop QA commented on MAPREDUCE-3849:
--
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514141/MAPREDUCE-3849.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1837//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1837//console

This message is automatically generated.
[jira] [Updated] (MAPREDUCE-3850) Avoid redundant calls for tokens in TokenCache
[ https://issues.apache.org/jira/browse/MAPREDUCE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated MAPREDUCE-3850:
---
Attachment: MAPREDUCE-3850.patch
[jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205701#comment-13205701 ]

Sanjay Radia commented on MAPREDUCE-3825:
-
Daryn, I read your doc. The solution you are proposing (let me call it solution 3) seems to need 3 APIs:
a) FileSystem#getDelegationToken(renewer) - same as Solution 1's method.
b) FileSystem#getDelegationToken(renewer, credentials)
c) FileSystem#getFileSystems() - same as Solution 1's getEmbededFileSystems().

I can't figure out if and how you are using (b) FileSystem#getDelegationToken(renewer, credentials). I am arguing that solution 1 or 2 is sufficient. Your solution seems to be close to solution 1, and I can't figure out how and when you are using API (b).

@Daryn "but performing the elimination of dups is actually pretty simple": *Yes yes yes* - that is what is used in solution 1. My text says "eliminates the duplicate file systems". I am not saying that this code is hard to write - we are in violent agreement.

I like Solution 1 - it has clean APIs. But solution 2 has an advantage - see my comment at [solution2Advantage | https://issues.apache.org/jira/browse/MAPREDUCE-3825?focusedCommentId=13205209&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13205209]

Questions for you:
A) Do you believe your solution 3 is different from solution 1? If so, why do you need to add API (b)?
B) Do you agree or disagree with the advantage of solution 2 that I commented on? I am reluctantly beginning to agree with it after talking to the MR folks, based on the reasons given in my comment.
[jira] [Updated] (MAPREDUCE-3850) Avoid redundant calls for tokens in TokenCache
[ https://issues.apache.org/jira/browse/MAPREDUCE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated MAPREDUCE-3850:
---
Target Version/s: 0.23.1, 0.24.0 (was: 0.24.0, 0.23.1)
Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205712#comment-13205712 ]

Daryn Sharp commented on MAPREDUCE-3825:

I tried to explain in the last paragraph of the doc the alternate patch that I haven't posted, which is what I'd ideally like to see:
* {{TokenCache#obtainTokensForNamenodes}} does *not* use {{getFileSystems()}}, thus it does not flatten the filesystems
* {{TokenCache#obtainTokensForNamenodes}} does *not* use {{getCanonicalServiceName()}}, so it no longer has a cross-dep on {{FileSystem}}
* {{TokenCache#obtainTokensForNamenodes}} should only call {{getDelegationTokens(renewer, creds)}} on each path's filesystem
* {{FileSystem#getFileSystems}} is used internally by {{getDelegationTokens(renewer, creds)}}

The advantage of solution #2 is based on a misunderstanding. Those requirements don't need to be met at all. I was proposing TokenCache do that to reduce the unnecessary/redundant calls.
[jira] [Commented] (MAPREDUCE-3789) CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments
[ https://issues.apache.org/jira/browse/MAPREDUCE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205721#comment-13205721 ] Alejandro Abdelnur commented on MAPREDUCE-3789: --- +1 CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments --- Key: MAPREDUCE-3789 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3789 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, scheduler Affects Versions: 1.1.0 Reporter: Harsh J Assignee: Harsh J Priority: Critical Attachments: MAPREDUCE-3789.patch, MAPREDUCE-3789.patch, MAPREDUCE-3789.patch Briefly, to reproduce: * Run JT with CapacityTaskScheduler [Say, Cluster max map = 8G, Cluster map = 2G] * Run two TTs but with varied capacity, say, one with 4 map slot, another with 3 map slots. * Run a job with two tasks, each demanding mem worth 4 slots at least (Map mem = 7G or so). * Job will begin running on TT #1, but will also end up reserving the 3 slots on TT #2 cause it does not check for the maximum limit of slots when reserving (as it goes greedy, and hopes to gain more slots in future). * Other jobs that could've run on the TT #2 over 3 slots are thereby blocked out due to this illogical reservation. I've not yet tested MR2 for this so feel free to weigh in if it affects MR2 as well. For MR1, I've attached a test case initially to indicate this. A fix that checks reservations vs. max slots, to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
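The reproduction steps above boil down to a single comparison: a reservation only makes sense on a tracker whose total slot count could ever cover the task. The following is a minimal sketch of that check, not the code from the attached patch; the class and method names are hypothetical, and the numbers mirror the scenario in the description (2G map slots, a 7G map task, trackers with 4 and 3 map slots).

```java
public class ReservationCheck {
    /** Slots a task needs: its memory demand divided by the per-slot size, rounded up. */
    public static int slotsPerTask(long taskMemMb, long slotMemMb) {
        return (int) ((taskMemMb + slotMemMb - 1) / slotMemMb);
    }

    /**
     * Hypothetical guard: reserve only when the tracker's maximum slot count
     * covers the demand. A tracker with fewer total slots can never run the task.
     */
    public static boolean shouldReserve(int trackerMaxSlots, int slotsNeeded) {
        return trackerMaxSlots >= slotsNeeded;
    }

    public static void main(String[] args) {
        int needed = slotsPerTask(7 * 1024, 2 * 1024); // 7G task, 2G slots -> 4 slots
        System.out.println("slots needed: " + needed);
        System.out.println("reserve on 4-slot TT: " + shouldReserve(4, needed));
        System.out.println("reserve on 3-slot TT: " + shouldReserve(3, needed));
    }
}
```

With a guard of this shape, the 3-slot tracker stays free for jobs with smaller slot requirements.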
[jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205724#comment-13205724 ] Sanjay Radia commented on MAPREDUCE-3825: - Typo: (a) should be FileSystem#getDelegationToken*S*(renewer) - Same as Solution 1's method. ie Tokens not Token Need generalized multi-token filesystem support --- Key: MAPREDUCE-3825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3825.patch, TokenCache.pdf This is the counterpart to HADOOP-7967. The token cache currently tries to assume a filesystem's token service key. The assumption generally worked while there was a one to one mapping of filesystem to token. With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (ie. for viewfs) that will never exist (because it really gets the mounted fs tokens). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3850) Avoid redundant calls for tokens in TokenCache
[ https://issues.apache.org/jira/browse/MAPREDUCE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205733#comment-13205733 ] Hadoop QA commented on MAPREDUCE-3850: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514148/MAPREDUCE-3850.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1838//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1838//console This message is automatically generated. Avoid redundant calls for tokens in TokenCache -- Key: MAPREDUCE-3850 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3850 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3850.patch The {{TokenCache}} will repeatedly call the same filesystem for tokens. This is inefficient and can easily be changed to only call each filesystem once. -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205748#comment-13205748 ] Thomas Graves commented on MAPREDUCE-3843: -- Anupam, if you don't mind, could you change the rest of the commands in the ClusterSetup to use hadoop-daemon.sh and yarn-daemon.sh appropriately? I tested starting and stopping the job history server. Everything worked, and the log files (both out and log) and the job summary file (mapred-jobsummary.log) were all created properly using the log4j.properties config below. It was started with the command: $HADOOP_COMMON_HOME/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR/ start historyserver. I think we need to do some cleanup of the mr-jobhistory-daemon.sh and other scripts, since right now it's intermixing HADOOP and YARN variables. I think the log4j.properties template conf file in the tree also needs to be updated. But I think that can be done in follow-up jiras.
hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger}
hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log
mapred.jobsummary.logger=INFO,console
log4j.logger.org.apache.hadoop.mapreduce.jobhistory.JobSummary=${mapred.jobsummary.logger}
log4j.additivity.org.apache.hadoop.mapreduce.jobhistory.JobSummary=false
log4j.appender.JSA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.JSA.File=${hadoop.log.dir}/mapred-jobsummary.log
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout
log4j.appender.JSA.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.JSA.DatePattern=.-MM-dd
Job summary log file found missing on the RM host - Key: MAPREDUCE-3843 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Anupam Seth Assignee: Anupam Seth Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch,
MAPREDUCE-3843.patch This bug was found by Phil Su as part of our testing. After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g. 12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0 1) On the RM with older hadoop version where the job summary log does not exist mapredqa 10903 0.0 1.2 1424404 210240 ? Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 2) On the RM with older hadoop version where the job summary log exists mapredqa 24851 0.0 0.5 1463280 90516 ?
Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop
[jira] [Commented] (MAPREDUCE-3849) Change TokenCache's reading of the binary token file
[ https://issues.apache.org/jira/browse/MAPREDUCE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205760#comment-13205760 ] Daryn Sharp commented on MAPREDUCE-3849: Failed test is not related to this patch, and it has been failing since yesterday. Change TokenCache's reading of the binary token file Key: MAPREDUCE-3849 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3849 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3849.patch When obtaining the tokens for a {{FileSystem}}, the {{TokenCache}} will read the binary token file if a token is not already in the {{Credentials}}. However, it will overwrite any existing tokens in the {{Credentials}} with the contents of the binary token file if a single token is missing. This may cause new tokens to be replaced with invalid/cancelled tokens from the binary file. The new tokens will not be canceled, and thus leak in the namenode until they expire. The binary tokens should be merged with, but not replace, existing tokens in the {{Credentials}}. The code that reads the binary token file is prefaced with: {code} //TODO: Need to come up with a better place to put //this block of code to do with reading the file {code} Also, the loading of the binary token file is the only reason that the {{TokenCache}} has to use {{getCanonicalService}}. If this linkage can be broken, then the 1-to-1 filesystem to token service coupling may be removed. And use of {{getCanonicalService}} can be removed in a subsequent jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
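The "merge with, but not replace" behaviour described in this issue can be illustrated with plain maps. This is only a sketch under simplifying assumptions: tokens are modelled as strings keyed by a service name, and the class and method names are hypothetical. It does not use the real {{Credentials}} or {{TokenCache}} classes and is not the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

public class TokenMerge {
    /**
     * Copies tokens read from the binary file into creds, but only for
     * services that have no token yet. Existing (possibly newer) tokens win.
     */
    public static void mergeMissing(Map<String, String> creds,
                                    Map<String, String> fromBinaryFile) {
        for (Map.Entry<String, String> e : fromBinaryFile.entrySet()) {
            creds.putIfAbsent(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, String> creds = new HashMap<>();
        creds.put("nn1:8020", "fresh-token");          // already obtained in this session

        Map<String, String> binary = new HashMap<>();
        binary.put("nn1:8020", "stale-token");         // must not overwrite the fresh one
        binary.put("nn2:8020", "binary-token");        // missing, so it is added

        mergeMissing(creds, binary);
        System.out.println(creds.get("nn1:8020"));
        System.out.println(creds.get("nn2:8020"));
    }
}
```

The overwrite problem in the description corresponds to calling `putAll` here instead of `putIfAbsent`.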
[jira] [Commented] (MAPREDUCE-3850) Avoid redundant calls for tokens in TokenCache
[ https://issues.apache.org/jira/browse/MAPREDUCE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205761#comment-13205761 ] Daryn Sharp commented on MAPREDUCE-3850: Failed test is not related to this patch, and it has been failing since yesterday. Avoid redundant calls for tokens in TokenCache -- Key: MAPREDUCE-3850 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3850 Project: Hadoop Map/Reduce Issue Type: Improvement Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3850.patch The {{TokenCache}} will repeatedly call the same filesystem for tokens. This is inefficient and can easily be changed to only call each filesystem once. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
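One way to picture "only call each filesystem once" is to collapse the job's paths to their distinct filesystems before asking for tokens. The sketch below assumes, purely for illustration, that a filesystem is identified by the scheme and authority of the path URI; the real {{TokenCache}} and {{FileSystem}} logic may identify filesystems differently, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OncePerFileSystem {
    /**
     * Returns "scheme://authority" for each path with duplicates dropped,
     * keeping first-seen order, so each filesystem is visited a single time.
     */
    public static List<String> distinctFileSystems(List<String> paths) {
        Set<String> seen = new LinkedHashSet<>();
        for (String p : paths) {
            URI u = URI.create(p);
            seen.add(u.getScheme() + "://" + u.getAuthority());
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "hdfs://nn1:8020/user/a/input",
                "hdfs://nn1:8020/user/a/output",
                "hdfs://nn2:8020/data");
        // A token request would be issued once per entry of this list.
        for (String fs : distinctFileSystems(paths)) {
            System.out.println("request tokens from " + fs);
        }
    }
}
```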
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anupam Seth updated MAPREDUCE-3843: --- Attachment: MAPREDUCE-3843.patch Thanks Tom for your feedback. I have incorporated your suggestion to change the rest of the start-up / stop cmds in the docs also. Job summary log file found missing on the RM host - Key: MAPREDUCE-3843 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Anupam Seth Assignee: Anupam Seth Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch This bug was found by Phil Su as part of our testing. After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g. 12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0 1) On the RM with older hadoop version where the job summary log does not exist mapredqa 10903 0.0 1.2 1424404 210240 ?
Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 2) On the RM with older hadoop version where the job summary log exists mapredqa 24851 0.0 0.5 1463280 90516 ? Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA
-Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -classpath /home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/*:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/java/jdk64/current/lib/tools.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/* org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 1) On the RM with older hadoop version where the job summary log does not exist jobhistory ps shows using the option: -Dmapred.jobsummary.logger=INFO,console 2) On the
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anupam Seth updated MAPREDUCE-3843: --- Status: Open (was: Patch Available) Job summary log file found missing on the RM host - Key: MAPREDUCE-3843 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Anupam Seth Assignee: Anupam Seth Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch This bug was found by Phil Su as part of our testing. After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g. 12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0 1) On the RM with older hadoop version where the job summary log does not exist mapredqa 10903 0.0 1.2 1424404 210240 ?
Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 2) On the RM with older hadoop version where the job summary log exists mapredqa 24851 0.0 0.5 1463280 90516 ? Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA
-Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -classpath /home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/*:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/java/jdk64/current/lib/tools.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/* org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 1) On the RM with older hadoop version where the job summary log does not exist jobhistory ps shows using the option: -Dmapred.jobsummary.logger=INFO,console 2) On the RM with older hadoop version where the job summary log exists jobhistory ps shows using the option: -Dmapred.jobsummary.logger=INFO,JSA
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anupam Seth updated MAPREDUCE-3843: --- Attachment: MAPREDUCE-3843.patch Job summary log file found missing on the RM host - Key: MAPREDUCE-3843 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Anupam Seth Assignee: Anupam Seth Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch This bug was found by Phil Su as part of our testing. After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g. 12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0 1) On the RM with older hadoop version where the job summary log does not exist mapredqa 10903 0.0 1.2 1424404 210240 ?
Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 2) On the RM with older hadoop version where the job summary log exists mapredqa 24851 0.0 0.5 1463280 90516 ? Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA
-Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -classpath /home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/*:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/java/jdk64/current/lib/tools.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/* org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 1) On the RM with older hadoop version where the job summary log does not exist jobhistory ps shows using the option: -Dmapred.jobsummary.logger=INFO,console 2) On the RM with older hadoop version where the job summary log exists jobhistory ps shows using the option:
[jira] [Created] (MAPREDUCE-3851) Allow more aggressive action on detection of the jetty issue
Allow more aggressive action on detection of the jetty issue Key: MAPREDUCE-3851 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3851 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 1.0.0 Reporter: Kihwal Lee Fix For: 1.1.0, 1.0.1 MAPREDUCE-2529 added a useful failure detection mechanism. In this jira, I propose we add a periodic check inside the TT and a configurable action to self-destruct. Blacklisting helps but is not enough. A hung jetty still accepts connections, and it takes a very long time for clients to fail out. Short jobs are delayed for hours because of this. This feature will be a nice companion to MAPREDUCE-3184. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
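The proposal above (a periodic check plus a configurable action) could take roughly the following shape. This is a hedged sketch, not TaskTracker code: the class name, the probe, the failure threshold, and the action are all hypothetical stand-ins. In a real tracker the probe would exercise the embedded jetty server and the action might shut the process down.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class JettyWatchdog {
    private final BooleanSupplier healthy;   // hypothetical probe, true when jetty responds
    private final int maxFailures;           // consecutive failures tolerated before acting
    private final Runnable onFailure;        // configurable action, e.g. self-destruct
    private int consecutiveFailures = 0;

    public JettyWatchdog(BooleanSupplier healthy, int maxFailures, Runnable onFailure) {
        this.healthy = healthy;
        this.maxFailures = maxFailures;
        this.onFailure = onFailure;
    }

    /** Runs one probe; returns true if the configured action was fired on this call. */
    public synchronized boolean checkOnce() {
        if (healthy.getAsBoolean()) {
            consecutiveFailures = 0;         // any success resets the streak
            return false;
        }
        consecutiveFailures++;
        if (consecutiveFailures >= maxFailures) {
            onFailure.run();
            return true;
        }
        return false;
    }

    /** Schedules checkOnce() at a fixed period; the caller shuts the executor down. */
    public ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(this::checkOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return ses;
    }

    public static void main(String[] args) {
        // Demo with a probe that always fails and an action that only logs.
        JettyWatchdog dog = new JettyWatchdog(() -> false, 3,
                () -> System.out.println("jetty looks hung, taking configured action"));
        for (int i = 0; i < 3; i++) {
            System.out.println("check " + (i + 1) + " fired: " + dog.checkOnce());
        }
    }
}
```

Requiring several consecutive failures before acting is one possible design choice to avoid killing a tracker on a single slow response; the threshold would be part of the configuration.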
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205802#comment-13205802 ] Hadoop QA commented on MAPREDUCE-3843: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514166/MAPREDUCE-3843.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1839//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1839//console This message is automatically generated. Job summary log file found missing on the RM host - Key: MAPREDUCE-3843 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Anupam Seth Assignee: Anupam Seth Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch This bug was found by Phil Su as part of our testing. 
After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g. 12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0 1) On the RM with older hadoop version where the job summary log does not exist mapredqa 10903 0.0 1.2 1424404 210240 ? Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 2) On the RM with older hadoop version where the job summary log exists mapredqa 24851 0.0 0.5 1463280 90516 ?
Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205811#comment-13205811 ]

Hadoop QA commented on MAPREDUCE-3843:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12514166/MAPREDUCE-3843.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1840//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1840//console

This message is automatically generated.

Job summary log file found missing on the RM host
-------------------------------------------------

Key: MAPREDUCE-3843
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver, mrv2
Affects Versions: 0.23.0
Reporter: Anupam Seth
Assignee: Anupam Seth
Priority: Critical
Fix For: 0.23.1
Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch
[jira] [Created] (MAPREDUCE-3852) test TestLinuxResourceCalculatorPlugin failing
test TestLinuxResourceCalculatorPlugin failing
----------------------------------------------

Key: MAPREDUCE-3852
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3852
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: 0.23.2
Reporter: Thomas Graves
Priority: Blocker

tests are failing:
org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin.testParsingProcStatAndCpuFile
org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin.testParsingProcMemFile

https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1831/testReport/junit/org.apache.hadoop.yarn.util/TestLinuxResourceCalculatorPlugin/testParsingProcStatAndCpuFile/

both with similar error:
java.io.FileNotFoundException: /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/target/test-dir/MEMINFO_238849741 (No such file or directory)
[jira] [Commented] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205824#comment-13205824 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3846:
----------------------------------------------------

Sharad, I think MAPREDUCE-3802 is different even though the exception trace is the same.

What is happening here is with the second AM generation itself. For the erring task, there are multiple attempts. One of the attempts doesn't get logged to JobHistory because the TaskAttempt fails before launch itself. Today we log TaskAttempts and set start times only after the real JVM launch (Do you know why? Maybe we can change this?). Because of this, JobHistory knows about, say, attempts 0, 1 and 3. When we replay the completed tasks, the attempts are numbered 0, 1 and 2, which does not match, and so we get the NPE.

Restarted+Recovered AM hangs in some corner cases
-------------------------------------------------

Key: MAPREDUCE-3846
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
Attachments: MAPREDUCE-3846-20120210.txt

[~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while.
[jira] [Updated] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3846:
-----------------------------------------------

Attachment: MAPREDUCE-3846-20120210.txt

If we log all TaskAttempts (even before launch), we may perhaps avoid this, but I am not sure.

So for now, I changed the attempt-number generation during recovery to first use the numbers from the previous generation and then jump past them once all those numbers are exhausted.

I also made sure that attempts are replayed correctly in the order of their original start times; otherwise (as my test revealed), we may be replaying in the wrong order with the wrong times.

The test fails without the patch and passes with it.

Sharad, can you please look at the patch and see if it makes sense? Thanks in advance!

Restarted+Recovered AM hangs in some corner cases
-------------------------------------------------

Key: MAPREDUCE-3846
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
Attachments: MAPREDUCE-3846-20120210.txt

[~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while.
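The numbering scheme described in this update can be sketched roughly as follows. This is only an illustration of the idea, not the code in the attached patch: the class RecoveredAttemptNumbers, its nested RecordedAttempt type and the next() method are hypothetical names, and it assumes that job history gives us each recorded attempt's number and start time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of attempt numbering during recovery; not the patch itself. */
class RecoveredAttemptNumbers {

    /** One attempt as recorded in job history (assumed shape). */
    static final class RecordedAttempt {
        final int number;
        final long startTime;

        RecordedAttempt(int number, long startTime) {
            this.number = number;
            this.startTime = startTime;
        }
    }

    // Numbers from the previous AM generation, in original start-time order.
    private final Deque<Integer> recovered = new ArrayDeque<>();
    // First number that the previous generation never used.
    private int nextFresh;

    RecoveredAttemptNumbers(List<RecordedAttempt> history) {
        List<RecordedAttempt> sorted = new ArrayList<>(history);
        sorted.sort(Comparator.comparingLong(a -> a.startTime));
        int max = -1;
        for (RecordedAttempt a : sorted) {
            recovered.addLast(a.number);
            max = Math.max(max, a.number);
        }
        nextFresh = max + 1;
    }

    /** Reuses recorded numbers first, then jumps past all of them. */
    int next() {
        Integer n = recovered.pollFirst();
        return n != null ? n : nextFresh++;
    }

    public static void main(String[] args) {
        // History knows attempts 0, 1 and 3; attempt 2 failed before launch.
        RecoveredAttemptNumbers ids = new RecoveredAttemptNumbers(Arrays.asList(
                new RecordedAttempt(0, 10L),
                new RecordedAttempt(1, 20L),
                new RecordedAttempt(3, 15L)));
        // prints "0 3 1 4"
        System.out.println(ids.next() + " " + ids.next() + " " + ids.next() + " " + ids.next());
    }
}
```

With a purely sequential counter the replay would hand out 0, 1, 2 and never match the recorded attempt 3, which is the mismatch described in the earlier comment on this issue.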
[jira] [Updated] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3846:
-----------------------------------------------

Status: Patch Available (was: Open)

Restarted+Recovered AM hangs in some corner cases
-------------------------------------------------

Key: MAPREDUCE-3846
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: mrv2
Affects Versions: 0.23.0
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
Attachments: MAPREDUCE-3846-20120210.txt

[~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while.
[jira] [Commented] (MAPREDUCE-3852) test TestLinuxResourceCalculatorPlugin failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205828#comment-13205828 ]

Thomas Graves commented on MAPREDUCE-3852:
------------------------------------------

I think this was introduced by https://issues.apache.org/jira/browse/HADOOP-8035 when it removed the create-testdirs plugin from hadoop-project/pom.xml.

test TestLinuxResourceCalculatorPlugin failing
----------------------------------------------

Key: MAPREDUCE-3852
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3852
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv1
Affects Versions: 0.23.2
Reporter: Thomas Graves
Priority: Blocker

tests are failing:
org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin.testParsingProcStatAndCpuFile
org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin.testParsingProcMemFile

https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1831/testReport/junit/org.apache.hadoop.yarn.util/TestLinuxResourceCalculatorPlugin/testParsingProcStatAndCpuFile/

both with similar error:
java.io.FileNotFoundException: /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/target/test-dir/MEMINFO_238849741 (No such file or directory)
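The failure mode is easy to reproduce outside Hadoop: writing a file into a directory that nothing has created yet fails with FileNotFoundException. The sketch below is not the test's code and not necessarily the eventual fix; TestDirSketch and tryWrite are hypothetical names, and creating the directory inside the test is just one assumed way of no longer depending on the build to create target/test-dir.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch: a test that creates its own scratch directory. */
class TestDirSketch {

    /**
     * Writes content to file. Returns false when the write fails because
     * the parent directory is missing (FileNotFoundException).
     */
    static boolean tryWrite(File file, String content, boolean createParent) {
        File parent = file.getParentFile();
        if (createParent && parent != null && !parent.isDirectory()) {
            parent.mkdirs(); // what the removed build step used to do for us
        }
        try (FileWriter out = new FileWriter(file)) {
            out.write(content);
            return true;
        } catch (FileNotFoundException e) {
            return false; // "No such file or directory"
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File testDir = new File(System.getProperty("java.io.tmpdir"),
                "test-dir-sketch-" + System.nanoTime());
        File fakeMemInfo = new File(testDir, "MEMINFO_1");

        System.out.println("without mkdirs: " + tryWrite(fakeMemInfo, "MemTotal: 1 kB\n", false));
        System.out.println("with mkdirs:    " + tryWrite(fakeMemInfo, "MemTotal: 1 kB\n", true));
    }
}
```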
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Graves updated MAPREDUCE-3843:
------------------------------------

Attachment: MAPREDUCE-3843.patch

minor update to change HADOOP_PREFIX_HOME to just HADOOP_PREFIX.

Job summary log file found missing on the RM host
-------------------------------------------------

Key: MAPREDUCE-3843
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver, mrv2
Affects Versions: 0.23.0
Reporter: Anupam Seth
Assignee: Anupam Seth
Priority: Critical
Fix For: 0.23.1
Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch

This bug was found by Phil Su as part of our testing.
After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g.

12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0

1) On the RM with older hadoop version where the job summary log does not exist

mapredqa 10903 0.0 1.2 1424404 210240 ? Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer

2) On the RM with older hadoop version where the job summary log exists

mapredqa 24851 0.0 0.5 1463280 90516 ? Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -classpath /home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/*:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/java/jdk64/current/lib/tools.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/* org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer

1) On the RM with older hadoop version where the job summary log does not exist, jobhistory ps shows using the option: -Dmapred.jobsummary.logger=INFO,console
2) On the RM with older
[jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205833#comment-13205833 ]

Sanjay Radia commented on MAPREDUCE-3825:
-----------------------------------------

Here is how Solution 2 will be used and implemented.

{code}
// Credentials is a map<serviceName, Token>
void FileSystem#addDelegationTokens(renewer, credentials); // New API for FileSystem
// Note: the old #getDelegationTokens(...) methods in FileSystem are no longer needed.

// A useful utility, so that the TokenCache in MR can be easily implemented
FileUtil:GetTokens(renewer, path[] ps, credentials) {
  foreach (p in ps) {
    GetFileSystem(p).addDelegationTokens(renewer, credentials);
  }
  return;
}

// Two implementation examples - viewfs and DistributedFileSystem
ViewFileSystem#addDelegationTokens(renewer, credentials) { // contains embedded FSs as mounts
  foreach (mountFs in mountPoints) {
    mountFs.addDelegationTokens(renewer, credentials);
  }
  return;
}

DistributedFileSystem#addDelegationTokens(renewer, credentials) { // a leaf file system
  // I am ignoring the race condition across contains() and add()
  myServiceName = getCanonicalServiceName();
  if (credentials.contains(myServiceName)) {
    return;
  }
  myDelegationToken = getDTfromMyNN();
  credentials.add(myServiceName, myDelegationToken);
  return;
}
{code}

Need generalized multi-token filesystem support
-----------------------------------------------

Key: MAPREDUCE-3825
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: security
Affects Versions: 0.23.1, 0.24.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Attachments: MAPREDUCE-3825.patch, TokenCache.pdf

This is the counterpart to HADOOP-7967. The token cache currently tries to assume a filesystem's token service key. The assumption generally worked while there was a one to one mapping of filesystem to token. With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (i.e. for viewfs) that will never exist (because it really gets the mounted fs tokens).
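To make the shape of the proposal concrete, here is a small self-contained Java sketch of the same recursion. It deliberately does not use Hadoop's real FileSystem, Credentials or Token classes: MultiTokenSketch, Creds, Fs, LeafFs and MountFs are all stand-in names invented for illustration, and tokens are plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in types only; none of these are Hadoop classes. */
class MultiTokenSketch {

    /** Plays the role of the credentials map: service name -> token. */
    static final class Creds {
        private final Map<String, String> tokens = new LinkedHashMap<>();

        boolean contains(String service) { return tokens.containsKey(service); }
        void add(String service, String token) { tokens.put(service, token); }
        int size() { return tokens.size(); }
    }

    interface Fs {
        void addDelegationTokens(String renewer, Creds creds);
    }

    /** A leaf file system: fetches one token unless one is already cached. */
    static final class LeafFs implements Fs {
        final String serviceName;
        int tokensIssued = 0; // counts simulated round trips to the namenode

        LeafFs(String serviceName) { this.serviceName = serviceName; }

        @Override
        public void addDelegationTokens(String renewer, Creds creds) {
            if (creds.contains(serviceName)) {
                return; // same contains()/add() race is ignored here too
            }
            tokensIssued++;
            creds.add(serviceName, "token(" + serviceName + "," + renewer + ")");
        }
    }

    /** A viewfs-like file system: has no token of its own, only mounts. */
    static final class MountFs implements Fs {
        private final List<Fs> mounts = new ArrayList<>();

        MountFs mount(Fs fs) { mounts.add(fs); return this; }

        @Override
        public void addDelegationTokens(String renewer, Creds creds) {
            for (Fs fs : mounts) {
                fs.addDelegationTokens(renewer, creds);
            }
        }
    }

    public static void main(String[] args) {
        LeafFs nn1 = new LeafFs("nn1:8020");
        LeafFs nn2 = new LeafFs("nn2:8020");
        MountFs view = new MountFs().mount(nn1).mount(nn2).mount(nn1); // nn1 mounted twice

        Creds creds = new Creds();
        view.addDelegationTokens("jobtracker", creds);
        // prints "2 tokens, nn1 asked 1 time(s)"
        System.out.println(creds.size() + " tokens, nn1 asked " + nn1.tokensIssued + " time(s)");
    }
}
```

The caller never needs to know a service key for the mount-table file system itself; each leaf registers under its own service name, which is the point of the proposal as I read it.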
[jira] [Resolved] (MAPREDUCE-3844) Problem in setting the childTmpDir in MapReduceChildJVM
[ https://issues.apache.org/jira/browse/MAPREDUCE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmed Radwan resolved MAPREDUCE-3844.
-------------------------------------

Resolution: Fixed

Confirming it is a duplicate of MAPREDUCE-3716. Closing.

Problem in setting the childTmpDir in MapReduceChildJVM
-------------------------------------------------------

Key: MAPREDUCE-3844
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3844
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0, 0.23.1
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan
Priority: Blocker
Attachments: MAPREDUCE-3844.patch, MAPREDUCE-3844_rev2.patch

We have seen this issue during a Hive test, where Hive tries to create a temp file using File.createTempFile(..) and it throws:
{code}
Exception in thread "main" java.io.IOException: No such file or directory
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.checkAndCreate(File.java:1704)
    at java.io.File.createTempFile(File.java:1792)
    at java.io.File.createTempFile(File.java:1828)
    at Test.main(Test.java:13)
{code}
This happens because it literally sees $PWD/tmp as the temp directory path. $PWD needs to be evaluated before being used in setting the property java.io.tmpdir in MapReduceChildJVM.java.
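The underlying behaviour can be shown in a few lines: the JVM does not expand shell variables in paths, so a directory named by the literal string $PWD/tmp does not exist and File.createTempFile fails. The sketch below is an illustration only. TmpDirSketch, expandPwd and canCreateTempFile are hypothetical names, the explicit-directory overload of createTempFile stands in for the java.io.tmpdir property, and the string replacement merely simulates whatever expansion the real fix performs.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical sketch: literal "$PWD" versus an expanded working directory. */
class TmpDirSketch {

    /** Simulates expansion of $PWD; the real launch path may do this differently. */
    static String expandPwd(String rawPath, String workDir) {
        return rawPath.replace("$PWD", workDir); // literal replace, no regex
    }

    /** Returns true if a temp file can be created in the given directory. */
    static boolean canCreateTempFile(File dir) {
        try {
            File f = File.createTempFile("sketch", ".tmp", dir);
            f.delete();
            return true;
        } catch (IOException e) {
            return false; // e.g. "No such file or directory"
        }
    }

    public static void main(String[] args) {
        // Stand-in for a container working directory that has a tmp/ subdirectory.
        File workDir = new File(System.getProperty("java.io.tmpdir"),
                "container-sketch-" + System.nanoTime());
        new File(workDir, "tmp").mkdirs();

        String raw = "$PWD/tmp";
        System.out.println("literal:  " + canCreateTempFile(new File(raw)));
        System.out.println("expanded: "
                + canCreateTempFile(new File(expandPwd(raw, workDir.getPath()))));
    }
}
```

Run from a directory that does not itself contain a folder literally named $PWD, the first check reports false and the second true.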
[jira] [Commented] (MAPREDUCE-3680) FifoScheduler web service rest API can print out invalid JSON
[ https://issues.apache.org/jira/browse/MAPREDUCE-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205843#comment-13205843 ]

Hudson commented on MAPREDUCE-3680:
-----------------------------------

Integrated in Hadoop-Mapreduce-0.23-Build #188 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/188/])
merge -r 1242789:1242790 from trunk. FIXES: MAPREDUCE-3680

tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242792
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java

FifoScheduler web service rest API can print out invalid JSON
-------------------------------------------------------------

Key: MAPREDUCE-3680
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3680
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Fix For: 0.23.2
Attachments: MAPREDUCE-3680-1.patch, MAPREDUCE-3680.patch

Running a GET on the scheduler web services REST API (RM:port/ws/cluster/scheduler) with the FifoScheduler configured and no nodemanagers up yet prints out invalid JSON, with NaN for usedCapacity:

{"scheduler":{"schedulerInfo":{"type":"fifoScheduler","capacity":1.0,"usedCapacity":NaN,"qstate":"RUNNING","minQueueMemoryCapacity":1024,"maxQueueMemoryCapacity":10240,"numNodes":0,"usedNodeCapacity":0,"availNodeCapacity":0,"totalNodeCapacity":0,"numContainers":0}}}
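A NaN in this position is what floating-point division produces when both used and total capacity are zero, which fits the "no nodemanagers up yet" condition. The sketch below only illustrates that arithmetic and one possible guard; UsedCapacitySketch and its two methods are hypothetical, and the committed fix in FifoScheduler.java may well be written differently.

```java
/** Hypothetical sketch of how 0/0 yields NaN and one way to guard against it. */
class UsedCapacitySketch {

    /** Unguarded ratio: 0.0f / 0 is NaN under IEEE 754 arithmetic. */
    static float naiveUsedCapacity(int usedMb, int totalMb) {
        return (float) usedMb / totalMb;
    }

    /** Guarded ratio: report 0 while the cluster has no capacity at all. */
    static float guardedUsedCapacity(int usedMb, int totalMb) {
        return totalMb == 0 ? 0.0f : (float) usedMb / totalMb;
    }

    public static void main(String[] args) {
        System.out.println(naiveUsedCapacity(0, 0));   // prints "NaN"
        System.out.println(guardedUsedCapacity(0, 0)); // prints "0.0"
    }
}
```

NaN is not a valid JSON number, so strict JSON parsers are expected to reject the response shown above.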
[jira] [Created] (MAPREDUCE-3853) TestLinuxResourceCalculatorPlugin is failing on trunk and 0.23 branch.
TestLinuxResourceCalculatorPlugin is failing on trunk and 0.23 branch.
----------------------------------------------------------------------

Key: MAPREDUCE-3853
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3853
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Mahadev konar
Fix For: 0.23.2

Looks like the test is failing:
https://builds.apache.org/view/G-L/view/Hadoop/job/Hadoop-Mapreduce-0.23-Build/188/console
[jira] [Commented] (MAPREDUCE-3802) If an MR AM dies twice it looks like the process freezes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205845#comment-13205845 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3802:
----------------------------------------------------

MAPREDUCE-3846 is a related ticket with the same symptom but a different cause.

I uploaded a patch there, where instead of generating AttemptIDs sequentially when recovering, I am using the previous generation's attemptIDs first before moving on to generating new ones for this generation.

I thought about it for a while and now it seems to me that that patch will automatically fix this issue also, except I don't have a test validating this.

Robert/Sharad, can you please look at my patch there and see if that fixes this? Thanks.

If an MR AM dies twice it looks like the process freezes
--------------------------------------------------------

Key: MAPREDUCE-3802
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3802
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Components: applicationmaster, mrv2
Affects Versions: 0.23.1, 0.24.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans
Priority: Critical
Attachments: syslog

It looks like recovering from an MR AM dying works very well on a single failure. But if it fails multiple times we appear to get into a live lock situation.
{noformat}
yarn jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*-SNAPSHOT.jar wordcount -Dyarn.app.mapreduce.am.log.level=DEBUG -Dmapreduce.job.reduces=30 input output
12/02/03 21:06:57 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
12/02/03 21:06:57 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
12/02/03 21:06:57 INFO input.FileInputFormat: Total input paths to process : 17
12/02/03 21:06:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/02/03 21:06:57 WARN snappy.LoadSnappy: Snappy native library not loaded
12/02/03 21:06:57 INFO mapreduce.JobSubmitter: number of splits:17
12/02/03 21:06:57 INFO mapred.ResourceMgrDelegate: Submitted application application_1328302034486_0003 to ResourceManager at HOST/IP:8040
12/02/03 21:06:57 INFO mapreduce.Job: The url to track the job: http://HOST:8088/proxy/application_1328302034486_0003/
12/02/03 21:06:57 INFO mapreduce.Job: Running job: job_1328302034486_0003
12/02/03 21:07:03 INFO mapreduce.Job: Job job_1328302034486_0003 running in uber mode : false
12/02/03 21:07:03 INFO mapreduce.Job: map 0% reduce 0%
12/02/03 21:07:09 INFO mapreduce.Job: map 5% reduce 0%
12/02/03 21:07:10 INFO mapreduce.Job: map 17% reduce 0%
#KILLED AM with kill -9 here
12/02/03 21:07:16 INFO mapreduce.Job: map 29% reduce 0%
12/02/03 21:07:17 INFO mapreduce.Job: map 35% reduce 0%
12/02/03 21:07:30 INFO mapreduce.Job: map 52% reduce 0%
12/02/03 21:07:35 INFO mapreduce.Job: map 58% reduce 0%
12/02/03 21:07:37 INFO mapreduce.Job: map 70% reduce 0%
12/02/03 21:07:41 INFO mapreduce.Job: map 76% reduce 0%
12/02/03 21:07:43 INFO mapreduce.Job: map 82% reduce 0%
12/02/03 21:07:44 INFO mapreduce.Job: map 88% reduce 0%
12/02/03 21:07:47 INFO mapreduce.Job: map 94% reduce 0%
12/02/03 21:07:49 INFO mapreduce.Job: map 100% reduce 0%
12/02/03 21:07:53 INFO mapreduce.Job: map 100% reduce 3%
12/02/03 21:08:00 INFO mapreduce.Job: map 100% reduce 6%
12/02/03 21:08:06 INFO mapreduce.Job: map 100% reduce 10%
12/02/03 21:08:12 INFO mapreduce.Job: map 100% reduce 13%
12/02/03 21:08:18 INFO mapreduce.Job: map 100% reduce 16%
#killed AM with kill -9 here
12/02/03 21:08:20 INFO ipc.Client: Retrying connect to server: HOST/IP:44223. Already tried 0 time(s).
12/02/03 21:08:21 INFO ipc.Client: Retrying connect to server: HOST/IP:44223. Already tried 1 time(s).
12/02/03 21:08:22 INFO ipc.Client: Retrying connect to server: HOST/IP:44223. Already tried 2 time(s).
12/02/03 21:08:26 INFO mapreduce.Job: map 64% reduce 16%
#It never makes any more progress...
{noformat}
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205846#comment-13205846 ] Thomas Graves commented on MAPREDUCE-3843: -- +1, waiting for jenkins to +1 and then will commit. Note the previous test failure isn't related to this jira - see MAPREDUCE-3852 Job summary log file found missing on the RM host - Key: MAPREDUCE-3843 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Anupam Seth Assignee: Anupam Seth Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch This bug was found by Phil Su as part of our testing. After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g. 12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0
1) On the RM with older hadoop version where the job summary log does not exist mapredqa 10903 0.0 1.2 1424404 210240 ? Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
2) On the RM with older hadoop version where the job summary log exists mapredqa 24851 0.0 0.5 1463280 90516 ? Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -classpath /home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/conf/hadoop:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/common/*:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/hadoop-*-capacity-scheduler.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/hdfs/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/java/jdk64/current/lib/tools.jar:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/*:/home/gs/gridre/shelob/share/hadoop/share/hadoop/mapreduce/lib/* org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer
1) On the RM with older hadoop version where the job summary log does not exist jobhistory ps shows
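The visible difference between the two ps listings in the report is the value of -Dmapred.jobsummary.logger: INFO,console on the host where the summary file is missing, INFO,JSA where it exists. A minimal Java sketch of how one might check a captured command line for that setting follows. The class and method names are hypothetical, and it assumes that when the option is repeated the last occurrence is the one that takes effect.

```java
// Hypothetical helper for inspecting a JobHistoryServer command line
// captured from ps; not part of Hadoop.
class JobSummaryLoggerSketch {
    static final String KEY = "-Dmapred.jobsummary.logger=";

    // Returns the value of the last -Dmapred.jobsummary.logger option,
    // or null if the option is absent. Assumption: the last one wins.
    static String loggerSetting(String commandLine) {
        String found = null;
        for (String arg : commandLine.trim().split("\\s+")) {
            if (arg.startsWith(KEY)) {
                found = arg.substring(KEY.length());
            }
        }
        return found;
    }

    // A setting looks like "LEVEL,appender1,appender2"; the first
    // element is the level, the remaining ones are appender names.
    static boolean usesAppender(String setting, String appender) {
        if (setting == null) {
            return false;
        }
        String[] parts = setting.split(",");
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].trim().equals(appender)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String broken = "java -Xmx1000m -Dmapred.jobsummary.logger=INFO,console "
                + "org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer";
        String working = "java -Dproc_historyserver -Dmapred.jobsummary.logger=INFO,JSA "
                + "org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer";
        System.out.println("broken uses JSA:  " + usesAppender(loggerSetting(broken), "JSA"));
        System.out.println("working uses JSA: " + usesAppender(loggerSetting(working), "JSA"));
    }
}
```

With console as the only appender, the summary lines would be expected to land in the daemon's .out file, which matches the symptom described in the report.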
[jira] [Reopened] (MAPREDUCE-3844) Problem in setting the childTmpDir in MapReduceChildJVM
[ https://issues.apache.org/jira/browse/MAPREDUCE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reopened MAPREDUCE-3844: Thanks for confirming, Ahmed. Reopening just to link the other JIRA and close this one correctly as a duplicate. Problem in setting the childTmpDir in MapReduceChildJVM --- Key: MAPREDUCE-3844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3844 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.0, 0.23.1 Reporter: Ahmed Radwan Assignee: Ahmed Radwan Priority: Blocker Attachments: MAPREDUCE-3844.patch, MAPREDUCE-3844_rev2.patch We have seen this issue during a Hive test, where Hive tries to create a temp file using File.createTempFile(..) and it throws: {code} Exception in thread "main" java.io.IOException: No such file or directory at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.checkAndCreate(File.java:1704) at java.io.File.createTempFile(File.java:1792) at java.io.File.createTempFile(File.java:1828) at Test.main(Test.java:13) {code} This happens because it literally sees $PWD/tmp as the temp directory path. $PWD needs to be evaluated before being used in setting the property java.io.tmpdir in MapReduceChildJVM.java. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
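The failure mode described above can be pictured with a small Java sketch: a temp directory given as the literal string $PWD/tmp is never expanded by the JVM, so File.createTempFile has no real directory to write into. The helper resolvePwd and the class name are hypothetical illustrations and are not the code in MapReduceChildJVM.java; how the actual fix evaluates $PWD is not shown in this thread.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical illustration of the problem; not the Hadoop fix itself.
class TmpDirSketch {
    // Replace a leading literal "$PWD" with an actual working directory.
    static String resolvePwd(String path, String workDir) {
        if (path.equals("$PWD")) {
            return workDir;
        }
        if (path.startsWith("$PWD/")) {
            return workDir + path.substring("$PWD".length());
        }
        return path;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the task's working directory.
        String workDir = Files.createTempDirectory("container").toString();
        String raw = "$PWD/tmp";

        // Java does no shell expansion, so this is a relative directory
        // whose first component is literally named "$PWD".
        try {
            File.createTempFile("hive", ".tmp", new File(raw));
        } catch (IOException e) {
            System.out.println("literal path failed: " + e.getMessage());
        }

        // After evaluating $PWD the directory can be created and used.
        File resolved = new File(resolvePwd(raw, workDir));
        if (!resolved.mkdirs() && !resolved.isDirectory()) {
            throw new IOException("could not create " + resolved);
        }
        File created = File.createTempFile("hive", ".tmp", resolved);
        System.out.println("created " + created);
    }
}
```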
[jira] [Resolved] (MAPREDUCE-3844) Problem in setting the childTmpDir in MapReduceChildJVM
[ https://issues.apache.org/jira/browse/MAPREDUCE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved MAPREDUCE-3844. Resolution: Duplicate
[jira] [Resolved] (MAPREDUCE-3853) TestLinuxResourceCalculatorPlugin is failing on trunk and 0.23 branch.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved MAPREDUCE-3853. -- Resolution: Duplicate this is dup of MAPREDUCE-3852 TestLinuxResourceCalculatorPlugin is failing on trunk and 0.23 branch. -- Key: MAPREDUCE-3853 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3853 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Reporter: Mahadev konar Fix For: 0.23.2 Looks like the test is failing: https://builds.apache.org/view/G-L/view/Hadoop/job/Hadoop-Mapreduce-0.23-Build/188/console -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3846) Restarted+Recovered AM hangs in some corner cases
[ https://issues.apache.org/jira/browse/MAPREDUCE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205850#comment-13205850 ] Hadoop QA commented on MAPREDUCE-3846: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514172/MAPREDUCE-3846-20120210.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause tar ant target to fail. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1841//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1841//console This message is automatically generated. Restarted+Recovered AM hangs in some corner cases - Key: MAPREDUCE-3846 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3846 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: mrv2 Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Priority: Critical Attachments: MAPREDUCE-3846-20120210.txt [~karams] found this while testing AM restart/recovery feature. After the first generation AM crashes (manually killed by kill -9), the second generation AM starts, but hangs after a while. -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205851#comment-13205851 ] Hadoop QA commented on MAPREDUCE-3843: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514175/MAPREDUCE-3843.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1842//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1842//console This message is automatically generated.
[jira] [Updated] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3843: - Resolution: Fixed Status: Resolved (was: Patch Available) Thanks Anupam!! I committed this to trunk, branch-0.23, and the branch-0.23.1
[jira] [Updated] (MAPREDUCE-3789) CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments
[ https://issues.apache.org/jira/browse/MAPREDUCE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-3789: --- Resolution: Fixed Fix Version/s: 1.1.0 Target Version/s: (was: 1.1.0) Status: Resolved (was: Patch Available) Thanks tucu, committed to branch-1. When I get time later, I will try the same on YARN and file a new JIRA for that if the bug still exists with its CS. CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments --- Key: MAPREDUCE-3789 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3789 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/capacity-sched, scheduler Affects Versions: 1.1.0 Reporter: Harsh J Assignee: Harsh J Priority: Critical Fix For: 1.1.0 Attachments: MAPREDUCE-3789.patch, MAPREDUCE-3789.patch, MAPREDUCE-3789.patch Briefly, to reproduce: * Run JT with CapacityTaskScheduler [Say, Cluster max map = 8G, Cluster map = 2G] * Run two TTs but with varied capacity, say, one with 4 map slots, another with 3 map slots. * Run a job with two tasks, each demanding mem worth 4 slots at least (Map mem = 7G or so). * Job will begin running on TT #1, but will also end up reserving the 3 slots on TT #2 because it does not check for the maximum limit of slots when reserving (as it goes greedy, and hopes to gain more slots in future). * Other jobs that could've run on the TT #2 over 3 slots are thereby blocked out due to this illogical reservation. I've not yet tested MR2 for this so feel free to weigh in if it affects MR2 as well. For MR1, I've attached a test case initially to indicate this. A fix that checks reservations vs. max slots, to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
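The arithmetic behind the reproduction steps can be sketched in Java as follows. This is only an illustration of the "reservations vs. max slots" idea named in the report, with hypothetical class and method names; it is not the code from the attached patch or from CapacityTaskScheduler.

```java
// Hypothetical sketch of the slot arithmetic in the reproduction steps.
class ReservationSketch {
    // Slots one task occupies: its memory demand divided by the memory
    // per slot, rounded up.
    static int slotsNeeded(long taskMemMb, long slotMemMb) {
        return (int) ((taskMemMb + slotMemMb - 1) / slotMemMb);
    }

    // Reserving on a tracker only makes sense if the tracker could ever
    // offer that many slots, however long the job waits.
    static boolean worthReserving(int slotsNeeded, int trackerMaxSlots) {
        return slotsNeeded <= trackerMaxSlots;
    }

    public static void main(String[] args) {
        long slotMemMb = 2 * 1024; // "Cluster map = 2G"
        long taskMemMb = 7 * 1024; // "Map mem = 7G or so"
        int needed = slotsNeeded(taskMemMb, slotMemMb);
        System.out.println("slots needed per task: " + needed);
        System.out.println("reserve on TT #1 (4 slots): " + worthReserving(needed, 4));
        System.out.println("reserve on TT #2 (3 slots): " + worthReserving(needed, 3));
    }
}
```

Under these numbers a 7G task needs 4 slots, so a reservation on the 3-slot tracker can never be satisfied and only blocks other jobs.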
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205879#comment-13205879 ] Hudson commented on MAPREDUCE-3843: --- Integrated in Hadoop-Mapreduce-0.23-Commit #541 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/541/]) merge -r 1242975:1242976 from trunk. FIXES: MAPREDUCE-3843 tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242977 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205880#comment-13205880 ] Hudson commented on MAPREDUCE-3843: --- Integrated in Hadoop-Mapreduce-trunk-Commit #1723 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1723/]) MAPREDUCE-3843. Job summary log file found missing on the RM host (Anupam Seth via tgraves) tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242976 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
[jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205882#comment-13205882 ] Daryn Sharp commented on MAPREDUCE-3825: I think requiring every filesystem to bracket its token retrieval with an identical "check for my token" and "set my token" sequence is brittle. It's an invasive change that isn't backwards compatible, so any filesystem that doesn't properly do a copy-n-paste will cause duplicate tokens. If we want to universally change the behavior, we have to change the filesystems again. I feel it's much safer for a filesystem to implement primitives that a common method uses. My proposed FileSystem#getDelegationTokens does just that. All a filesystem has to do is implement getDelegationToken, and maybe getFileSystems if it has multiple tokens. Everything else is managed for the filesystem. I'd like to make FileSystem#getDelegationTokens a final method to enforce the consistency and prevent any filesystem from trying to directly manipulate the credentials. If we want to change the implementation in the future, there's only one place, in our control, that needs to be changed. The sample code also prevents viewfs from shorting out on calls to the same filesystem. It can't be solved by uniquing the fs list. Otherwise, it's a repeat of the TokenCache 1-to-1 mapping of service to a specific token problem. We can't avoid this by uniquing the fs list in viewfs because the underlying mounts might have multiple filesystems, or it might be returning a null service (filtered) yet have a contained filesystem with a token. Once MAPREDUCE-3849 is incorporated, I can fix TokenCache to eliminate the 1-to-1 mapping problem by simply calling getDelegationTokens on the filesystems. 
Need generalized multi-token filesystem support --- Key: MAPREDUCE-3825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3825.patch, TokenCache.pdf This is the counterpart to HADOOP-7967. The token cache currently tries to assume a filesystem's token service key. The assumption generally worked while there was a one to one mapping of filesystem to token. With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (ie. for viewfs) that will never exist (because it really gets the mounted fs tokens). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
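The design argued for in the comment above, one final method that owns all credential handling while each filesystem supplies only small primitives, can be sketched in Java roughly as follows. Everything here is a hypothetical stand-in: tokens and credentials are plain strings and maps, the class names are invented, and the method bodies are an assumption about the shape of the proposal rather than the attached patch or the Hadoop FileSystem API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a filesystem base class.
abstract class SketchFileSystem {
    // Primitive: service key for this filesystem's own token, or null.
    abstract String getServiceName();

    // Primitive: fetch one token for this filesystem, or null.
    abstract String getDelegationToken(String renewer);

    // Primitive: contained filesystems (e.g. mounts); none by default.
    List<SketchFileSystem> getFileSystems() {
        return Collections.emptyList();
    }

    // Common entry point. Final, so the "already have it?" check and the
    // credential update live in exactly one place.
    final List<String> getDelegationTokens(String renewer, Map<String, String> credentials) {
        List<String> added = new ArrayList<>();
        collect(renewer, credentials, added);
        return added;
    }

    private void collect(String renewer, Map<String, String> credentials, List<String> added) {
        String service = getServiceName();
        if (service != null && !credentials.containsKey(service)) {
            String token = getDelegationToken(renewer);
            if (token != null) {
                credentials.put(service, token);
                added.add(token);
            }
        }
        // Always descend: a filesystem with a null or already-known
        // service may still contain filesystems that need tokens.
        for (SketchFileSystem child : getFileSystems()) {
            child.collect(renewer, credentials, added);
        }
    }
}

// A filesystem with exactly one token.
class SingleTokenFs extends SketchFileSystem {
    private final String service;
    SingleTokenFs(String service) { this.service = service; }
    String getServiceName() { return service; }
    String getDelegationToken(String renewer) { return "token(" + service + "," + renewer + ")"; }
}

// A viewfs-like filesystem: no token of its own, only mounts.
class MountTableFs extends SketchFileSystem {
    private final List<SketchFileSystem> mounts;
    MountTableFs(SketchFileSystem... mounts) { this.mounts = Arrays.asList(mounts); }
    String getServiceName() { return null; }
    String getDelegationToken(String renewer) { return null; }
    List<SketchFileSystem> getFileSystems() { return mounts; }
}

class TokenSketchDemo {
    public static void main(String[] args) {
        SketchFileSystem view = new MountTableFs(
                new SingleTokenFs("nn1:8020"),
                new SingleTokenFs("nn1:8020"), // second mount on the same service
                new SingleTokenFs("nn2:8020"));
        Map<String, String> credentials = new LinkedHashMap<>();
        List<String> added = view.getDelegationTokens("jt", credentials);
        System.out.println(added.size() + " tokens for " + credentials.keySet());
    }
}
```

Because the duplicate check sits in the common method, two mounts backed by the same service yield a single token without either filesystem having to inspect the credentials itself.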
[jira] [Updated] (MAPREDUCE-3789) CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments
[ https://issues.apache.org/jira/browse/MAPREDUCE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated MAPREDUCE-3789: --- Hadoop Flags: Reviewed
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205896#comment-13205896 ] Hudson commented on MAPREDUCE-3843: --- Integrated in Hadoop-Common-0.23-Commit #537 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/537/]) merge -r 1242975:1242976 from trunk. FIXES: MAPREDUCE-3843 tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242977 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm Job summary log file found missing on the RM host - Key: MAPREDUCE-3843 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3843 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, mrv2 Affects Versions: 0.23.0 Reporter: Anupam Seth Assignee: Anupam Seth Priority: Critical Fix For: 0.23.1 Attachments: MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch, MAPREDUCE-3843.patch This bug was found by Phil Su as part of our testing. After MAPREDUCE-3354 went in, the Job summary log file seems to have gone missing on the RM host. The job summary log appears to be interspersed in yarn-mapredqa-historyserver-host.out. e.g. 12/02/09 15:57:21 INFO jobhistory.JobSummary: jobId=job_1328658619341_0011,submitTime=1328802904381,launchTime=1328802909977,firstMapTaskLaunchTime=1328802912116,firstReduceTaskLaunchTime=1328802915074,finishTime=1328802933797,resourcesPerMap=1024,resourcesPerReduce=2048,numMaps=10,numReduces=10,user=hadoopqa,queue=default,status=KILLED,mapSlotSeconds=0,reduceSlotSeconds=0 1) On the RM with older hadoop version where the job summary log does not exist mapredqa 10903 0.0 1.2 1424404 210240 ? Sl Feb07 0:19 /home/gs/java/jdk64/current/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64 -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/gs/gridre/theoden/share/hadoop -Dhadoop.id.str=mapredqa -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native/Linux-amd64-64:/home/gs/gridre/theoden/share/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dmapred.jobsummary.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer 2) On the RM with older hadoop version where the job summary log exists mapredqa 24851 0.0 0.5 1463280 90516 ? Sl Jan25 0:37 /home/gs/java/jdk64/current/bin/java -Dproc_historyserver -Xmx1000m -Dmapred.jobsummary.logger=INFO,JSA -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn.log -Dyarn.home.dir= -Dyarn.id.str= -Dyarn.root.logger=INFO,console -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir= -Dyarn.id.str=mapredqa -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -Dyarn.policy.file=hadoop-policy.xml -Dmapred.jobsummary.logger=INFO,JSA -Dhadoop.log.dir=/home/gs/var/log/mapredqa -Dyarn.log.dir=/home/gs/var/log/mapredqa -Dhadoop.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.log.file=yarn-mapredqa-historyserver-host.log -Dyarn.home.dir=/home/gs/gridre/shelob/share/hadoop -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA 
-Djava.library.path=/home/gs/gridre/shelob/share/hadoop/lib/native/Linux-amd64-64 -classpath
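The job summary entry and the two process listings quoted in the description above can be inspected with a few lines of Python. This is only a rough sketch, not part of the patch: the helper names `parse_job_summary` and `system_property` are made up for illustration, the sample strings are abbreviated copies of the quoted text, and the code assumes that no field value contains a comma and that the JVM keeps the last value when a `-D` property is repeated.

```python
import shlex


def parse_job_summary(line):
    """Split a JobSummary log entry into a dict of string fields.

    Hypothetical helper; assumes no value contains a comma.
    """
    # Drop the log prefix up to and including "JobSummary: ", if present.
    _, _, payload = line.rpartition("JobSummary: ")
    fields = {}
    for pair in payload.strip().split(","):
        key, _, value = pair.partition("=")
        fields[key] = value
    return fields


def system_property(command_line, name):
    """Return the value of the last -D<name>=<value> argument, or None."""
    prefix = "-D" + name + "="
    value = None
    for arg in shlex.split(command_line):
        if arg.startswith(prefix):
            value = arg[len(prefix):]  # a later occurrence replaces an earlier one
    return value


# Abbreviated copies of the text quoted in the issue description.
summary = ("12/02/09 15:57:21 INFO jobhistory.JobSummary: "
           "jobId=job_1328658619341_0011,numMaps=10,numReduces=10,"
           "user=hadoopqa,queue=default,status=KILLED")
missing_log_cmd = ("java -Xmx1000m -Dhadoop.root.logger=INFO,console "
                   "-Dmapred.jobsummary.logger=INFO,console "
                   "org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer")
working_log_cmd = ("java -Dproc_historyserver -Xmx1000m "
                   "-Dmapred.jobsummary.logger=INFO,JSA "
                   "-Dyarn.root.logger=INFO,DRFA "
                   "-Dmapred.jobsummary.logger=INFO,JSA")

print(parse_job_summary(summary)["status"])
print(system_property(missing_log_cmd, "mapred.jobsummary.logger"))
print(system_property(working_log_cmd, "mapred.jobsummary.logger"))
```

Comparing the two extracted values makes the difference between the listings easy to spot: the process without a summary file routes the summary logger to `console`, while the process with a summary file routes it to `JSA`.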
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205897#comment-13205897 ] Hudson commented on MAPREDUCE-3843: --- Integrated in Hadoop-Common-trunk-Commit #1712 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1712/]) MAPREDUCE-3843. Job summary log file found missing on the RM host (Anupam Seth via tgraves) tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242976 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
[jira] [Created] (MAPREDUCE-3854) Reinstate environment variable tests in TestMiniMRChildTask
Reinstate environment variable tests in TestMiniMRChildTask --- Key: MAPREDUCE-3854 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3854 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv2 Reporter: Tom White Fix For: 0.23.2 MAPREDUCE-3716 reinstated one of the tests in TestMiniMRChildTask, but there are two more which should be run. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3854) Reinstate environment variable tests in TestMiniMRChildTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAPREDUCE-3854: - Attachment: MAPREDUCE-3854.patch This patch adds the tests back, but they fail on the line that checks that HOME is set to /tmp (line 223, 273). Is it no longer possible to override HOME or is this exposing a bug?
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205905#comment-13205905 ] Hudson commented on MAPREDUCE-3843: --- Integrated in Hadoop-Hdfs-0.23-Commit #526 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/526/]) merge -r 1242975:1242976 from trunk. FIXES: MAPREDUCE-3843 tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242977 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
[jira] [Commented] (MAPREDUCE-3843) Job summary log file found missing on the RM host
[ https://issues.apache.org/jira/browse/MAPREDUCE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205907#comment-13205907 ] Hudson commented on MAPREDUCE-3843: --- Integrated in Hadoop-Hdfs-trunk-Commit #1787 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1787/]) MAPREDUCE-3843. Job summary log file found missing on the RM host (Anupam Seth via tgraves) tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1242976 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
[jira] [Commented] (MAPREDUCE-3851) Allow more aggressive action on detection of the jetty issue
[ https://issues.apache.org/jira/browse/MAPREDUCE-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205928#comment-13205928 ] Todd Lipcon commented on MAPREDUCE-3851: Hey Kihwal. What's the distinction between this JIRA and MAPREDUCE-3184? Just making it configurable rather than doing a System.exit? Allow more aggressive action on detection of the jetty issue Key: MAPREDUCE-3851 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3851 Project: Hadoop Map/Reduce Issue Type: Bug Components: tasktracker Affects Versions: 1.0.0 Reporter: Kihwal Lee Fix For: 1.1.0, 1.0.1 MAPREDUCE-2529 added the useful failure detection mechanism. In this jira, I propose we add a periodic check inside the TT and a configurable action to self-destruct. Blacklisting helps but is not enough. A hung jetty still accepts connections, and it takes a very long time for clients to fail out. Short jobs are delayed for hours because of this. This feature will be a nice companion to MAPREDUCE-3184.
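The proposal in the description above (a periodic check inside the TT plus a configurable action) could be pictured roughly as follows. This is a hypothetical sketch and not the TaskTracker code: the class name `JettyWatchdog`, the `probe` and `on_failure` callables, and the `max_failures` threshold are all assumptions made for illustration.

```python
class JettyWatchdog:
    """Periodic health check with a pluggable failure action (hypothetical)."""

    def __init__(self, probe, on_failure, max_failures=3):
        self.probe = probe              # callable returning True when healthy
        self.on_failure = on_failure    # configurable action, e.g. exit the process
        self.max_failures = max_failures
        self.failures = 0               # consecutive failed probes so far

    def check_once(self):
        """Run one probe; trigger the action after too many consecutive failures."""
        try:
            healthy = bool(self.probe())
        except Exception:
            healthy = False             # a probe that raises counts as a failure
        if healthy:
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= self.max_failures:
            # In a real daemon this action would normally not return.
            self.on_failure()
        return False

    def run(self, interval_seconds, stop_event):
        """Probe every interval_seconds until stop_event (a threading.Event) is set."""
        while not stop_event.wait(interval_seconds):
            self.check_once()


# Usage with stand-in callables: a probe that always fails and an action
# that only records that it was triggered.
triggered = []
watchdog = JettyWatchdog(probe=lambda: False,
                         on_failure=lambda: triggered.append("self-destruct"),
                         max_failures=2)
watchdog.check_once()
watchdog.check_once()
print(triggered)
```

Keeping the action as a parameter is what would make the behaviour configurable: the same check could log, blacklist, or exit, depending on what is passed in.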
[jira] [Resolved] (MAPREDUCE-3767) Fix and enable env tests in TestMiniMRChildTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved MAPREDUCE-3767. Resolution: Duplicate Fix and enable env tests in TestMiniMRChildTask --- Key: MAPREDUCE-3767 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3767 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.0 Reporter: Vinod Kumar Vavilapalli This test is ported to YARN+MR via MAPREDUCE-3716. We should try to enable the env tests also.
[jira] [Commented] (MAPREDUCE-3854) Reinstate environment variable tests in TestMiniMRChildTask
[ https://issues.apache.org/jira/browse/MAPREDUCE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205951#comment-13205951 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-3854: The short answer to the HOME issue you pointed out is: no, we aren't supporting it anymore. In 1.0, we never fixed TaskTracker to clean up the environment (MAPREDUCE-103), so things like HOME *needed to* be supported. In 0.23.*, HOME is now set by the NodeManager after the environment is cleaned up. So $HOME already points to the user's home dir, and we shouldn't need to override it anymore.
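One way to picture why a test that overrides HOME would now fail: if the launcher applies its own values after any user-requested ones, the user's HOME is simply replaced. The sketch below is an assumption about ordering made purely for illustration; `build_child_env` and the sample paths are hypothetical and do not come from the NodeManager code.

```python
def build_child_env(user_env, launcher_env):
    """Merge environments so launcher-controlled variables always win (hypothetical)."""
    env = dict(user_env)        # start from what the job asked for
    env.update(launcher_env)    # launcher-set values replace any user override
    return env


# The job tries to set HOME=/tmp, as the reinstated tests expect.
requested = {"HOME": "/tmp", "MY_JOB_VAR": "1"}
# The launcher sets HOME itself (sample path, not taken from the thread).
controlled = {"HOME": "/home/hadoopqa"}

child_env = build_child_env(requested, controlled)
print(child_env["HOME"])
```

Under that assumption, an assertion that HOME equals /tmp in the child cannot pass, while other user-supplied variables are unaffected.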
[jira] [Commented] (MAPREDUCE-3851) Allow more aggressive action on detection of the jetty issue
[ https://issues.apache.org/jira/browse/MAPREDUCE-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205960#comment-13205960 ] Kihwal Lee commented on MAPREDUCE-3851: --- I was under the impression that the jetty health check in MAPREDUCE-2529 and the spinning jetty detection don't have exactly the same coverage. If MAPREDUCE-3184 alone can cover 100%, there is no reason to have this jira. In that case I will ask for it to be included in 1.0.1.
[jira] [Commented] (MAPREDUCE-3825) Need generalized multi-token filesystem support
[ https://issues.apache.org/jira/browse/MAPREDUCE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205969#comment-13205969 ] Sanjay Radia commented on MAPREDUCE-3825: - "Isn't backward compatible" What isn't backward compatible? The 2 variations of getDelegationTokens were added in 0.23 and can be safely removed without any compatibility issues, and getDelegationToken is deprecated. Need generalized multi-token filesystem support --- Key: MAPREDUCE-3825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3825 Project: Hadoop Map/Reduce Issue Type: Bug Components: security Affects Versions: 0.23.1, 0.24.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: MAPREDUCE-3825.patch, TokenCache.pdf This is the counterpart to HADOOP-7967. The token cache currently tries to assume a filesystem's token service key. The assumption generally worked while there was a one-to-one mapping of filesystem to token. With the advent of multi-token filesystems like viewfs, the token cache will try to use a service key (i.e., for viewfs) that will never exist (because it really gets the mounted fs tokens).
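The mismatch described in the issue can be illustrated with a toy model. Everything here is hypothetical and only loosely mirrors the idea: the `Fs` class, its `token_services` method, the `missing_tokens` helper, and the sample service names are invented for the sketch and are not Hadoop APIs.

```python
class Fs:
    """Toy filesystem: either has its own token service or mounts other filesystems."""

    def __init__(self, name, service=None, mounts=()):
        self.name = name
        self.service = service          # token service key, None for a pure view
        self.mounts = list(mounts)      # mounted filesystems, if any

    def token_services(self):
        """Ask the filesystem which service keys its tokens are stored under."""
        if self.mounts:
            services = []
            for fs in self.mounts:
                services.extend(fs.token_services())
            return services
        return [self.service] if self.service else []


def missing_tokens(fs, cache):
    """Service keys the cache still lacks for this filesystem."""
    return [s for s in fs.token_services() if s not in cache]


nn1 = Fs("hdfs://nn1", service="nn1:8020")
nn2 = Fs("hdfs://nn2", service="nn2:8020")
view = Fs("viewfs://cluster", mounts=[nn1, nn2])

cache = {"nn1:8020": "token-for-nn1"}

# Guessing a single key from the view's own name never hits the cache,
# because tokens are stored under the mounted filesystems' services.
print(view.name in cache)
print(missing_tokens(view, cache))
```

The point of the sketch is the direction of the lookup: the cache asks the filesystem for its service keys instead of assuming one key per filesystem.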
[jira] [Updated] (MAPREDUCE-3852) test TestLinuxResourceCalculatorPlugin failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3852: - Attachment: MAPREDUCE-3852.patch Here is a patch that reverts the change and fixes the tests. It doesn't seem to affect build times much. I didn't look into other ways to solve this, but if someone wants to take a look, go for it. Build command: mvn clean install site package -Pdist -DskipTests -Dmaven.javadoc.skip=true (run multiple times). With this change the build seems to take 4 to 10 seconds longer than without it, out of about 7 minutes of build time. test TestLinuxResourceCalculatorPlugin failing -- Key: MAPREDUCE-3852 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3852 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 0.23.2 Reporter: Thomas Graves Priority: Blocker Attachments: MAPREDUCE-3852.patch tests are failing: org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin.testParsingProcStatAndCpuFile org.apache.hadoop.yarn.util.TestLinuxResourceCalculatorPlugin.testParsingProcMemFile https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1831/testReport/junit/org.apache.hadoop.yarn.util/TestLinuxResourceCalculatorPlugin/testParsingProcStatAndCpuFile/ both with similar error: java.io.FileNotFoundException: /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/target/test-dir/MEMINFO_238849741 (No such file or directory)
[jira] [Updated] (MAPREDUCE-3852) test TestLinuxResourceCalculatorPlugin failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-3852: - Affects Version/s: 0.24.0 Status: Patch Available (was: Open)
[jira] [Assigned] (MAPREDUCE-3852) test TestLinuxResourceCalculatorPlugin failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned MAPREDUCE-3852: Assignee: Thomas Graves
[jira] [Commented] (MAPREDUCE-3852) test TestLinuxResourceCalculatorPlugin failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205996#comment-13205996 ] Hadoop QA commented on MAPREDUCE-3852: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12514193/MAPREDUCE-3852.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause tar ant target to fail. -1 eclipse:eclipse. The patch failed to build with eclipse:eclipse. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed the unit tests build +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1843//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1843//console This message is automatically generated. 
[jira] [Commented] (MAPREDUCE-3852) test TestLinuxResourceCalculatorPlugin failing
[ https://issues.apache.org/jira/browse/MAPREDUCE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206000#comment-13206000 ] Thomas Graves commented on MAPREDUCE-3852: -- The Jenkins build patched the wrong file. This patch is for the top-level hadoop-project/pom.xml. The build does a patch -p1, which patched hadoop-mapreduce-project/pom.xml instead.
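The effect of the strip level can be sketched as follows. `patch -pN` removes the leading N path components from each file name in the diff before resolving it against the working directory. The diff path and the working directories used below are assumptions chosen to match the comment; the thread does not show the actual diff headers or the directory the build ran in, and `strip_components` and `patch_target` are hypothetical helper names.

```python
import posixpath


def strip_components(path, count):
    """Mimic patch -p<count>: drop `count` leading path components."""
    parts = [p for p in path.split("/") if p]   # adjacent slashes count as one
    if count >= len(parts):
        raise ValueError("strip count removes the whole path")
    return "/".join(parts[count:])


def patch_target(diff_path, strip, workdir):
    """File that would be patched for a given diff path, strip level and cwd."""
    return posixpath.normpath(
        posixpath.join(workdir, strip_components(diff_path, strip)))


# Assumed diff header path, relative to the top of the source tree.
diff_path = "hadoop-project/pom.xml"

# Intended: apply from the top level without stripping.
print(patch_target(diff_path, 0, "."))
# Assumed build behaviour: -p1 applied inside hadoop-mapreduce-project.
print(patch_target(diff_path, 1, "hadoop-mapreduce-project"))
```

With one component stripped, only the bare file name `pom.xml` is left, so whichever `pom.xml` sits in the working directory gets patched.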