[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated MAPREDUCE-1635:

    Release Note: Fixed a bug related to resource estimation for disk-based scheduling by modifying TaskTracker to return the correct map output size for completed maps and -1 for other tasks or failures. (was: Fixed a bug in TaskTracker to return the correct map output size for completed maps and -1 for other tasks or failures.)

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.21.0
    Attachments: patch-1635-1.txt, patch-1635-ydist.txt, patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1635:

    Release Note: Fixed a bug in TaskTracker to return the correct map output size for completed maps and -1 for other tasks or failures.

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635-1.txt, patch-1635-ydist.txt, patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated MAPREDUCE-1635:

    Status: Resolved (was: Patch Available)
    Resolution: Fixed

I just committed this to trunk. Thanks Amareshwari!

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635-1.txt, patch-1635-ydist.txt, patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.

--
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated MAPREDUCE-1635:

    Status: Open (was: Patch Available)

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635-1.txt, patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated MAPREDUCE-1635:

    Status: Patch Available (was: Open)
    Hadoop Flags: [Reviewed]

+1. The patch looks good to me. Rerunning it through Hudson so I can commit it after its blessing.

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635-1.txt, patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1635:

    Status: Open (was: Patch Available)

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1635:

    Attachment: patch-1635-1.txt

The patch adds javadoc to MapOutputFile and changes catch(Exception) to catch(IOException), as suggested by Ravi.

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635-1.txt, patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1635:

    Status: Patch Available (was: Open)

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635-1.txt, patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1635:

    Attachment: patch-1635.txt

I think the solution is to move the calculation of task output size to Task, instead of TaskTracker trying to construct the output file and failing. Task already has all the MapOutputFile information, so Task can set the output size in its last update, before sending umbilical.done(). The attached patch implements this fix.

I added a MiniMR test to check task output sizes for a map-only job, a map-reduce job and a failed job.

In trunk, the log message saying "reported output size..." in TaskTracker.TaskInProgress.reportDone() does not make sense, because setOutputSize() happens after the reportDone() call. With the attached patch it does make sense. I validated that the log prints the proper value with the patch.

The patch removes the following null checks from the code:
{code}
- Path tmp_output = mapOutputFile.getOutputFile();
- if(tmp_output == null)
-   return 0;
- FileSystem localFS = FileSystem.getLocal(conf);
- FileStatus stat = localFS.getFileStatus(tmp_output);
- if(stat == null)
-   return 0;
{code}
This is because neither mapOutputFile.getOutputFile() nor localFS.getFileStatus(tmp_output) ever returns null. Those calls either return a proper value or throw an Exception, and the method handles the Exception properly. Essentially, these checks are unreachable code. Moreover, the return values deviate from the documentation, which says that the output size should be -1 if it cannot be calculated.

Also, TaskStatus.outputSize is initialized to -1 to take care of task failures.
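As a rough illustration of the contract discussed in this comment (report the real size when the output file can be read, and -1 when it cannot), here is a minimal, self-contained Java sketch. It deliberately uses java.nio.file rather than Hadoop's FileSystem and MapOutputFile classes, and the names OutputSizeSketch and outputSizeOrUnknown are hypothetical. It is not the code from the attached patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class OutputSizeSketch {
    /** Sentinel for "size could not be calculated", matching the -1 convention in the comment. */
    public static final long UNKNOWN_SIZE = -1L;

    /**
     * Returns the size of the given output file in bytes, or -1 if it cannot
     * be determined. Files.size() never returns null; it either returns a
     * value or throws, so no null checks are needed and only IOException is caught.
     */
    public static long outputSizeOrUnknown(Path mapOutput) {
        try {
            return Files.size(mapOutput);
        } catch (IOException e) {
            return UNKNOWN_SIZE;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a completed map's output file.
        Path existing = Files.createTempFile("map-output", ".out");
        Files.write(existing, new byte[] {1, 2, 3});

        // Hypothetical stand-in for a task whose output file is not there.
        Path missing = existing.resolveSibling("missing-" + System.nanoTime() + ".out");

        System.out.println(outputSizeOrUnknown(existing));
        System.out.println(outputSizeOrUnknown(missing));

        Files.delete(existing);
    }
}
```

Catching only IOException, rather than Exception, mirrors the narrowing mentioned elsewhere in this thread, and returning -1 instead of 0 keeps "unknown" distinguishable from a genuinely empty output.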
ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns zero.
[jira] Updated: (MAPREDUCE-1635) ResourceEstimator does not work after MAPREDUCE-842
[ https://issues.apache.org/jira/browse/MAPREDUCE-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1635:

    Description: MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.

    was: MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns zero.

ResourceEstimator does not work after MAPREDUCE-842

    Key: MAPREDUCE-1635
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1635
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Components: tasktracker
    Affects Versions: 0.21.0
    Reporter: Amareshwari Sriramadasu
    Assignee: Amareshwari Sriramadasu
    Fix For: 0.22.0
    Attachments: patch-1635.txt

MAPREDUCE-842 changed Child's mapred.local.dir to have attemptDir as the base local directory. The assumption is also that org.apache.hadoop.mapred.MapOutputFile always gets Child's mapred.local.dir. But MapOutputFile.getOutputFile() is called with TaskTracker's conf, which does not find the output file. Thus TaskTracker.tryToGetOutputSize() always returns -1.
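The failure mode in the description can be pictured with a small stand-alone Java sketch: a file written relative to one base directory is not found when the same relative name is resolved against a different base. The directory names, the relative output name, and the class LocalDirMismatchSketch are assumptions made for illustration only. They do not reflect Hadoop's actual local directory layout or its MapOutputFile API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LocalDirMismatchSketch {
    // Hypothetical relative name of a map output file (not Hadoop's real layout).
    static final String OUTPUT_NAME = "output/file.out";

    /** Resolves the output file against whichever base local directory the caller holds. */
    public static Path resolveOutput(Path baseLocalDir) {
        return baseLocalDir.resolve(OUTPUT_NAME);
    }

    /** Writes a small output file under the given base directory and returns its path. */
    public static Path writeOutput(Path baseLocalDir) throws IOException {
        Path out = resolveOutput(baseLocalDir);
        Files.createDirectories(out.getParent());
        return Files.write(out, new byte[] {1, 2, 3});
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the TaskTracker's view of the local directory.
        Path trackerLocalDir = Files.createTempDirectory("tracker-local");
        // Stand-in for the child's view: the attempt directory is its base.
        Path attemptDir = Files.createDirectories(trackerLocalDir.resolve("attempt_0"));

        // The child writes its output relative to the attempt directory.
        writeOutput(attemptDir);

        // Same relative name, two different bases: only the child's base finds the file.
        System.out.println("child base finds file:   " + Files.exists(resolveOutput(attemptDir)));
        System.out.println("tracker base finds file: " + Files.exists(resolveOutput(trackerLocalDir)));
    }
}
```

Under these assumptions, a lookup performed with the tracker-side base fails even though the file exists, which is why computing the size on the side that already holds the right base directory avoids the problem.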