[ https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797352#action_12797352 ]
Scott Chen commented on MAPREDUCE-1218: --------------------------------------- Vinod, Thank you very much for the careful review. - LinuxResourceCalculatorPlugin.java {quote} - The method #getCpuUsage() in itself doesn't update the cpu-usage. So, if one makes two calls to this method separated by a time interval, the result won't reflect the updated cpu-usage unless calls to #getCumulativeCpuTime() are explicitly made in between. This should be fixed by calling #readProcStatFile() in this method also. The main method should be modified to call this method twice and the tests should also verify this. {quote} I agree. getCpuUsage() should be able to be called independently. I have added readProcStatFile() in this method. In testParsingProcStatAndCpuFile(), I have called this function twice without calling getCumulativeCpuTime() to verify that it can be used alone. {quote} - You've changed CPU_TIME_FORMAT from "^cpu[0-9]*[ \t]([0-9])[ \t]([0-9])[ \t]([0-9])[ \t]." to "^cpu[ \t]([0-9]*)[ \t]([0-9])[ \t]([0-9])[ \t].*". I guess the earlier is correct with cpu names not having any space/tab? {quote} Sorry, I should have explained this when I posted the patch. Originally I parse cpu0 cpu1 ... cpuN and sum there utime, stime and ntime. But latter I have found that cpu (without a number) contains the total of these values. So I have change the pattern to just parse this one line. This change is confusing. {quote} - Why even have MINIMUM_UPDATE_INTERVAL in updateCpuUsage()? If this is mainly used for making sure sampleTime is not equal to lastSampleTime, then we can do so directly and remove MINIMUM_UPDATE_INTERVAL altogether. {quote} This is for preventing people keep calling getCumulativeCpuTime (and now getCpuUsage) too often. Then the second update will have a very short update window. This may result inaccurate values. In this case, we can just skip the update and report the old value. Since the old value is still fresh in this case, it will still be quite accurate. - TaskTracker.java {quote} - To handle the deprecation of mapreduce.tasktracker.memorycalculatorplugin in TaskTracker, for memory calculations we should first try to use the class denoted by this configuration if present, otherwise only we should fall back to the new resource-calculator. To facilitate this we will also retain a deprecated TTConfig.TT_MEMORY_CALCULATOR_PLUGIN constant. {quote} I agree. Now it is kept in the patch. {quote} - Nit: +661 "LOG.info(" Using MemoryCalculatorPlugin : " + resourceCalculatorPlugin);" should instead be "LOG.info(" Using ResourceCalculatorPlugin : " + resourceCalculatorPlugin);" {quote} Thank you for the careful review. It is fixed. {quote} - Please annotate Dummy\{Resource|Memory\}CalculatorPlugin classes as @InterfaceAudience.Private because they both are only test specific. {quote} Have done the necessary change. {quote} - We should document mapreduce.tasktracker.resourcecalculatorplugin in mapred-default.xml and remove the documentation for mapreduce.tasktracker.memorycalculatorplugin from the same. {quote} Have done the necessary change. {quote} - Please convert TestLinuxResourceCalculatorPlugin and TestTTResourceReporting into Junit 4 testcases. {quote} I have added @Test and @After tags on these test cases. If there are other changes need to be done, please let me know. > Collecting cpu and memory usage for TaskTrackers > ------------------------------------------------ > > Key: MAPREDUCE-1218 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Affects Versions: 0.22.0 > Environment: linux > Reporter: Scott Chen > Assignee: Scott Chen > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch, > MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218.patch > > > The information can be used for resource aware scheduling. > Note that this is related to MAPREDUCE-220. There the per task resource > information is collected. > This one collects the per machine information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.