[
https://issues.apache.org/jira/browse/MAPREDUCE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797352#action_12797352
]
Scott Chen commented on MAPREDUCE-1218:
---------------------------------------
Vinod, Thank you very much for the careful review.
- LinuxResourceCalculatorPlugin.java
{quote}
- The method #getCpuUsage() in itself doesn't update the cpu-usage.
So, if one makes two calls to this method separated by a time interval, the
result won't reflect the updated cpu-usage unless calls to
#getCumulativeCpuTime() are explicitly made in between. This should be fixed by
calling #readProcStatFile() in this method also. The main method should be
modified to call this method twice and the tests should also verify this.
{quote}
I agree. getCpuUsage() should be able to be called independently. I have added
readProcStatFile() in this method. In testParsingProcStatAndCpuFile(), I have
called this function twice without calling getCumulativeCpuTime() to verify
that it can be used alone.
{quote}
- You've changed CPU_TIME_FORMAT from "^cpu[0-9]*[ \t]([0-9])[
\t]([0-9])[ \t]([0-9])[ \t]." to "^cpu[ \t]([0-9]*)[ \t]([0-9])[ \t]([0-9])[
\t].*". I guess the earlier is correct with cpu names not having any space/tab?
{quote}
Sorry, I should have explained this when I posted the patch. Originally I parse
cpu0 cpu1 ... cpuN and sum there utime, stime and ntime. But latter I have
found that cpu (without a number) contains the total of these values. So I have
change the pattern to just parse this one line. This change is confusing.
{quote}
- Why even have MINIMUM_UPDATE_INTERVAL in updateCpuUsage()? If this
is mainly used for making sure sampleTime is not equal to lastSampleTime, then
we can do so directly and remove MINIMUM_UPDATE_INTERVAL altogether.
{quote}
This is for preventing people keep calling getCumulativeCpuTime (and now
getCpuUsage) too often. Then the second update will have a very short update
window. This may result inaccurate values. In this case, we can just skip the
update and report the old value. Since the old value is still fresh in this
case, it will still be quite accurate.
- TaskTracker.java
{quote}
- To handle the deprecation of
mapreduce.tasktracker.memorycalculatorplugin in TaskTracker, for memory
calculations we should first try to use the class denoted by this configuration
if present, otherwise only we should fall back to the new resource-calculator.
To facilitate this we will also retain a deprecated
TTConfig.TT_MEMORY_CALCULATOR_PLUGIN constant.
{quote}
I agree. Now it is kept in the patch.
{quote}
- Nit: +661 "LOG.info(" Using MemoryCalculatorPlugin : " +
resourceCalculatorPlugin);" should instead be "LOG.info(" Using
ResourceCalculatorPlugin : " + resourceCalculatorPlugin);"
{quote}
Thank you for the careful review. It is fixed.
{quote}
- Please annotate Dummy\{Resource|Memory\}CalculatorPlugin classes as
@InterfaceAudience.Private because they both are only test specific.
{quote}
Have done the necessary change.
{quote}
- We should document mapreduce.tasktracker.resourcecalculatorplugin in
mapred-default.xml and remove the documentation for
mapreduce.tasktracker.memorycalculatorplugin from the same.
{quote}
Have done the necessary change.
{quote}
- Please convert TestLinuxResourceCalculatorPlugin and TestTTResourceReporting
into Junit 4 testcases.
{quote}
I have added @Test and @After tags on these test cases. If there are other
changes need to be done, please let me know.
> Collecting cpu and memory usage for TaskTrackers
> ------------------------------------------------
>
> Key: MAPREDUCE-1218
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1218
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Affects Versions: 0.22.0
> Environment: linux
> Reporter: Scott Chen
> Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1218-rename.sh, MAPREDUCE-1218-v2.patch,
> MAPREDUCE-1218-v3.patch, MAPREDUCE-1218-v4.patch, MAPREDUCE-1218.patch
>
>
> The information can be used for resource aware scheduling.
> Note that this is related to MAPREDUCE-220. There the per task resource
> information is collected.
> This one collects the per machine information.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.