[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909988#action_12909988
 ] 

Dick King commented on MAPREDUCE-2037:
--------------------------------------

Benchmarks that support more realistic validation of putative scheduler 
improvements would benefit from a gridmix3-like tool that can simulate the CPU 
usage patterns of the tasks of the emulated jobs.  This includes both the 
average loads of the various tasks and their variation over time.  To develop 
this information, we need to capture the CPU usage of each task over time.

Fortunately, on Linux systems there's a way to capture this: the 
{{/proc/n/stat}} information appears to include everything I need.
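As a minimal sketch (not Hadoop code), the cumulative user+system CPU time of a process can be pulled out of its {{/proc/<pid>/stat}} line; field positions follow proc(5), and the comm field may itself contain spaces, so parsing starts after the closing parenthesis:

```java
import java.nio.file.Files;
import java.nio.file.Paths;

public class ProcStatSample {
    /**
     * Parse cumulative user+system CPU jiffies from a /proc/<pid>/stat line.
     * The comm field (field 2) may contain spaces and parentheses, so we
     * split only the text after the last ')'.
     */
    static long cpuJiffies(String statLine) {
        String afterComm = statLine.substring(statLine.lastIndexOf(')') + 2);
        String[] f = afterComm.split(" ");
        // utime and stime are fields 14 and 15 of the full line; after
        // dropping pid and comm they sit at indices 11 and 12 here.
        return Long.parseLong(f[11]) + Long.parseLong(f[12]);
    }

    public static void main(String[] args) throws Exception {
        String line = new String(Files.readAllBytes(Paths.get("/proc/self/stat")));
        System.out.println("cumulative cpu jiffies: " + cpuJiffies(line));
    }
}
```

Sampling this value at two points in time and differencing gives the CPU consumed over the interval.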

I would plumb this through {{LinuxResourceCalculatorPlugin}} and {{TaskStatus}}.

The information will be placed in the job history files, in the task attempt 
end records.  It might be encoded as a character string of a few dozen 
characters.
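One hypothetical encoding (nothing in this issue fixes the format): pack each per-interval CPU average into a fixed two-character base-36 pair, so a few dozen characters cover a few dozen samples. The class and value range below are assumptions for illustration only:

```java
/**
 * Hypothetical codec: each sample (0..1295) becomes two base-36 digits,
 * so n samples encode to a 2n-character string suitable for a history record.
 */
public class CpuUsageCodec {
    static String encode(int[] samples) {
        StringBuilder sb = new StringBuilder();
        for (int s : samples) {
            sb.append(Character.forDigit(s / 36, 36));  // high base-36 digit
            sb.append(Character.forDigit(s % 36, 36));  // low base-36 digit
        }
        return sb.toString();
    }

    static int[] decode(String coded) {
        int[] out = new int[coded.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = Character.digit(coded.charAt(2 * i), 36) * 36
                   + Character.digit(coded.charAt(2 * i + 1), 36);
        }
        return out;
    }
}
```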

> Capturing interim progress times, CPU usage, and memory usage, when tasks 
> reach certain progress thresholds
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2037
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2037
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Dick King
>            Assignee: Dick King
>             Fix For: 0.22.0
>
>
> We would like to capture the following information at certain progress 
> thresholds as a task runs:
>    * Time taken so far
>    * CPU load [either at the time the data are taken, or exponentially 
> smoothed]
>    * Memory load [also either at the time the data are taken, or 
> exponentially smoothed]
> This would be taken at intervals that depend on the task progress plateaus.  
> For example, reducers have three progress ranges -- [0-1/3], (1/3-2/3], and 
> (2/3-3/3] -- where fundamentally different activities happen.  Mappers have 
> different boundaries, I understand, that are not symmetrically placed.  Data 
> capture boundaries should coincide with activity boundaries.  For the state 
> information capture [CPU and memory] we should average over the covered 
> interval.
> This data would flow in with the heartbeats.  It would be placed in the job 
> history as part of the task attempt completion event, so it could be 
> processed by rumen or some similar tool and could drive a benchmark engine.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
