[ 
https://issues.apache.org/jira/browse/MAPREDUCE-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733522#action_12733522
 ] 

Jiaqi Tan commented on MAPREDUCE-751:
-------------------------------------

Is this a request for an implementation, or is this a project that is already 
midway through implementation, with (most of) the planned features, and due to 
be released soon? There is some work on the Chukwa side on extracting 
and modeling job behavior from job history logs, currently for visualization, 
but we are also working on mathematically quantifying job behavior. It 
would be interesting to see what synergies there are in using job history data.

> Rumen: a tool to extract job characterization data from job tracker logs
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-751
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-751
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Dick King
>
>  We propose a new map/reduce component, rumen, which can be used to process 
> job history logs to produce any or all of the following:
>       * Retrospective statistics on how long it would have taken to
> launch a job into a certain percentage of the mapper slots in the
> log's cluster, given the load over the period covered by the log
>       * Statistical info as to the runtimes and shuffle times, etc. of
> the tasks and jobs covered by the log
>       * Files describing detailed job trace information, and the
> network topology as inferred from the host locations and rack IDs that
> appear in the job tracker log.  In addition to this facility, rumen
> includes readers for this information to return job and detailed task
> information to other tools.
>         These other tools include a more advanced version of gridmix, as 
> well as mumak: see blocked issues.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
