[jira] [Commented] (MAPREDUCE-6337) add a mode to replay MR job history files to the timeline service

Zhijie Shen (JIRA) Mon, 04 May 2015 15:10:57 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527436#comment-14527436 ]


Zhijie Shen commented on MAPREDUCE-6337:
----------------------------------------

Sangjin, thanks for the patch. Here are some high-level comments:

1. I have a concern about how the MR job history is replayed. The current 
approach reads all of a job's history, converts it into entities, and writes 
them in a single call per job. This may not reflect a realistic workload 
pattern; at the least, it differs from the way MR currently puts timeline 
data. Shall we add an option to choose among 1) putting all entities once per 
job, 2) putting one entity per call, and 3) repeatedly putting the entity once 
per event? The third option is closest to the current MR put behavior, though 
that is not necessarily the optimal approach. The options may well differ in 
write performance.
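
To make the three options concrete, here is a minimal sketch of how such a
switch might look. It does not use the real timeline service API: WriteMode,
Entity, Writer and replay are hypothetical names made up for illustration, and
option 3 is assumed to mean one put per event, each carrying the entity with
only that event attached.

```java
import java.util.Collections;
import java.util.List;

public class ReplayWriteModes {
    /** Hypothetical option values, one per write pattern discussed above. */
    public enum WriteMode { ALL_ENTITIES_PER_JOB, ONE_ENTITY_PER_CALL, ONE_PUT_PER_EVENT }

    /** Hypothetical stand-in for a timeline entity: an id plus its event names. */
    public static final class Entity {
        public final String id;
        public final List<String> events;

        public Entity(String id, List<String> events) {
            this.id = id;
            this.events = events;
        }
    }

    /** Hypothetical sink; each invocation models one put call to the storage. */
    public interface Writer {
        void put(List<Entity> batch);
    }

    /** Writes the entities of one job according to the selected mode. */
    public static void replay(List<Entity> jobEntities, WriteMode mode, Writer writer) {
        switch (mode) {
            case ALL_ENTITIES_PER_JOB:
                // 1) one put call carrying every entity of the job
                writer.put(jobEntities);
                break;
            case ONE_ENTITY_PER_CALL:
                // 2) one put call per entity, with all of its events attached
                for (Entity e : jobEntities) {
                    writer.put(Collections.singletonList(e));
                }
                break;
            case ONE_PUT_PER_EVENT:
                // 3) one put call per event (assumption: the entity is re-put
                //    with only that single event attached)
                for (Entity e : jobEntities) {
                    for (String event : e.events) {
                        Entity partial = new Entity(e.id, Collections.singletonList(event));
                        writer.put(Collections.singletonList(partial));
                    }
                }
                break;
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    /** Counts how many put calls a given mode issues for one job. */
    public static int countPuts(List<Entity> jobEntities, WriteMode mode) {
        int[] calls = {0};
        replay(jobEntities, mode, batch -> calls[0]++);
        return calls[0];
    }
}
```

With a counting writer like countPuts, the number of put calls per job grows
from one, to the number of entities, to the total number of events, which is
the difference the performance test would presumably want to measure.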

2. TimelineEntityConverter does something similar to what we did in 
MAPREDUCE-6237, but in a slightly different way, and the entity composition 
also differs slightly, e.g. counters are saved as metrics. I think the reason 
the MAPREDUCE-6237 code cannot be reused directly is that here we convert 
XXXXInfo to entities, while MAPREDUCE-6237 converts XXXXEvent to entities. 
Perhaps we want to refactor the code and consolidate the conversion later.

> add a mode to replay MR job history files to the timeline service
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-6337
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6337
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: MAPREDUCE-6337-YARN-2928.001.patch, YARN-3438.000.patch
>
>
> The subtask covers the work on top of YARN-3437 to add a mode to replay MR 
> job history files to the timeline service storage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
