[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161360#comment-14161360
 ] 

Zhijie Shen commented on MAPREDUCE-5933:
----------------------------------------

[~rkanter], thanks for being patient with my comments. The latest patch looks 
much better. I've applied the patch locally, and run through some MR examples. 
Here're some additional comments:

bq.  Alternatively, the REST API should be case insensitive

URL should be case-sensitive: 
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.2.3

bq. Are you aware of any reason for that or if it's just an oversight and I 
should fix it?

I'm not aware of it, but it seems that MR job is not generating this event 
except the rumen tool.

IMHO, we can do the following instead of using TaskType as the entity type: at 
the task level, we only have one entity type - "MAPREDUCE_TASK", and put 
TaskType into the other info section if the information is available. The 
reason to put them under the same entity type is to make these TASK entities 
close to each other, and to be friendly to a range query of the tasks of one 
particular job. In fact, it seems that JOB_SETUP, JOB_CLEANUP, TASK_CLEANUP 
task event are not generated by the MR job.

1. Make keys be more consistent: "\[XXXX_\]COUNTERS" ->  
"\[XXXX_\]COUNTER_GROUPS", "name" -> "NAME", "displayName" -> "DISPLAY_NAME", 
"value" -> "VALUE".
{code}
  public JsonNode countersToJSON(Counters counters) {
{code}

2. The following event timestamp may be problematic, the timestamp should 
reflect the moment when the event happens instead of when the event is sent to 
the timeline server. Timeline server uses this timestamp to sort events. For 
example, if start event happens at 1 and is recorded at 4, while finish event 
happens at 2 and is recorded 3. We still need to consider start event happens 
before end event. It seems that most history events have the associated 
timestamp.
{code}
    tEvent.setTimestamp(System.currentTimeMillis());
{code}

3. All the entities are put into the default domain, since we haven't specify 
one (new feature from YARN-2102). In fact, we should allow MR job to config the 
domain where it wants to put the entities, but let's do it in another Jira.

4. In testTimelineEventHandling, would you please set meaningful timestamp for 
the events to make them be ordered logically, and check getEntities's response 
preserve the order.

5. It's good to have a end-to-end test like NotificationTestCase if it is not 
really a big addition (e.g. NotificationTestCase).

> Enable MR AM to post history events to the timeline server
> ----------------------------------------------------------
>
>                 Key: MAPREDUCE-5933
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5933
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: mr-am
>            Reporter: Zhijie Shen
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, 
> MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, 
> mr_timelineserver_response.txt
>
>
> Nowadays, MR AM collects the history events and writes it to HDFS for JHS to 
> source. With the timeline server, MR AM can put these events there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to