[ https://issues.apache.org/jira/browse/MAPREDUCE-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161360#comment-14161360 ]
Zhijie Shen commented on MAPREDUCE-5933: ---------------------------------------- [~rkanter], thanks for being patient with my comments. The latest patch looks much better. I've applied the patch locally, and run through some MR examples. Here're some additional comments: bq. Alternatively, the REST API should be case insensitive URL should be case-sensitive: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.2.3 bq. Are you aware of any reason for that or if it's just an oversight and I should fix it? I'm not aware of it, but it seems that MR job is not generating this event except the rumen tool. IMHO, we can do the following instead of using TaskType as the entity type: at the task level, we only have one entity type - "MAPREDUCE_TASK", and put TaskType into the other info section if the information is available. The reason to put them under the same entity type is to make these TASK entities close to each other, and to be friendly to a range query of the tasks of one particular job. In fact, it seems that JOB_SETUP, JOB_CLEANUP, TASK_CLEANUP task event are not generated by the MR job. 1. Make keys be more consistent: "\[XXXX_\]COUNTERS" -> "\[XXXX_\]COUNTER_GROUPS", "name" -> "NAME", "displayName" -> "DISPLAY_NAME", "value" -> "VALUE". {code} public JsonNode countersToJSON(Counters counters) { {code} 2. The following event timestamp may be problematic, the timestamp should reflect the moment when the event happens instead of when the event is sent to the timeline server. Timeline server uses this timestamp to sort events. For example, if start event happens at 1 and is recorded at 4, while finish event happens at 2 and is recorded 3. We still need to consider start event happens before end event. It seems that most history events have the associated timestamp. {code} tEvent.setTimestamp(System.currentTimeMillis()); {code} 3. All the entities are put into the default domain, since we haven't specify one (new feature from YARN-2102). In fact, we should allow MR job to config the domain where it wants to put the entities, but let's do it in another Jira. 4. In testTimelineEventHandling, would you please set meaningful timestamp for the events to make them be ordered logically, and check getEntities's response preserve the order. 5. It's good to have a end-to-end test like NotificationTestCase if it is not really a big addition (e.g. NotificationTestCase). > Enable MR AM to post history events to the timeline server > ---------------------------------------------------------- > > Key: MAPREDUCE-5933 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5933 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: mr-am > Reporter: Zhijie Shen > Assignee: Robert Kanter > Attachments: MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, > MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, > mr_timelineserver_response.txt > > > Nowadays, MR AM collects the history events and writes it to HDFS for JHS to > source. With the timeline server, MR AM can put these events there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)