[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180283#comment-14180283
 ] 

Zhijie Shen commented on MAPREDUCE-5933:
----------------------------------------

The latest patch looks much better. I have a few minor comments on it.
1. SUBMIT_TIME is set twice
{code}
        tEvent.addEventInfo("SUBMIT_TIME", jse.getSubmitTime());
        tEvent.addEventInfo("QUEUE_NAME", jse.getJobQueueName());
        tEvent.addEventInfo("JOB_NAME", jse.getJobName());
        tEvent.addEventInfo("USER_NAME", jse.getUserName());
        tEvent.addEventInfo("SUBMIT_TIME", jse.getSubmitTime());
{code}

2. Make "MAPREDUCE_JOB" and "MAPREDUCE_TASK" constants?

3. Would you please call toString() on the task attempt ID object? Otherwise, it 
will result in a nested structure in the JSON content. The same applies to the 
other getXXXAttemptId() calls that are not followed by toString().
{code}
        tEvent.addEventInfo("SUCCESSFUL_TASK_ATTEMPT_ID",
                tfe2.getSuccessfulTaskAttemptId());
{code}
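
To illustrate comments 2 and 3 together, here is a minimal sketch. The class 
EventInfoSketch, the AttemptId type, and the constant names are hypothetical 
stand-ins rather than the real Hadoop classes; the point is only that the value 
stored under the key is a flat String.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are not the real Hadoop classes.
class EventInfoSketch {
    // Entity type names kept as constants instead of repeated string literals.
    // The constant names are assumptions.
    static final String MAPREDUCE_JOB_ENTITY_TYPE = "MAPREDUCE_JOB";
    static final String MAPREDUCE_TASK_ENTITY_TYPE = "MAPREDUCE_TASK";

    // Minimal stand-in for a task attempt ID object made of several fields.
    static final class AttemptId {
        final long clusterTimestamp;
        final int jobSeq;
        final char taskType;
        final int taskSeq;
        final int attempt;

        AttemptId(long clusterTimestamp, int jobSeq, char taskType, int taskSeq, int attempt) {
            this.clusterTimestamp = clusterTimestamp;
            this.jobSeq = jobSeq;
            this.taskType = taskType;
            this.taskSeq = taskSeq;
            this.attempt = attempt;
        }

        @Override
        public String toString() {
            return String.format(Locale.ROOT, "attempt_%d_%04d_%c_%06d_%d",
                    clusterTimestamp, jobSeq, taskType, taskSeq, attempt);
        }
    }

    // Stores the ID as a String so a serializer sees a scalar, not a nested object.
    static Map<String, Object> successfulAttemptInfo(AttemptId id) {
        Map<String, Object> info = new LinkedHashMap<>();
        info.put("SUCCESSFUL_TASK_ATTEMPT_ID", id.toString());
        return info;
    }

    public static void main(String[] args) {
        AttemptId id = new AttemptId(1413998833197L, 1, 'm', 0, 0);
        System.out.println(MAPREDUCE_TASK_ENTITY_TYPE + " " + successfulAttemptInfo(id));
    }
}
```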

4. In addition to setting the related entity, it's better to add the job ID to 
the primary filters of an MR task entity, so that we can support a common query 
such as the following:
{code}
http://localhost:8188/ws/v1/timeline/MAPREDUCE_TASK?primaryFilter=PARENT_JOB:job_1413998833197_0001
{code}
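
A rough sketch of the idea, using a plain in-memory model rather than the 
timeline client API: PrimaryFilterSketch, TaskEntity, and the helper methods are 
hypothetical names, and only the PARENT_JOB key and the URL shape come from the 
query above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model; not the real timeline entity class.
class PrimaryFilterSketch {
    static final String PARENT_JOB = "PARENT_JOB";

    static final class TaskEntity {
        final String entityId;
        final Map<String, Set<Object>> primaryFilters = new HashMap<>();

        TaskEntity(String entityId) {
            this.entityId = entityId;
        }

        void addPrimaryFilter(String key, Object value) {
            primaryFilters.computeIfAbsent(key, k -> new HashSet<>()).add(value);
        }
    }

    // Tags a task entity with the ID of the job it belongs to.
    static TaskEntity newTaskEntity(String taskId, String jobId) {
        TaskEntity entity = new TaskEntity(taskId);
        entity.addPrimaryFilter(PARENT_JOB, jobId);
        return entity;
    }

    // What a primaryFilter=PARENT_JOB:<jobId> query selects, modeled as a scan.
    static List<String> tasksOfJob(List<TaskEntity> all, String jobId) {
        List<String> ids = new ArrayList<>();
        for (TaskEntity entity : all) {
            Set<Object> values = entity.primaryFilters.get(PARENT_JOB);
            if (values != null && values.contains(jobId)) {
                ids.add(entity.entityId);
            }
        }
        return ids;
    }

    // Builds the query URL in the shape shown above.
    static String queryUrl(String baseUrl, String jobId) {
        return baseUrl + "/ws/v1/timeline/MAPREDUCE_TASK?primaryFilter=" + PARENT_JOB + ":" + jobId;
    }

    public static void main(String[] args) {
        System.out.println(queryUrl("http://localhost:8188", "job_1413998833197_0001"));
    }
}
```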

In fact, there could be other optimizations to speed up the potential 
queries. For example, to answer a JHS query such as the following:
{code}
http://10.22.2.115:19888/jobhistory/attempts/job_1413998833197_0001/m/SUCCESSFUL
{code}
It would also be good to put the task type and the task's final state into the 
other info field for in-memory filtering, or even into the primary filter field 
so that they are indexed in the store (which costs much more storage space).
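
As a sketch of the in-memory filtering option only: the class below is a 
hypothetical model, and the other info keys TASK_TYPE and STATE are assumed 
names, not keys taken from the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of filtering on "other info" after entities are loaded.
class OtherInfoFilterSketch {
    static final class TaskEntity {
        final String entityId;
        final Map<String, Object> otherInfo = new HashMap<>();

        TaskEntity(String entityId, String taskType, String state) {
            this.entityId = entityId;
            otherInfo.put("TASK_TYPE", taskType); // assumed key name
            otherInfo.put("STATE", state);        // assumed key name
        }
    }

    // Scans already-fetched entities; no store index is involved.
    static List<String> select(List<TaskEntity> tasks, String taskType, String state) {
        List<String> ids = new ArrayList<>();
        for (TaskEntity task : tasks) {
            if (taskType.equals(task.otherInfo.get("TASK_TYPE"))
                    && state.equals(task.otherInfo.get("STATE"))) {
                ids.add(task.entityId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<TaskEntity> tasks = new ArrayList<>();
        tasks.add(new TaskEntity("task_1413998833197_0001_m_000000", "MAP", "SUCCEEDED"));
        tasks.add(new TaskEntity("task_1413998833197_0001_m_000001", "MAP", "FAILED"));
        System.out.println(select(tasks, "MAP", "SUCCEEDED"));
    }
}
```

The trade-off is the one described above: this scan costs nothing extra in the 
store but touches every fetched entity, whereas a primary filter is indexed at 
the price of additional storage.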

I think we should do the store schema optimization for these particular 
queries in a separate Jira, as it does not seem to be a straightforward addition 
to this patch. Let's focus on posting events in this one.

The patch works properly on an insecure cluster. I will try it on a 
secure cluster too.

> Enable MR AM to post history events to the timeline server
> ----------------------------------------------------------
>
>                 Key: MAPREDUCE-5933
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5933
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: mr-am
>            Reporter: Zhijie Shen
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, 
> MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, MAPREDUCE-5933.patch, 
> MAPREDUCE-5933.patch, mr_timelineserver_response.txt
>
>
> Nowadays, MR AM collects the history events and writes it to HDFS for JHS to 
> source. With the timeline server, MR AM can put these events there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)