[ 
https://issues.apache.org/jira/browse/TEZ-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496800#comment-14496800
 ] 

Jason Lowe commented on TEZ-2319:
---------------------------------

MR does not dump the final state all at once; it works more like the
SimpleHistoryLogger.  The JobHistoryEventHandler logs job/task/attempt
start/stop events to the .jhist Avro file in the staging directory as the job
runs.  Once the job finishes, it copies that jhist file over to the done
intermediate directory for the job history server to pick up.  It does not dump
it all at once from memory when the job completes.
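
As a rough illustration of that flow (log events as they happen, then copy the
finished file somewhere a history server can find it), here is a minimal Python
sketch.  It is not the JobHistoryEventHandler code: the class name, the
directory names and the JSON-lines format (standing in for Avro) are all
assumptions made for the example.

```python
import json
import os
import shutil
import tempfile


class IncrementalHistoryLog:
    """Hypothetical sketch: append events as they occur, publish on completion."""

    def __init__(self, staging_dir, done_dir, job_id):
        # Both directories and the file naming are assumptions for this example.
        self.staging_path = os.path.join(staging_dir, job_id + ".jhist")
        self.done_path = os.path.join(done_dir, job_id + ".jhist")
        self._out = open(self.staging_path, "a", encoding="utf-8")

    def log(self, event_type, **fields):
        # One JSON record per line; the real file is Avro, not JSON.
        record = dict(fields, type=event_type)
        self._out.write(json.dumps(record, sort_keys=True) + "\n")
        self._out.flush()  # keep the on-disk file current while the job runs

    def finish(self):
        # Only now is the file copied to where the history server looks.
        self._out.close()
        shutil.copyfile(self.staging_path, self.done_path)
        return self.done_path


def demo():
    root = tempfile.mkdtemp()
    staging = os.path.join(root, "staging")
    done = os.path.join(root, "done_intermediate")
    os.makedirs(staging)
    os.makedirs(done)

    log = IncrementalHistoryLog(staging, done, "job_0001")
    log.log("JOB_STARTED")
    log.log("TASK_STARTED", task="t1")
    log.log("TASK_FINISHED", task="t1")
    log.log("JOB_FINISHED")

    with open(log.finish(), encoding="utf-8") as f:
        return [json.loads(line)["type"] for line in f]
```

Calling demo() returns the event types in the order they were logged, read
back from the copied file rather than from anything held in memory.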

Note that the MR AM builds the state over time in memory not because it's
logging to the jhist file along the way but because it has to provide a UI
while the job is running.  It could dump the contents to the jhist file all at
once when the job completes, but it also uses the jhist file as a recovery
mechanism in case the AM crashes, so the file has to be written as the job runs.
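
To make the recovery point concrete, here is a small Python sketch of
rebuilding state by replaying an event file after a crash.  The event names,
the JSON-lines format and the rule that only unfinished tasks are rerun are
assumptions for illustration, not a description of the actual MR recovery code.

```python
import json


def replay_history(lines):
    """Rebuild per-task state from previously logged events (hypothetical format)."""
    tasks = {}
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # A crash mid-write can leave a truncated final record; stop there.
            break
        task = event.get("task")
        if event["type"] == "TASK_STARTED":
            tasks[task] = "RUNNING"
        elif event["type"] == "TASK_FINISHED":
            tasks[task] = "SUCCEEDED"
    return tasks


def tasks_to_rerun(tasks):
    # Anything that never logged a finish event is treated as lost work.
    return sorted(t for t, state in tasks.items() if state != "SUCCEEDED")
```

This only works if events reach the file as they happen; a file written in one
shot at job completion would be empty at the moment a crashed AM needs it.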

I think we'd be OK dumping the events to a file as we get them, in a similar way
to how JobHistoryEventHandler works in the MR AM.  The biggest concern is that
adding multiple logging mechanisms increases the failure potential.  If we're
generating events faster than the two loggers can process them, then we'll start
buffering events and putting pressure on the AM heap.
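
The buffering concern can be shown with a toy simulation in Python.  It
assumes each logger has its own unbounded queue and drains a fixed number of
events per tick; both assumptions, and the function name, are made up for the
example and say nothing about how the Tez AM actually dispatches events.

```python
from collections import deque


def simulate_backlog(events_per_tick, drain_rates, ticks):
    """Return the total number of buffered events after each tick.

    drain_rates holds one entry per logger: how many events that logger
    can write out per tick.
    """
    queues = [deque() for _ in drain_rates]
    totals = []
    for tick in range(ticks):
        # Every logger receives every event.
        for q in queues:
            q.extend((tick, i) for i in range(events_per_tick))
        # Each logger drains at its own pace.
        for q, rate in zip(queues, drain_rates):
            for _ in range(min(rate, len(q))):
                q.popleft()
        totals.append(sum(len(q) for q in queues))
    return totals
```

With 5 events per tick and loggers that drain 5 and 3 per tick, the fast
logger keeps up while the slow one falls 2 events further behind every tick,
so the buffered total grows without bound for as long as the job runs.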

> DAG history in HDFS
> -------------------
>
>                 Key: TEZ-2319
>                 URL: https://issues.apache.org/jira/browse/TEZ-2319
>             Project: Apache Tez
>          Issue Type: New Feature
>            Reporter: Rohini Palaniswamy
>
>   We have processes that parse jobconf.xml and job history details (map and
> reduce task details, etc.) in Avro files from HDFS and load them into Hive
> tables for analysis of MapReduce jobs. We would like Tez to also write this
> information to a history file in HDFS when the AM or each DAG completes
> so that we can do analytics on Tez jobs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
