[ 
https://issues.apache.org/jira/browse/YARN-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831275#comment-13831275
 ] 

Jeff Zhang commented on YARN-1440:
----------------------------------

@ledion, the current implementation creates one TFile per application, while 
your method would create one TFile per container, which would generate more files.
I guess the original author adopted TFile because it has an index block that 
lets users quickly find a value. That way, a user can quickly find the log of 
one container of an application.
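
To illustrate the index idea, here is a toy sketch in Python. It is not the 
real TFile format or API: the names write_indexed and read_one are made up, 
and the index is kept as a separate dict for simplicity. It only shows how a 
key-to-offset index lets a reader seek straight to one container's log 
inside a single aggregated file.

```python
import io


def write_indexed(records):
    """Concatenate per-container logs into one blob and build an index.

    records: dict mapping a container id (str) to its log bytes.
    Returns (blob, index), where index maps container id -> (offset, length).
    Hypothetical helper, not part of Hadoop.
    """
    buf = io.BytesIO()
    index = {}
    for container_id, data in records.items():
        # Remember where this container's log starts and how long it is.
        index[container_id] = (buf.tell(), len(data))
        buf.write(data)
    return buf.getvalue(), index


def read_one(blob, index, container_id):
    """Read a single container's log without scanning the other entries."""
    offset, length = index[container_id]
    stream = io.BytesIO(blob)
    stream.seek(offset)
    return stream.read(length)


blob, index = write_indexed({
    "container_01": b"first container log\n",
    "container_02": b"second container log\n",
})
print(read_one(blob, index, "container_02"))
```

With one file per container the lookup is just a path, but the file count 
grows with the number of containers, which is the trade-off described above.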

> Yarn aggregated logs are difficult for external tools to understand
> -------------------------------------------------------------------
>
>                 Key: YARN-1440
>                 URL: https://issues.apache.org/jira/browse/YARN-1440
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: ledion bitincka
>              Labels: log-aggregation, logs, tfile, yarn
>
> The log aggregation feature in Yarn is awesome! However, the file format 
> into which the log files are aggregated (TFile) should either be much 
> simpler or be made pluggable. The current TFile format forces anyone who 
> wants to see the files to either 
> a) use the web UI
> b) use the CLI tools (yarn logs)  or 
> c) write custom code to read the files 
> My suggestion would be to simplify the log collection by collecting and 
> writing the raw log files into a directory structure as follows: 
> {noformat}
> /{log-collection-dir}/{app-id}/{container-id}/{log-file-name} 
> {noformat}
> This way the application developers can (re)use a much wider array of tools 
> to process the logs. 
> For readers who are not familiar with the logs and their format, more info 
> is available in the following two blog posts:
> http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/
> http://blogs.splunk.com/2013/11/18/hadoop-2-0-rant/
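
For reference, the directory layout proposed in the description could be 
sketched as below. This is only an illustration: raw_log_path is a 
hypothetical helper, and the directory name and ids in the example are 
made-up placeholders, not values taken from a real cluster.

```python
import posixpath


def raw_log_path(log_collection_dir, app_id, container_id, log_file_name):
    """Build /{log-collection-dir}/{app-id}/{container-id}/{log-file-name}.

    Hypothetical helper for illustration; not part of YARN.
    """
    return posixpath.join(
        "/", log_collection_dir, app_id, container_id, log_file_name
    )


# Placeholder ids, chosen only to show the shape of the resulting path.
print(raw_log_path("app-logs", "app_0001", "container_0001_01", "stderr"))
```

With such a layout, one container's log is addressed by its path alone, so 
generic file tools can read it without understanding TFile.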



--
This message was sent by Atlassian JIRA
(v6.1#6144)
