[ 
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742714#action_12742714
 ] 

Philip Zeyliger commented on MAPREDUCE-157:
-------------------------------------------

Avro would force you into a schema, and I think having a schema is the only 
way to get stability in the format.  Yes, there's probably overhead, but if 
we're using Avro for other things (i.e., all RPCs), we may as well fix those 
overheads when we get to them.  (It may also be a net win to store the data in 
binary Avro format, and write an "avrocat" to deserialize into text before 
pushing to tools like awk, but I do understand the desire for a text format.)
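
To make the "store binary, cat to text" idea concrete, here is a rough sketch.
It does not use Avro or its encoding: a toy length-prefixed format stands in
for the binary file, and the schema, field names, and function names are all
hypothetical. The point is only that a small "cat"-style reader can turn
schema'd binary records into one escaped text line per record before the data
reaches awk.

```python
import io
import struct

# Hypothetical fixed schema: every record carries these fields, in this order.
SCHEMA = ("job_id", "status", "message")


def write_record(out, record):
    """Write one record as length-prefixed UTF-8 fields (toy format, not Avro)."""
    for name in SCHEMA:
        data = record[name].encode("utf-8")
        out.write(struct.pack(">I", len(data)))
        out.write(data)


def read_records(inp):
    """Yield records from a binary stream written by write_record."""
    while True:
        record = {}
        for name in SCHEMA:
            header = inp.read(4)
            if not header and not record:
                return  # clean end of input between records
            if len(header) < 4:
                raise ValueError("truncated record")
            (length,) = struct.unpack(">I", header)
            data = inp.read(length)
            if len(data) < length:
                raise ValueError("truncated record")
            record[name] = data.decode("utf-8")
        yield record


def to_text_line(record):
    """Render a record as one tab-separated line, escaping tabs and newlines."""
    def esc(value):
        return (value.replace("\\", "\\\\")
                     .replace("\t", "\\t")
                     .replace("\n", "\\n"))
    return "\t".join(esc(record[name]) for name in SCHEMA)
```

A reader built this way could be piped straight into awk with a tab field
separator, since each record is guaranteed to occupy exactly one line.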

All that said, you have specific needs in mind here, and I'm mostly waxing 
poetical, so I'll certainly defer.

-- Philip

> Job History log file format is not friendly for external tools.
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-157
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Owen O'Malley
>            Assignee: Jothi Padmanabhan
>
> Currently, parsing the job history logs with external tools is very difficult 
> because of the format. The most critical problem is that newlines aren't 
> escaped in the strings. That makes using tools like grep, sed, and awk very 
> tricky.
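
As an illustration of the newline problem described above, the following is a
minimal sketch of escaping values so that each record stays on a single line.
The KEY="value" layout, the event and field names, and the function names are
assumptions made for this example; this is not the actual job history writer.

```python
def escape_value(value: str) -> str:
    """Escape backslashes, double quotes, and line breaks in a value."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def unescape_value(escaped: str) -> str:
    """Reverse escape_value, reading one backslash sequence at a time."""
    out = []
    chars = iter(escaped)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, "")
        # "\n" and "\r" map back to line breaks; any other escaped
        # character (backslash, quote) stands for itself.
        out.append({"n": "\n", "r": "\r"}.get(nxt, nxt))
    return "".join(out)


def format_record(event: str, fields: dict) -> str:
    """Render one event as a single line of KEY="value" pairs."""
    pairs = " ".join(f'{key}="{escape_value(val)}"' for key, val in fields.items())
    return f"{event} {pairs}"
```

With values escaped like this, a multi-line error message no longer splits a
record across lines, so line-oriented tools such as grep see one record per
line.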

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
