My reason for wanting to revert the change that moved job history to Avro is more straightforward. Text is good. It has always been good. Even average humans can read text. And job history is something we, as average humans, read a hundred times a day, along with the job configuration. (We also think that every Hadoop user should read their job history, which contains a lot of information about their jobs.)
I have already made my dissatisfaction with small stuff stored as compact binaries known on a very public forum, Facebook.

On Sep 21, 2009, at 20:42, "Jothi Padmanabhan (JIRA)" <j...@apache.org> wrote:

>
> [ https://issues.apache.org/jira/browse/MAPREDUCE-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758109#action_12758109 ]
>
> Jothi Padmanabhan commented on MAPREDUCE-1016:
> ----------------------------------------------
>
> I think storing it in JSON is a good idea too. Nevertheless, I think
> the best way for other history consumers/tools to insulate
> themselves from the underlying changes/incompatibilities would be to
> use the JobHistoryParsing API's. At the least, they will abstract
> out the underlying storage format.
>
>> Make the format of the Job History be JSON instead of Avro binary
>> -----------------------------------------------------------------
>>
>> Key: MAPREDUCE-1016
>> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1016
>> Project: Hadoop Map/Reduce
>> Issue Type: Bug
>> Reporter: Owen O'Malley
>> Assignee: Doug Cutting
>> Fix For: 0.21.0, 0.22.0
>>
>>
>> I forgot that one of the features that would be nice is to off load
>> the job history display from the JobTracker. That will be a lot
>> easier, if the job history is stored in JSON. Therefore, I think we
>> should change the storage now to prevent incompatibilities later.