Hi,

Depending on the response you get here, you might also post the question separately on avro-user.

On Sat, Nov 26, 2011 at 1:46 PM, Leonardo Urbina <lurb...@mit.edu> wrote:
> Hey everyone,
>
> First time posting to the list. I'm currently writing a Hadoop job that
> will run daily and whose output will be part of the next day's
> input. Also, the output will potentially be read by other programs for
> later analysis.
>
> Since my program's output is used as part of the next day's input, it would
> be nice if it was stored in some binary format that is easy to read the
> next time around. But this format also needs to be readable by other
> outside programs, not necessarily written in Java. After searching for a
> while it seems that Avro is what I want to be using. In any case, I have
> been looking around for a while and I can't seem to find a single example
> of how to use Avro within a Hadoop job.
>
> It seems that in order to use Avro I need to change the io.serializations
> value, however I don't know which value should be specified. Furthermore, I
> found that there are classes Avro{Input,Output}Format but these use a
> series of other Avro classes which, as far as I understand, seem to need the
> use of other Avro classes such as AvroWrapper, AvroKey, AvroValue, and as
> far as I am concerned Avro* (with * replaced with pretty much any Hadoop
> class name). It seems however that these are used so that the Avro format
> is used throughout the Hadoop process to pass objects around.
>
> I just want to use Avro to save my output and read it again as input next
> time around. So far I have been using SequenceFile{Input,Output}Format, and
> have implemented the Writable interface in the relevant classes, however
> this is not portable to other languages. Is there a way to use Avro without
> a substantial rewrite (using Avro* classes) of my Hadoop job? Thanks in
> advance,
>
> Best,
> -Leo
>
> --
> Leo Urbina
> Massachusetts Institute of Technology
> Department of Electrical Engineering and Computer Science
> Department of Mathematics
> lurb...@mit.edu