The Avro documentation here says it is possible but doesn't say how to configure the AvroJob in the driver.
http://avro.apache.org/docs/1.7.4/api/java/org/apache/avro/mapreduce/AvroMultipleOutputs.html

-Nishanth

On Thu, Jun 25, 2015 at 4:10 PM, Sam Groth <sgr...@yahoo-inc.com> wrote:

> Looking at the example (http://avro.apache.org/docs/current/mr.html), I
> don't think it would be possible to configure multiple output schemas in
> one job. A JobConf can only set one writer schema with one output path
> (http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/JobConf.html).
> I believe it is required that all output data from a job have the same
> schema. I have not seen any use case where a map reduce job can have
> multiple output schemas.
>
> Sam
>
> On Thursday, June 25, 2015 4:35 PM, Nishanth S <chinchu2...@gmail.com> wrote:
>
> Thank you, Sam. I am trying to read a single binary file in map reduce
> and split it into 4 Avro files, each with a different schema. I am trying
> to do this in one job, but I am still not sure how to specify multiple
> output schemas on an AvroJob instance. Do we need to create multiple
> instances of AvroJob in the map reduce driver to do this?
>
> Thanks,
> Nishan
>
> On Thu, Jun 25, 2015 at 2:53 PM, Sam Groth <sgr...@yahoo-inc.com> wrote:
>
> If you process 4 files with schemas A, B, C, and D as the writer schemas,
> then I would assume that you would want to specify the reader schema using
> the setInput*Schema methods. Then you can set the writer schema with the
> methods that you are calling. To be clear, all data processed by the job
> should have one reader schema that is determined when the data is read,
> and there should also be one writer schema (possibly different from the
> reader schema) when the data is written back to files. If you need to
> process the data from each schema independently, you should probably
> create one job for each schema.
> Disclaimer: I have never used the AvroJob interface directly, so this is
> just me inferring what I think it should do based on my experience with
> AvroStorage and the other language-specific Avro interfaces.
>
> Hope this helps,
> Sam
>
> On Thursday, June 25, 2015 12:53 PM, Nishanth S <chinchu2...@gmail.com> wrote:
>
> Hello All,
>
> We are using Avro 1.7.7 and Hadoop 2.5.1 in our project. We need to
> process a mixed-mode binary file using map reduce and produce multiple
> Avro files as output, each with a different Avro schema. I looked at the
> AvroMultipleOutputs class but did not completely understand what needs to
> be done in the driver class. This is a map-only job whose output should
> be 4 different Avro files (with different Avro schemas) written to
> different HDFS directories.
>
> Do we need to set all key and value Avro schemas on AvroJob in the driver
> class?
>
> AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.NULL));
> AvroJob.setOutputValueSchema(job, A.getClassSchema());
>
> Now if I have schemas B, C and D, how would these be set on AvroJob?
> Thanks for your help.
>
> Thanks,
> Nishan
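[Editor's note] Going by the AvroMultipleOutputs javadoc linked at the top of the thread, the driver would register one named output per schema via AvroMultipleOutputs.addNamedOutput rather than calling AvroJob.setOutput*Schema four times, and the mapper would route each decoded record to the matching named output. The sketch below is untested and inferred from the javadoc only: the generated record classes A–D come from the thread, while SplitDriver, the named-output names, the base output paths, and the decodeRecord() stub are placeholders for illustration.

```java
import java.io.IOException;

import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapreduce.AvroKeyOutputFormat;
import org.apache.avro.mapreduce.AvroMultipleOutputs;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;

public class SplitDriver {

  public static class SplitMapper
      extends Mapper<LongWritable, BytesWritable, AvroKey<Object>, NullWritable> {

    private AvroMultipleOutputs amos;

    @Override
    protected void setup(Context context) {
      amos = new AvroMultipleOutputs(context);
    }

    // Placeholder: parsing the mixed-mode binary record into one of the
    // generated classes A, B, C, or D depends on the file format.
    private Object decodeRecord(BytesWritable bytes) {
      throw new UnsupportedOperationException("format-specific decoding goes here");
    }

    @Override
    protected void map(LongWritable offset, BytesWritable record, Context context)
        throws IOException, InterruptedException {
      Object decoded = decodeRecord(record);
      if (decoded instanceof A) {
        // The last argument is a base output path, so each schema's files
        // land in their own subdirectory of the job output directory.
        amos.write("outA", new AvroKey<Object>(decoded), NullWritable.get(), "a/part");
      } else if (decoded instanceof B) {
        amos.write("outB", new AvroKey<Object>(decoded), NullWritable.get(), "b/part");
      } // ... likewise for C and D
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
      amos.close(); // flushes and closes all named outputs
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance();
    job.setJarByClass(SplitDriver.class);
    job.setMapperClass(SplitMapper.class);
    job.setNumReduceTasks(0); // map-only job
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // One named output per schema; the name is what the mapper passes
    // to AvroMultipleOutputs.write().
    AvroMultipleOutputs.addNamedOutput(job, "outA", AvroKeyOutputFormat.class, A.getClassSchema());
    AvroMultipleOutputs.addNamedOutput(job, "outB", AvroKeyOutputFormat.class, B.getClassSchema());
    AvroMultipleOutputs.addNamedOutput(job, "outC", AvroKeyOutputFormat.class, C.getClassSchema());
    AvroMultipleOutputs.addNamedOutput(job, "outD", AvroKeyOutputFormat.class, D.getClassSchema());

    // Route the default (unnamed) output lazily so no empty part files appear.
    LazyOutputFormat.setOutputFormatClass(job, AvroKeyOutputFormat.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

With this approach a single AvroJob-style schema setting is never needed: each named output carries its own writer schema, which appears to be exactly the per-directory, per-schema split the original question asks for.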