
Re: Avro Schema + GenericRecord to HadoopRDD

2014-07-30 Thread Laird, Benjamin

That makes sense, thanks Chris. I'm currently reworking my code to use the newAPIHadoopRDD with an AvroSequenceFileInputFormat (see below), but I think I'll run into the same issue. I'll give your suggestion a try.

val avroRdd = sc.newAPIHadoopFile(fp, classOf[AvroSequenceFileInputFormat[AvroKey[

RE: Avro Schema + GenericRecord to HadoopRDD

2014-07-29 Thread Severs, Chris

Hi Benjamin,

I think the best bet would be to use Avro's code generation to generate a SpecificRecord class for your schema, and then change the reader to use that specific type rather than GenericRecord. Trying to read the generic record and then do type inference and spit out a tuple is
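
For readers following the thread, here is a minimal sketch of what the SpecificRecord approach might look like in Spark. It is not code from either poster. It assumes a hypothetical class `User` (with hypothetical getters `getName` and `getFavoriteNumber`) generated from your schema by avro-tools or the Avro Maven plugin, and it assumes the input is a plain Avro container file read with AvroKeyInputFormat, not the AvroSequenceFileInputFormat mentioned above. The object name and the input path argument are placeholders too.

```scala
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.{AvroJob, AvroKeyInputFormat}
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

// ASSUMPTION: `User` is a hypothetical SpecificRecord class generated from
// your .avsc schema; it is not defined here and must be on the classpath.
object ReadSpecificAvro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("read-specific-avro"))

    // Register the reader schema so the input format decodes into `User`.
    val job = Job.getInstance(sc.hadoopConfiguration)
    AvroJob.setInputKeySchema(job, User.getClassSchema)

    // Keys are AvroKey[User]; values are NullWritable and carry no data.
    val records = sc.newAPIHadoopFile(
      args(0), // ASSUMPTION: path to the Avro input, passed on the command line
      classOf[AvroKeyInputFormat[User]],
      classOf[AvroKey[User]],
      classOf[NullWritable],
      job.getConfiguration)

    // Extract plain Scala values right away. Hadoop record readers may reuse
    // the same key object, so avoid holding on to the AvroKey itself.
    val tuples = records.map { case (key, _) =>
      val user = key.datum()
      (user.getName.toString, user.getFavoriteNumber) // hypothetical getters
    }

    tuples.take(5).foreach(println)
    sc.stop()
  }
}
```

With a generated class the field types are known at compile time, so the map step can build the tuple from typed getters and no inference over GenericRecord fields is needed.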