That makes sense, thanks Chris.
I'm currently reworking my code to use newAPIHadoopFile with an
AvroSequenceFileInputFormat (see below), but I think I'll run into the
same issue. I'll give your suggestion a try.
val avroRdd = sc.newAPIHadoopFile(fp,
classOf[AvroSequenceFileInputFormat[AvroKey[
Hi Benjamin,
I think the best bet would be to use Avro's code generation to generate
a SpecificRecord class for your schema and then change the reader to use your
specific type rather than GenericRecord.
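
A rough sketch of what that could look like, in case it helps. This assumes the data is in plain Avro container files read through AvroKeyInputFormat, and that the class generated from your schema is on the classpath; the helper name readSpecific and the extract function are made up for illustration. With a sequence file, AvroSequenceFileInputFormat would take the place of AvroKeyInputFormat and the key/value classes would change to match.

```scala
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.avro.specific.SpecificRecord
import org.apache.hadoop.io.NullWritable
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

import scala.reflect.ClassTag

object SpecificAvroReader {

  // Hypothetical helper: reads Avro container files as records of the
  // generated type T and immediately maps each record to a plain value R.
  def readSpecific[T <: SpecificRecord, R: ClassTag](
      sc: SparkContext,
      path: String)(extract: T => R): RDD[R] = {
    sc.newAPIHadoopFile[AvroKey[T], NullWritable, AvroKeyInputFormat[T]](
      path,
      classOf[AvroKeyInputFormat[T]],
      classOf[AvroKey[T]],
      classOf[NullWritable]
    ).map { case (key, _) =>
      // Hadoop record readers may reuse the same object between records,
      // so pull out the fields you need here instead of keeping the datum.
      extract(key.datum())
    }
  }
}

// Usage, assuming a generated class named User (hypothetical) with
// getName and getAge accessors:
//   val rdd = SpecificAvroReader.readSpecific[User, (String, Int)](sc, fp) {
//     u => (u.getName.toString, u.getAge)
//   }
```

The point of the design is that the tuple is built from typed getters on the generated class, so there is no type inference over a GenericRecord at read time.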
Trying to read in the generic record and then do type inference and spit out a
tuple is