We use Avro objects in our project, and have a Kryo serializer for generic
Avro SpecificRecords. Take a look at:

https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/main/scala/edu/berkeley/cs/amplab/adam/serialization/ADAMKryoRegistrator.scala

Also, Matt Massie has a good blog post about this at
http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/.
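In case the link goes stale, the idea is roughly the following: wrap the record's own binary Avro encoding inside a Kryo `Serializer` and register it for each record class. This is a minimal sketch along the lines of ADAM's registrator, not the exact code; `MyRecord` and the registrator class name are placeholders.

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}
import org.apache.avro.specific.{SpecificDatumReader, SpecificDatumWriter, SpecificRecord}
import org.apache.spark.serializer.KryoRegistrator

// Serialize an Avro SpecificRecord through Kryo by delegating to Avro's
// own binary encoding, as the ADAM registrator linked above does.
class AvroSerializer[T <: SpecificRecord] extends Serializer[T] {
  override def write(kryo: Kryo, output: Output, record: T): Unit = {
    val writer = new SpecificDatumWriter[T](record.getSchema)
    val encoder = EncoderFactory.get.binaryEncoder(output.getOutputStream, null)
    writer.write(record, encoder)
    encoder.flush()
  }

  override def read(kryo: Kryo, input: Input, klass: Class[T]): T = {
    val reader = new SpecificDatumReader[T](klass)
    // directBinaryDecoder avoids read-ahead buffering past the record boundary.
    val decoder = DecoderFactory.get.directBinaryDecoder(input.getInputStream, null)
    reader.read(null.asInstanceOf[T], decoder)
  }
}

// Register one serializer per generated record class, then point
// spark.kryo.registrator at this class. MyRecord is a placeholder.
class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // kryo.register(classOf[MyRecord], new AvroSerializer[MyRecord]())
  }
}
```
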

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466


On Thu, Apr 3, 2014 at 7:16 AM, Ian O'Connell <i...@ianoconnell.com> wrote:

> Objects being transformed need to be one of these in flight. Source data
> can just use the MapReduce input formats, so anything you can do with
> mapred works. For doing an Avro one for this, you probably want one of:
>
> https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/*ProtoBuf*
>
> or just whatever you're using at the moment to open them in an MR job;
> that could probably be re-purposed.
>
>
> On Thu, Apr 3, 2014 at 7:11 AM, Ron Gonzalez <zlgonza...@yahoo.com> wrote:
>
>>
>>   Hi,
>>   I know that sources need to either be Java serializable or use Kryo
>> serialization.
>>   Does anyone have sample code that reads, transforms and writes avro
>> files in spark?
>>
>> Thanks,
>> Ron
>>
>
>
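To make the read/transform/write cycle concrete, here is a rough sketch using Avro's new-API `AvroKeyInputFormat`/`AvroKeyOutputFormat` with Spark's `newAPIHadoopFile`, which is one way to re-purpose an existing MR input format as suggested above. `MyRecord` and the HDFS paths are placeholders; adapt to your generated record class.

```scala
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.{AvroJob, AvroKeyInputFormat, AvroKeyOutputFormat}
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf().setAppName("avro-read-transform-write")
val sc = new SparkContext(sparkConf)

// Attach the Avro schemas to a Hadoop Job so the input/output formats can find them.
val job = Job.getInstance()
AvroJob.setInputKeySchema(job, MyRecord.getClassSchema)
AvroJob.setOutputKeySchema(job, MyRecord.getClassSchema)

// Read: keys are AvroKey[MyRecord], values are NullWritable.
val records = sc.newAPIHadoopFile(
  "hdfs:///input/path",
  classOf[AvroKeyInputFormat[MyRecord]],
  classOf[AvroKey[MyRecord]],
  classOf[NullWritable],
  job.getConfiguration)

// Unwrap to the datum immediately; the input format reuses AvroKey objects,
// so don't cache the raw (key, value) pairs.
val transformed = records
  .map { case (key, _) => key.datum() }
  .map(record => record) // your transformation here

// Write: wrap back into AvroKey / NullWritable pairs.
transformed
  .map(record => (new AvroKey(record), NullWritable.get))
  .saveAsNewAPIHadoopFile(
    "hdfs:///output/path",
    classOf[AvroKey[MyRecord]],
    classOf[NullWritable],
    classOf[AvroKeyOutputFormat[MyRecord]],
    job.getConfiguration)
```
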
