Re: Avro serialization
Thanks, will take a look.

On Apr 3, 2014, at 7:49 AM, Frank Austin Nothaft wrote:
> [...]
Re: Avro serialization
We use Avro objects in our project, and have a Kryo serializer for generic Avro SpecificRecords. Take a look at:

https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/main/scala/edu/berkeley/cs/amplab/adam/serialization/ADAMKryoRegistrator.scala

Also, Matt Massie has a good blog post about this at http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/.

Frank Austin Nothaft

On Thu, Apr 3, 2014 at 7:16 AM, Ian O'Connell wrote:
> [...]
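For readers who cannot open the link, here is a minimal sketch of what a Kryo serializer for Avro SpecificRecords might look like. It is not the ADAM code itself. It assumes the Kryo 2.x to 4.x `Serializer` signature (the versions bundled with Spark at the time and for several releases after), and the class name `AvroSpecificSerializer` is made up for illustration.

```scala
import scala.reflect.{ClassTag, classTag}

import java.io.ByteArrayOutputStream

import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}
import org.apache.avro.specific.{SpecificDatumReader, SpecificDatumWriter, SpecificRecord}

// Hypothetical name; delegates to Avro's own binary encoding so that Kryo
// never has to reflect over the generated record class.
class AvroSpecificSerializer[T <: SpecificRecord: ClassTag] extends Serializer[T] {
  private val clazz = classTag[T].runtimeClass.asInstanceOf[Class[T]]
  private val reader = new SpecificDatumReader[T](clazz)
  private val writer = new SpecificDatumWriter[T](clazz)

  override def write(kryo: Kryo, output: Output, record: T): Unit = {
    // Encode the record with Avro into a temporary buffer.
    val buffer = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(buffer, null)
    writer.write(record, encoder)
    encoder.flush()
    // Length-prefix the payload so read() knows how many bytes to consume.
    val bytes = buffer.toByteArray
    output.writeInt(bytes.length, true)
    output.writeBytes(bytes)
  }

  override def read(kryo: Kryo, input: Input, klass: Class[T]): T = {
    val length = input.readInt(true)
    val bytes = input.readBytes(length)
    val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
    reader.read(null.asInstanceOf[T], decoder)
  }
}
```

To use something like this, you would register it from a `KryoRegistrator` with `kryo.register(classOf[MyRecord], new AvroSpecificSerializer[MyRecord])`, where `MyRecord` stands in for one of your Avro-generated classes, then set `spark.serializer` to `org.apache.spark.serializer.KryoSerializer` and `spark.kryo.registrator` to the registrator's class name. This version allocates a fresh buffer and encoder per record for simplicity; reusing them would likely be faster.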
Re: Avro serialization
Objects being transformed need to be one of these (Java serializable or Kryo serializable) while in flight. Source data can just use the MapReduce input formats, so anything you can do with mapred works here. For an Avro one, you probably want one of:

https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/*ProtoBuf*

Or whatever you're using at the moment to open them in an MR job could probably be repurposed.

On Thu, Apr 3, 2014 at 7:11 AM, Ron Gonzalez wrote:
> [...]
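As a rough illustration of the "reuse the MapReduce input formats" route, the sketch below reads Avro files through `AvroKeyInputFormat`, transforms each record, and writes the result through `AvroKeyOutputFormat`. It makes several assumptions that are not from this thread: the `avro-mapred` artifact (Hadoop 2 build) is on the classpath, Spark is 1.3 or later, the job is launched with `spark-submit`, and the records have a string field called `name` (a made-up field used only to show a transformation). The object name and argument order are also hypothetical.

```scala
import java.io.File

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.{AvroJob, AvroKeyInputFormat, AvroKeyOutputFormat}
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

object AvroRoundTrip {
  def main(args: Array[String]): Unit = {
    // Assumed arguments: input path, output path, local path to the .avsc schema.
    val Array(inputPath, outputPath, schemaPath) = args
    val sc = new SparkContext(new SparkConf().setAppName("avro-round-trip"))

    // Read: each element is (AvroKey wrapping the record, NullWritable).
    val records = sc.newAPIHadoopFile(
      inputPath,
      classOf[AvroKeyInputFormat[GenericRecord]],
      classOf[AvroKey[GenericRecord]],
      classOf[NullWritable])

    // Transform: upper-case the hypothetical `name` field.
    val transformed = records.map { case (key, _) =>
      val datum = key.datum()
      // The Hadoop record reader may reuse the datum, so copy before mutating.
      val copy = GenericData.get().deepCopy(datum.getSchema, datum)
      copy.put("name", copy.get("name").toString.toUpperCase)
      (new AvroKey[GenericRecord](copy), NullWritable.get())
    }

    // Write: the output format picks up the schema from the job configuration.
    val job = Job.getInstance(sc.hadoopConfiguration)
    AvroJob.setOutputKeySchema(job, new Schema.Parser().parse(new File(schemaPath)))
    transformed.saveAsNewAPIHadoopFile(
      outputPath,
      classOf[AvroKey[GenericRecord]],
      classOf[NullWritable],
      classOf[AvroKeyOutputFormat[GenericRecord]],
      job.getConfiguration)

    sc.stop()
  }
}
```

This pipeline has no shuffle, so the records are never serialized between stages. As soon as you add a `groupByKey`, `join`, `cache` with a serialized storage level, or similar, the records do travel "in flight" and need to be Java serializable or registered with Kryo, as described above.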
Avro serialization
Hi,

I know that sources need to either be Java serializable or use Kryo serialization. Does anyone have sample code that reads, transforms, and writes Avro files in Spark?

Thanks,
Ron