Re: Avro serialization

2014-04-04 Thread Ron Gonzalez
Thanks, will take a look.



Re: Avro serialization

2014-04-03 Thread FRANK AUSTIN NOTHAFT
We use Avro objects in our project, and have a Kryo serializer for generic
Avro SpecificRecords. Take a look at:

https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/main/scala/edu/berkeley/cs/amplab/adam/serialization/ADAMKryoRegistrator.scala

Also, Matt Massie has a good blog post about this at
http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/.

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466
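
For readers who cannot follow the link, here is a minimal sketch of the kind of Kryo serializer described above. It is not the ADAM code. It assumes the Kryo 2.x to 4.x API that Spark has bundled (Kryo 5 renamed some of these calls), and the names AvroSpecificSerializer and AvroKryo are made up for illustration.

```scala
import java.io.ByteArrayOutputStream
import scala.reflect.ClassTag

import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}
import org.apache.avro.specific.{SpecificDatumReader, SpecificDatumWriter, SpecificRecord}

// Hypothetical name: a Kryo serializer that delegates to Avro's own binary encoding.
// Assumes the Kryo 2.x-4.x signatures (writeInt/readInt with optimizePositive).
class AvroSpecificSerializer[T <: SpecificRecord](implicit tag: ClassTag[T])
    extends Serializer[T] {

  private val recordClass = tag.runtimeClass.asInstanceOf[Class[T]]
  private val reader = new SpecificDatumReader[T](recordClass)
  private val writer = new SpecificDatumWriter[T](recordClass)

  override def write(kryo: Kryo, output: Output, record: T): Unit = {
    // Encode the record to Avro binary in a temporary buffer.
    val buffer = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(buffer, null)
    writer.write(record, encoder)
    encoder.flush()
    // Write a length prefix so read() knows how many bytes belong to this record.
    val bytes = buffer.toByteArray
    output.writeInt(bytes.length, true)
    output.writeBytes(bytes)
  }

  override def read(kryo: Kryo, input: Input, cls: Class[T]): T = {
    val length = input.readInt(true)
    val bytes = input.readBytes(length)
    val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
    // Passing null as the reuse argument makes Avro allocate a fresh record.
    reader.read(null.asInstanceOf[T], decoder)
  }
}

// Hypothetical helper: registers the serializer for one Avro-generated class.
// Inside a Spark KryoRegistrator you would call, for an Avro-generated class
// MyRecord (a placeholder name): AvroKryo.register[MyRecord](kryo)
object AvroKryo {
  def register[T <: SpecificRecord: ClassTag](kryo: Kryo): Unit = {
    val recordClass = implicitly[ClassTag[T]].runtimeClass
    kryo.register(recordClass, new AvroSpecificSerializer[T])
  }
}
```

To have Spark use it, set spark.serializer to org.apache.spark.serializer.KryoSerializer and spark.kryo.registrator to the fully qualified name of your own registrator class.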




Re: Avro serialization

2014-04-03 Thread Ian O'Connell
Objects being transformed need to be one of these (Java-serializable or
Kryo-serializable) while in flight. Source data can just use the MapReduce
input formats, so anything you can do with mapred works. For an Avro one,
you probably want one of:
https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/*ProtoBuf*

Or whatever you're using at the moment to open them in an MR job could
probably be repurposed.
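
As a rough illustration of the input-format route suggested above, the sketch below reads, transforms, and writes Avro files with the new-API MapReduce formats from Avro's avro-mapred artifact. It is an untested sketch under assumptions: the schema, the "name" field, and the command-line paths are all made up, and it assumes a Spark version where pair-RDD methods are available without extra imports.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.{AvroJob, AvroKeyInputFormat, AvroKeyOutputFormat}
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

object AvroRoundTrip {
  // Hypothetical schema, used only for illustration.
  val schemaJson: String =
    """{"type":"record","name":"User","fields":[{"name":"name","type":"string"}]}"""

  def main(args: Array[String]): Unit = {
    // Assumes exactly two arguments: an input path and an output path.
    val Array(inputPath, outputPath) = args
    val sc = new SparkContext(new SparkConf().setAppName("avro-round-trip"))

    // Read: each element is (AvroKey wrapping a GenericRecord, NullWritable).
    val records = sc.newAPIHadoopFile[
      AvroKey[GenericRecord], NullWritable, AvroKeyInputFormat[GenericRecord]](inputPath)

    // Transform: copy each record first, because the Hadoop record reader may
    // reuse the same object between rows, then upper-case the "name" field.
    val transformed = records.map { case (key, _) =>
      val in = key.datum()
      val out = GenericData.get().deepCopy(in.getSchema, in)
      out.put("name", in.get("name").toString.toUpperCase)
      (new AvroKey[GenericRecord](out), NullWritable.get())
    }

    // Write: the output format takes its schema from the job configuration.
    val job = Job.getInstance(sc.hadoopConfiguration)
    AvroJob.setOutputKeySchema(job, new Schema.Parser().parse(schemaJson))
    transformed.saveAsNewAPIHadoopFile(
      outputPath,
      classOf[AvroKey[GenericRecord]],
      classOf[NullWritable],
      classOf[AvroKeyOutputFormat[GenericRecord]],
      job.getConfiguration)

    sc.stop()
  }
}
```

This pipeline only uses a map, so no records cross a shuffle. Anything that shuffles, caches in serialized form, or collects the Avro objects would need them to be Java-serializable or registered with Kryo, which is the "in flight" requirement mentioned above.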




Avro serialization

2014-04-03 Thread Ron Gonzalez

Hi,
  I know that sources need to be either Java-serializable or use Kryo
serialization.
  Does anyone have sample code that reads, transforms, and writes Avro files in
Spark?

Thanks,
Ron