Re: Message Serialization

2013-08-08 Thread Jay Kreps
I think we discuss that a little in this paper: http://sites.computer.org/debull/A12june/pipeline.pdf -Jay On Thu, Aug 8, 2013 at 10:08 AM, Mark wrote: > I've read that LinkedIn uses Avro for their message serialization. Was > there any particular reason this was chosen say over something like

Re: Kafka/Hadoop consumers and producers

2013-08-08 Thread Andrew Psaltis
Felix, The Camus route is the direction I have headed for a lot of the reasons that you described. The only wrinkle is we are still on Kafka 0.7.3, so I am in the process of backporting this patch: https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8 that is descri

Re: Kafka/Hadoop consumers and producers

2013-08-08 Thread Felix GV
The contrib code is simple and probably wouldn't require too much work to fix, but it's a lot less robust than Camus, so you would ideally need to do some work to make it solid against all edge cases, failure scenarios and performance bottlenecks... I would definitely recommend investing in Camus

Re: Message Serialization

2013-08-08 Thread Yang
I did a comparison of Thrift, PB, and Avro about 3 years ago. At the time, Avro was faster than PB, which was faster than Thrift. Avro has schema evolution (mentioned in the Kafka paper). On Thu, Aug 8, 2013 at 10:08 AM, Mark wrote: > I've read that LinkedIn uses Avro for their message serialization. Was >

Message Serialization

2013-08-08 Thread Mark
I've read that LinkedIn uses Avro for their message serialization. Was there any particular reason this was chosen over, say, something like Thrift or ProtocolBuffers? Was the main motivating factor the native handling of Avro in Hadoop?

Re: Kafka/Hadoop consumers and producers

2013-08-08 Thread psaltis . andrew
We also have a need today to ETL from Kafka into Hadoop, and we do not currently use Avro, nor do we have any plans to. So is the official direction, based on this discussion, to ditch the Kafka contrib code and direct people to use Camus without Avro as Ken described, or are both solutions going to surv