I think we discuss that a little in this paper:
http://sites.computer.org/debull/A12june/pipeline.pdf
-Jay
On Thu, Aug 8, 2013 at 10:08 AM, Mark wrote:
> I've read that LinkedIn uses Avro for their message serialization. Was
> there any particular reason this was chosen, say, over something like
> Thrift or Protocol Buffers?
Felix,
The Camus route is the direction I have headed in, for a lot of the reasons
you described. The only wrinkle is that we are still on Kafka 0.7.3, so I am
in the process of back-porting this patch:
https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8
The contrib code is simple and probably wouldn't require too much work to
fix, but it's a lot less robust than Camus, so you would ideally need to do
some work to make it solid against all the edge cases, failure scenarios,
and performance bottlenecks. I would definitely recommend investing in
Camus.
I did a comparison of Thrift, Protocol Buffers (PB), and Avro about three
years ago; at the time, Avro was faster than PB, which in turn was faster
than Thrift. Avro also supports schema evolution (mentioned in the Kafka
paper).
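For what it's worth, the evolution story is easy to see with Avro's generic
API: a reader schema that adds a field with a default can still decode bytes
written under the old schema. A minimal, self-contained sketch (the "Event"
record and its fields are made up for illustration):

import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SchemaEvolutionDemo {
    // v1: the schema the producer wrote with
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"}]}");

    // v2: the consumer adds a field with a default, so bytes written
    // under v1 stay readable
    static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"source\",\"type\":\"string\",\"default\":\"unknown\"}]}");

    public static void main(String[] args) throws Exception {
        // Encode a record under the old writer schema
        GenericRecord rec = new GenericData.Record(WRITER);
        rec.put("id", 42L);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(rec, enc);
        enc.flush();

        // Decode with the new reader schema; Avro's schema resolution
        // fills in the default for the field the writer never knew about
        BinaryDecoder dec = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        GenericRecord decoded =
            new GenericDatumReader<GenericRecord>(WRITER, READER).read(null, dec);
        System.out.println(decoded); // {"id": 42, "source": "unknown"}
    }
}

That resolution step is what lets downstream Hadoop consumers keep working
while producers add fields.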
On Thu, Aug 8, 2013 at 10:08 AM, Mark wrote:
I've read that LinkedIn uses Avro for their message serialization. Was there
any particular reason this was chosen, say, over something like Thrift or
Protocol Buffers? Was the main motivating factor the native handling of Avro
in Hadoop?
We also have a need today to ETL from Kafka into Hadoop, and we do not
currently use Avro, nor do we have any plans to. So is the official
direction, based on this discussion, to ditch the Kafka contrib code and
direct people to use Camus without Avro as Ken described, or are both
solutions going to survive?
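For the Avro-less Camus route, the main piece you'd write is a message
decoder. Below is a rough sketch of a plain-text decoder; it assumes the
MessageDecoder/CamusWrapper API shape from the linkedin/camus repo (the
class name, the byte[] message type, and the UTF-8/wall-clock choices are
all mine, and the generics have shifted between Camus versions, so check it
against whatever branch you're on):

import java.nio.charset.Charset;
import java.util.Properties;

import com.linkedin.camus.coders.CamusWrapper;
import com.linkedin.camus.coders.MessageDecoder;

// Hypothetical decoder that hands Camus each Kafka payload as a UTF-8
// string instead of an Avro record.
public class PlainTextMessageDecoder extends MessageDecoder<byte[], String> {

    @Override
    public void init(Properties props, String topicName) {
        this.props = props;
        this.topicName = topicName;
    }

    @Override
    public CamusWrapper<String> decode(byte[] payload) {
        // No schema registry lookup: just decode the raw bytes. The
        // payload carries no timestamp, so stamp it with wall-clock time.
        String text = new String(payload, Charset.forName("UTF-8"));
        return new CamusWrapper<String>(text, System.currentTimeMillis());
    }
}

You would then point Camus at it via the camus.message.decoder.class
property (going off the example camus.properties in the repo); everything
else about partitioning and offset management should stay the same.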