Re: Kafka Connect and Spark/Storm Comparisons

2015-11-25 Thread Jay Kreps
Hey Dave, we're separating the problem of getting data in and out of Kafka from the problem of transforming it. If you think about ETL (Extract, Transform, Load), what Kafka Connect does is E and L really, really well and not T at all; the focus in stream processing systems is T with E and L being

Re: Kafka Connect and Spark/Storm Comparisons

2015-11-25 Thread Cody Koeninger
Spark's direct stream Kafka integration should take advantage of data locality if you're running Spark executors on the same nodes as Kafka brokers. On Wed, Nov 25, 2015 at 9:50 AM, Dave Ariens wrote: > I just finished reading up on Kafka Connect <http://kafka.apache.org/documentation.html#con

Kafka Connect and Spark/Storm Comparisons

2015-11-25 Thread Dave Ariens
I just finished reading up on Kafka Connect and am trying to wrap my head around where it fits within the big data ecosystem. Other than the high-level overview provided in the docs, I haven't heard much about this feature. My limited understan