Hi Rod,

The purpose of introducing the WAL mechanism in Spark Streaming as a general 
solution is to let all receivers benefit from it, not just Kafka's.
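
For example, enabling the general WAL is just a configuration change. A minimal 
sketch (the app name and checkpoint path below are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("wal-example") // placeholder
  // Turn on the write ahead log for all receivers.
  .set("spark.streaming.receiver.writeAheadLog.enable", "true")

val ssc = new StreamingContext(conf, Seconds(10))
// The WAL is written under the checkpoint directory, so it must be on a
// fault-tolerant filesystem such as HDFS (path is a placeholder).
ssc.checkpoint("hdfs:///tmp/wal-example-checkpoint")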

As you said, since external sources like Kafka have their own checkpoint 
mechanism, instead of storing the data itself in the WAL we could store only 
metadata there and recover by replaying from the last committed offsets. But 
that requires a sophisticated Kafka receiver built on the low-level API, and 
we would also have to handle partition rebalancing and fault tolerance 
ourselves. So for now, instead of implementing a whole new receiver, we chose 
to implement a simple one: its performance is not as good, but it is much 
easier to understand and maintain.
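
To illustrate the "commit offsets only after a reliable store" idea, here is a 
rough sketch built on the public Receiver API. This is not the actual 
SPARK-4062 code; fetchBatch and commitOffsets are hypothetical stand-ins for 
the underlying Kafka consumer interaction:

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver
import scala.collection.mutable.ArrayBuffer

class SketchReliableReceiver(storageLevel: StorageLevel)
    extends Receiver[Array[Byte]](storageLevel) {

  def onStart(): Unit = {
    new Thread("sketch-receiver") {
      override def run(): Unit = receiveLoop()
    }.start()
  }

  def onStop(): Unit = { /* shut down the underlying consumer here */ }

  private def receiveLoop(): Unit = {
    while (!isStopped) {
      val (messages, offsets) = fetchBatch()
      // store(ArrayBuffer) blocks until the block is reliably stored
      // (written to the WAL when it is enabled), so it is safe to
      // commit the offsets only after it returns.
      store(ArrayBuffer(messages: _*))
      commitOffsets(offsets)
    }
  }

  // Hypothetical stand-ins for the real Kafka consumer interaction.
  private def fetchBatch(): (Seq[Array[Byte]], Map[Int, Long]) =
    (Seq("msg".getBytes), Map(0 -> 42L))
  private def commitOffsets(offsets: Map[Int, Long]): Unit = ()
}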

The design rationale and implementation of the reliable Kafka receiver can be 
found in SPARK-4062 (https://issues.apache.org/jira/browse/SPARK-4062). 
Improving the reliable Kafka receiver along the lines you mentioned is on our 
roadmap.
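
From the user's side nothing changes: the same high-level KafkaUtils API is 
used, and reliability comes from the WAL configuration shown above. A sketch 
(the ZooKeeper quorum, group id, and topic are placeholders):

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.kafka.KafkaUtils

val stream = KafkaUtils.createStream(
  ssc,                               // the StreamingContext from above
  "zk1:2181,zk2:2181",               // ZooKeeper quorum (placeholder)
  "my-consumer-group",               // consumer group id (placeholder)
  Map("events" -> 1),                // topic -> number of receiver threads
  StorageLevel.MEMORY_AND_DISK_SER   // replication is unnecessary with WAL on
)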

Thanks
Jerry


-----Original Message-----
From: RodrigoB [mailto:rodrigo.boav...@aspect.com] 
Sent: Wednesday, December 3, 2014 5:44 AM
To: u...@spark.incubator.apache.org
Subject: Re: Low Level Kafka Consumer for Spark

Dibyendu,

Just to make sure I will not be misunderstood - my concerns refer to the 
upcoming Spark solution, not yours. I would like to gather the perspective of 
someone who implemented recovery with Kafka in a different way.

Tnks,
Rod






---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
