Re: Spark Kafka Batch Write guarantees

2019-04-01 Thread hemant singh
Thanks Shixiong. I also read in the documentation that duplicates might exist because of task retries.
On Mon, 1 Apr 2019 at 9:43 PM, Shixiong(Ryan) Zhu wrote:
> The Kafka source doesn’t support transactions. You may see partial data or
> duplicated data if a Spark task fails.
>
> On Wed, Mar 27,

Re: Spark Kafka Batch Write guarantees

2019-04-01 Thread Shixiong(Ryan) Zhu
The Kafka source doesn’t support transactions. You may see partial data or duplicated data if a Spark task fails.
On Wed, Mar 27, 2019 at 1:15 AM hemant singh wrote:
> We are using Spark batch to write a DataFrame to a Kafka topic, using the Spark
> write function with write.format(source = Kafka).
> Does
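To make the failure mode concrete, here is a minimal sketch in plain Python (no Spark or Kafka involved) of why a retried task can leave partial or duplicated records in a sink that has no transactions. Everything in it is hypothetical and only for illustration: the `topic` list stands in for a Kafka topic, and `write_partition`, `run_task_with_retry`, and `deduplicate` are made-up helpers, not Spark or Kafka APIs. The last step assumes each record carries a unique key, which is one common way for a consumer to tolerate duplicates.

```python
class TaskFailure(Exception):
    """Stands in for a Spark task dying partway through a write."""


def write_partition(rows, topic, fail_after=None):
    # Send rows one at a time. Rows already appended stay in the topic
    # even if the task fails later, because nothing rolls them back.
    for i, row in enumerate(rows):
        if fail_after is not None and i == fail_after:
            raise TaskFailure(f"task failed after sending {i} rows")
        topic.append(row)


def run_task_with_retry(rows, topic, fail_after, max_attempts=2):
    # Only the first attempt fails; a retry resends the whole partition.
    for attempt in range(max_attempts):
        try:
            write_partition(rows, topic, fail_after if attempt == 0 else None)
            return
        except TaskFailure:
            if attempt == max_attempts - 1:
                raise  # no retries left: the topic keeps the partial data


def deduplicate(records):
    # Consumer-side cleanup, assuming the first tuple element is a unique key.
    seen, result = set(), []
    for key, value in records:
        if key not in seen:
            seen.add(key)
            result.append((key, value))
    return result


rows = [(1, "a"), (2, "b"), (3, "c")]

# Failure with a retry: the first two rows are sent twice.
topic = []
run_task_with_retry(rows, topic, fail_after=2)

# Failure with no retry left: only part of the data is in the topic.
partial_topic = []
try:
    run_task_with_retry(rows, partial_topic, fail_after=2, max_attempts=1)
except TaskFailure:
    pass

clean = deduplicate(topic)
```

With a retry, `topic` ends up with five records for three input rows. With no retry left, `partial_topic` holds only the first two. Deduplicating by key on the reading side recovers the original three rows in the duplicate case, but nothing on the reading side can restore rows that were never written.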

Spark Kafka Batch Write guarantees

2019-03-27 Thread hemant singh
We are using Spark batch to write a DataFrame to a Kafka topic, using the Spark write function with write.format(source = Kafka). Does Spark provide a guarantee similar to the one it provides when saving a DataFrame to disk, namely that partial data is not written to Kafka, i.e. the full DataFrame is saved or if the job fails no