Re: [structured-streaming]How to reset Kafka offset in readStream and read from beginning

2018-05-23 Thread Sushil Kotnala
You can use .option("auto.offset.reset", "earliest") while reading from Kafka. With this, the new stream will read from the first offset present for the topic.

On Wed, May 23, 2018 at 11:32 AM, karthikjay wrote:
> Chris,
>
> Thank you for responding. I get it.
>
> But, if I am
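Note on the option name: Spark's Kafka integration guide lists auto.offset.reset among the Kafka parameters the Structured Streaming source does not accept, and points to the source option startingOffsets instead. A minimal sketch of the reader from the original question with that option set follows. The object name, the helper names, and the bootstrapServers parameter are illustrative assumptions, not part of the thread. startingOffsets only takes effect when a query starts without existing checkpoint data; a resumed query continues from its checkpoint.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, for illustration only.
object ReadFromBeginningSketch {

  // Source options for the Kafka reader. "startingOffsets" is a Spark source
  // option, not a Kafka consumer property, so it carries no "kafka." prefix.
  def kafkaOptions(bootstrapServers: String): Map[String, String] = Map(
    "kafka.bootstrap.servers" -> bootstrapServers, // supplied by the caller
    "subscribe"               -> "testtopic",      // topic name from the original question
    "startingOffsets"         -> "earliest",       // start at the first available offset
    "failOnDataLoss"          -> "false"
  )

  // Builds the streaming DataFrame; nothing is read until a query is started.
  def readFromBeginning(spark: SparkSession, bootstrapServers: String): DataFrame =
    spark.readStream
      .format("kafka")
      .options(kafkaOptions(bootstrapServers))
      .load()
}
```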

Re: [structured-streaming]How to reset Kafka offset in readStream and read from beginning

2018-05-23 Thread karthikjay
Chris,

Thank you for responding. I get it.

But, if I am using a console sink without a checkpoint location, I do not see any messages in the console in the IntelliJ IDEA IDE. I do not explicitly specify checkpointLocation in this case. How do I clear the working directory data and force Spark to read
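For context on the console-sink case above: when no checkpointLocation is given, the console sink is one of the sinks for which Spark falls back to a temporary checkpoint directory, so there may be no persistent working directory to clear. The Kafka source's documented default for a new streaming query is to start at the latest offsets, which would explain an empty console until new messages arrive. One way to make the location explicit is to pin it, as in this hedged sketch. The object name, method name, and checkpointDir parameter are assumptions for illustration.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical helper object, for illustration only.
object ConsoleSinkSketch {

  // Starts a console query whose progress is tracked in a directory the
  // caller chooses, so it is clear what has to be removed to start over.
  def startConsoleQuery(df: DataFrame, checkpointDir: String): StreamingQuery =
    df.writeStream
      .format("console")
      .option("checkpointLocation", checkpointDir) // explicit instead of a temp dir
      .start()
}
```

The returned StreamingQuery can be blocked on with awaitTermination() or stopped with stop().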

Re: [structured-streaming]How to reset Kafka offset in readStream and read from beginning

2018-05-22 Thread Bowden, Chris
You can delete the write-ahead log directory you provided to the sink via the “checkpointLocation” option.

From: karthikjay
Sent: Tuesday, May 22, 2018 7:24:45 AM
To: user@spark.apache.org
Subject: [structured-streaming]How to reset Kafka
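The deletion suggested above can be sketched as follows for a checkpoint directory on the local filesystem. This is an assumption-laden illustration: the object and method names are hypothetical, the query should be stopped first, and a checkpoint on HDFS or an object store would need that storage system's own tools rather than java.io.File.

```scala
import java.io.File

// Hypothetical helper object, for illustration only.
object CheckpointResetSketch {

  // Deletes a file or directory tree. Returns true if the top-level entry
  // was removed, false if it did not exist or could not be deleted.
  def deleteRecursively(f: File): Boolean = {
    if (f.isDirectory) {
      // listFiles returns null on I/O error, so fall back to an empty array.
      Option(f.listFiles).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }
}
```

After the directory is gone, the next start of the query has no saved offsets and falls back to whatever starting position the source is configured with.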

[structured-streaming]How to reset Kafka offset in readStream and read from beginning

2018-05-22 Thread karthikjay
I have the following readStream in Spark Structured Streaming reading data from Kafka:

val kafkaStreamingDF = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "...")
  .option("subscribe", "testtopic")
  .option("failOnDataLoss", "false")