I have the following readStream in Spark Structured Streaming, reading data from Kafka:
val kafkaStreamingDF = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "...")
  .option("subscribe", "testtopic")
  .option("failOnDataLoss", "false")
  .option("startingOffsets", "earliest")
  .load()
  .selectExpr("CAST(value AS STRING)", "CAST(topic AS STRING)")

As far as I know, every time I start the job, under the covers Spark creates a new consumer and a new consumer group, retrieves the last successful offset for the job (using the job name?), seeks to that offset, and starts reading from there. Is that the case? If yes, how do I reset the offset and force my job to read from the beginning?

-- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
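For context, a minimal sketch of how such a query is typically started with an explicit checkpoint location, which is where Structured Streaming records the offsets it has processed (the checkpoint path, query name, and console sink below are illustrative assumptions, not part of the original job):

```scala
// Sketch only: the checkpoint directory and query name are hypothetical.
// Structured Streaming persists processed offsets under checkpointLocation,
// so restarting with the same path resumes from the last recorded offset,
// while a fresh (or deleted) checkpoint path makes startingOffsets apply again.
val query = kafkaStreamingDF
  .writeStream
  .format("console")                                        // illustrative sink
  .option("checkpointLocation", "/tmp/checkpoints/testtopic-query") // hypothetical path
  .queryName("testtopic-query")                             // hypothetical name
  .start()

query.awaitTermination()
```

This is only a sketch under those assumptions; the key point it illustrates is that offset tracking is tied to the checkpoint location passed to the writer, not to the readStream options alone.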