Github user sirishaSindri commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20836#discussion_r177276642

    --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala ---
    @@ -279,9 +279,8 @@ private[kafka010] case class InternalKafkaConsumer(
           if (record.offset > offset) {
             // This may happen when some records aged out but their offsets already got verified
             if (failOnDataLoss) {
    -          reportDataLoss(true, s"Cannot fetch records in [$offset, ${record.offset})")
    --- End diff --

@gaborgsomogyi Thank you for looking at it. For batch queries, a read will always fail if any data in the provided offset range cannot be read because it was lost (see https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html). With this change it won't fail; instead, it will return all the messages still available within the requested offset range.
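To illustrate the behavior being discussed, here is a minimal, hypothetical sketch (not the actual `InternalKafkaConsumer` code; `Record` and `fetchRecord` are invented names). When the first available record's offset is greater than the requested offset, some records in between have aged out: with `failOnDataLoss = true` the fetch fails, while with `failOnDataLoss = false` it skips the gap and returns the next available record.

```scala
// Simplified stand-in for a fetched Kafka record.
case class Record(offset: Long, value: String)

// Hypothetical sketch of the data-loss handling under discussion.
// `records` plays the role of what the consumer can still fetch.
def fetchRecord(records: Seq[Record],
                requestedOffset: Long,
                failOnDataLoss: Boolean): Option[Record] = {
  records.find(_.offset >= requestedOffset) match {
    case Some(r) if r.offset > requestedOffset && failOnDataLoss =>
      // Records in [requestedOffset, r.offset) aged out: fail the query.
      throw new IllegalStateException(
        s"Cannot fetch records in [$requestedOffset, ${r.offset})")
    case other =>
      // failOnDataLoss = false: tolerate the gap and return what is available.
      other
  }
}
```

With `failOnDataLoss = false`, a batch query over an offset range whose oldest records have been deleted by retention simply starts from the earliest record that still exists, rather than aborting.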