Hi all, I have a Spark Structured Streaming app that consumes from a Kafka topic with retention configured. Sometimes my query has not finished processing a message when retention kicks in and deletes the data at that offset. Since I use the default setting of failOnDataLoss=true, this causes my query to fail. My current workaround is manual: I delete the offsets directory and rerun the query.
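For reference, here is a minimal sketch of the kind of source setup I mean, in PySpark. The broker address and topic name are placeholders, and kafka_source_options is a hypothetical helper of my own, not part of Spark; only the option keys are Spark's.

```python
def kafka_source_options(bootstrap_servers, topic, fail_on_data_loss=True):
    """Build the option map for Spark's Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Spark's default is "true": the query fails if offsets it expects
        # are no longer available on the broker.
        "failOnDataLoss": str(fail_on_data_loss).lower(),
    }


# Placeholder broker and topic, not my real ones.
opts = kafka_source_options("broker1:9092", "events")

# With a live SparkSession named `spark` (not created in this sketch):
# df = spark.readStream.format("kafka").options(**opts).load()
```

The options are passed as strings, which is how the Kafka source receives them.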
I would instead like Spark to fall back automatically to the earliest available offset. The solutions I have seen recommend setting auto.offset.reset = earliest, but you cannot set that for Structured Streaming. How do I do this for Structured Streaming? Thanks!

--
Cheers,
Ruijing Li