Yuchen Liu created SPARK-53560:
----------------------------------

             Summary: Crash looping when retrying uncommitted batch in Kafka 
source and AvailableNow trigger
                 Key: SPARK-53560
                 URL: https://issues.apache.org/jira/browse/SPARK-53560
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.4.0
            Reporter: Yuchen Liu


There is a crash loop when using the Kafka source with Trigger.AvailableNow. It 
is triggered deterministically when the query fails or terminates in the middle 
of a run, the user changes the partitions of the Kafka topic, and then restarts 
the query. This happens because the topic partitions recorded in the uncommitted 
batch differ from the latest topic partitions, yet Trigger.AvailableNow requires 
the two sets to be the same.

This PR solves the issue by no longer fetching the latest topic partitions in 
`prepareForTriggerAvailableNow`. Instead, it fetches them in `latestOffset`, 
which Spark does not call when retrying the uncommitted batch, so the retry is 
never compared against a newer set of partitions.
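
The difference between the two fetch points can be sketched with a toy model.
This is not Spark's code: `ToyKafkaSource`, `plan_batch`, and
`restart_and_retry` are hypothetical names, and the partition check is an
assumed stand-in for whatever consistency check fails in the real source. Only
the two method names mirror `prepareForTriggerAvailableNow` and `latestOffset`.

```python
class ToyKafkaSource:
    """Hypothetical stand-in for a Kafka streaming source (not Spark's class)."""

    def __init__(self, fetch_partitions, lazy):
        self._fetch_partitions = fetch_partitions  # callable: current partitions
        self._lazy = lazy                          # True models the fixed behavior
        self._snapshot = None                      # partitions fixed for this run

    def prepare_for_trigger_available_now(self):
        # Old behavior (lazy=False): snapshot the latest partitions eagerly.
        if not self._lazy:
            self._snapshot = frozenset(self._fetch_partitions())

    def latest_offset(self):
        # Fixed behavior: take the snapshot on first use. The engine does not
        # call this while replaying an uncommitted batch.
        if self._snapshot is None:
            self._snapshot = frozenset(self._fetch_partitions())
        return self._snapshot

    def plan_batch(self, batch_partitions):
        # Assumed check: a batch must match the snapshot if one exists.
        if self._snapshot is not None and frozenset(batch_partitions) != self._snapshot:
            raise RuntimeError("partitions differ from AvailableNow snapshot")
        return sorted(batch_partitions)


def restart_and_retry(lazy):
    uncommitted = {0, 1}     # partitions logged in the uncommitted batch
    current = {0, 1, 2}      # the topic gained a partition before the restart
    source = ToyKafkaSource(lambda: current, lazy)
    source.prepare_for_trigger_available_now()
    # On restart the uncommitted batch is replayed without latest_offset().
    return source.plan_batch(uncommitted)
```

In this model, `restart_and_retry(lazy=False)` raises on every restart, which
corresponds to the crash loop, while `restart_and_retry(lazy=True)` replays the
old batch and only picks up the new partition on the next `latest_offset` call.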



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
