jaagupku commented on issue #36284:
URL: https://github.com/apache/beam/issues/36284#issuecomment-3345675902

   Thank you for the quick reply. I tried `commitOffsetsInFinalize` and removed
the offset tracking and committing from `ProcessRecordsDoFn`, but now the job
permanently skips some messages. I am also using the Spark runner.
   
   I count processed messages by initializing a `Map<TopicPartition, Long>` in
start bundle, incrementing the entry for the record's `TopicPartition` in
`processElement`, and logging the map in finish bundle. The job in YARN
eventually fails due to failures in setup, and YARN starts another attempt.
When the job stopped reading messages I restarted it manually, and it had
skipped some messages: I produced 20000 messages to the topic, but the
database and the logs each showed only 19255, so 745 were never processed.
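
   For reference, the counting described above can be sketched roughly as
follows. This is a simplified, plain-Java illustration rather than the actual
`ProcessRecordsDoFn`: the class `PerBundleCounter` and its nested
`TopicPartition` are hypothetical stand-ins (the real job would use Beam's
`@StartBundle`, `@ProcessElement` and `@FinishBundle` methods and Kafka's
`TopicPartition`), and the topic name `events` is made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of per-bundle counting; not the real DoFn.
class PerBundleCounter {

    // Stand-in for org.apache.kafka.common.TopicPartition (assumption).
    static final class TopicPartition {
        final String topic;
        final int partition;

        TopicPartition(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TopicPartition)) {
                return false;
            }
            TopicPartition other = (TopicPartition) o;
            return partition == other.partition && topic.equals(other.topic);
        }

        @Override
        public int hashCode() {
            return Objects.hash(topic, partition);
        }
    }

    private Map<TopicPartition, Long> counts;

    // Plays the role of @StartBundle: a fresh map for every bundle.
    void startBundle() {
        counts = new HashMap<>();
    }

    // Plays the role of @ProcessElement: add one to the record's partition.
    void processElement(String topic, int partition) {
        counts.merge(new TopicPartition(topic, partition), 1L, Long::sum);
    }

    // Plays the role of @FinishBundle: log the counts and return a copy.
    Map<TopicPartition, Long> finishBundle() {
        counts.forEach((tp, n) ->
            System.out.println(tp.topic + "-" + tp.partition + ": " + n));
        return new HashMap<>(counts);
    }

    public static void main(String[] args) {
        PerBundleCounter counter = new PerBundleCounter();
        counter.startBundle();
        counter.processElement("events", 0);
        counter.processElement("events", 0);
        counter.processElement("events", 1);
        counter.finishBundle();
    }
}
```

   With this approach a bundle's counts are only logged if the bundle reaches
finish bundle, so the total comes from summing the logged values over all
bundles.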
   
   So when an attempt in YARN gets killed due to failures in setup, does
`commitOffsetsInFinalize` commit offsets for records that have not yet been
processed?

