jaagupku commented on issue #36284: URL: https://github.com/apache/beam/issues/36284#issuecomment-3345675902
Thank you for the quick reply. I tried `commitOffsetsInFinalize` and removed the offset tracking and committing from `ProcessRecordsDoFn`, but now it permanently skips some messages. I am also using the Spark runner.

To count processed messages, I initialize a `Map<TopicPartition, Long>` in start bundle, increment the value in the map in `processElement`, and log the map in finish bundle.

The job in YARN eventually fails due to failures in setup, and another attempt is started. When it stopped reading messages I restarted the job manually, and it had skipped some messages: I produced 20000 messages to the topic, but the database and the logs showed only 19255.

So when the attempt in YARN gets killed due to failures in setup, does `commitOffsetsInFinalize` commit offsets that have not been processed yet?
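For reference, the per-bundle counting described above can be sketched roughly as follows. This is a plain-Java illustration, not the actual `ProcessRecordsDoFn`: the Beam annotations are left out so that it runs standalone, and `BundleCounter`, the nested `TopicPartition` stand-in, and the `events` topic are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class BundleCounter {
    // Hypothetical stand-in for org.apache.kafka.common.TopicPartition,
    // so the sketch has no external dependencies.
    static final class TopicPartition {
        final String topic;
        final int partition;

        TopicPartition(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TopicPartition)) {
                return false;
            }
            TopicPartition other = (TopicPartition) o;
            return partition == other.partition && topic.equals(other.topic);
        }

        @Override
        public int hashCode() {
            return Objects.hash(topic, partition);
        }

        @Override
        public String toString() {
            return topic + "-" + partition;
        }
    }

    private Map<TopicPartition, Long> counts;

    // Mirrors the start-bundle step: a fresh map for every bundle.
    void startBundle() {
        counts = new HashMap<>();
    }

    // Mirrors processElement: one increment per record seen.
    void processElement(String topic, int partition) {
        counts.merge(new TopicPartition(topic, partition), 1L, Long::sum);
    }

    // Mirrors the finish-bundle step: log the per-partition counts and return them.
    Map<TopicPartition, Long> finishBundle() {
        System.out.println("Processed in this bundle: " + counts);
        return counts;
    }

    public static void main(String[] args) {
        BundleCounter counter = new BundleCounter();
        counter.startBundle();
        counter.processElement("events", 0);
        counter.processElement("events", 0);
        counter.processElement("events", 1);
        counter.finishBundle();
    }
}
```

Because the map is re-created for each bundle, the logged numbers are per-bundle and have to be summed across log lines to get an overall total.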
