ableegoldman opened a new pull request #8926:
URL: https://github.com/apache/kafka/pull/8926


   This should address at least some of the excessive TaskCorruptedExceptions 
we've been seeing lately. Basically, at the moment we only commit tasks if 
`commitNeeded` is true -- this seems true by definition. But the problem is we 
do some essential cleanup in `postCommit` that should always be done before a 
task is closed:
   
   1. clear the PartitionGroup
   2. write the checkpoint
   
   2 is actually fine to skip when `commitNeeded = false` with ALOS, as we will 
have already written a checkpoint during the last commit. But for EOS, we 
_only_ write the checkpoint before a close -- so even if there is no new 
pending data since the last commit, we have to write the current offsets. If we 
don't, the task will be assumed dirty and we will run into our friend the 
TaskCorruptedException during (re)initialization.
   
   To fix this, we should just always call `prepareCommit` and `postCommit` at 
the TaskManager level. Within the task, it can decide whether or not to 
actually do something in those methods based on `commitNeeded`. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to