zheliu2 opened a new pull request, #15569:
URL: https://github.com/apache/iceberg/pull/15569

   ## Summary
   
   Fixes #15452
   
   After a Kafka broker restart, threads in the Iceberg sink connector can hang 
and stop committing to the catalog while the data consumer group continues to 
progress. There are two root causes:
   
   - **Fatal producer exceptions silently swallowed**: `Coordinator.commit()` 
catches all exceptions including fatal Kafka producer errors 
(`ProducerFencedException`, `OutOfOrderSequenceException`, 
`InvalidProducerEpochException`). After such errors the producer is 
unrecoverable, but the coordinator keeps running with a broken producer, unable 
to send any events. This PR propagates fatal exceptions so the coordinator 
thread terminates and the Kafka Connect framework restarts the task.
   
   - **`Channel.stop()` can hang indefinitely**: `producer.close()`, 
`consumer.close()`, and `admin.close()` are called without timeouts. If the 
broker is unavailable, these calls can block forever, preventing the 
coordinator thread from completing shutdown. This PR adds a 30-second close 
timeout and wraps each close in try-catch so one failing client does not block 
the others.
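
To illustrate the first root cause, here is a minimal sketch of how a commit wrapper might rethrow fatal producer errors while still swallowing transient ones. It is not the code from this PR: the class `FatalErrorSketch`, the `commit(Runnable)` signature, and the cause-chain walk are assumptions for illustration, and the three exception classes are local stand-ins for the ones in `org.apache.kafka.common.errors` so the example compiles without `kafka-clients`.

```java
public class FatalErrorSketch {

    // Local stand-ins for the Kafka exception types (assumption: the real
    // code would import them from org.apache.kafka.common.errors instead).
    public static class ProducerFencedException extends RuntimeException {
        public ProducerFencedException(String msg) { super(msg); }
    }

    public static class OutOfOrderSequenceException extends RuntimeException {
        public OutOfOrderSequenceException(String msg) { super(msg); }
    }

    public static class InvalidProducerEpochException extends RuntimeException {
        public InvalidProducerEpochException(String msg) { super(msg); }
    }

    // Returns true if the throwable, or any of its causes, is one of the
    // errors after which a transactional producer cannot be reused.
    // Walking the cause chain is an assumption, not confirmed by the PR.
    public static boolean isFatalException(Throwable t) {
        for (Throwable cause = t; cause != null; cause = cause.getCause()) {
            if (cause instanceof ProducerFencedException
                    || cause instanceof OutOfOrderSequenceException
                    || cause instanceof InvalidProducerEpochException) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical commit wrapper: fatal errors propagate so the calling
    // thread terminates; anything else is logged and swallowed so the
    // next commit cycle can try again.
    public static void commit(Runnable doCommit) {
        try {
            doCommit.run();
        } catch (RuntimeException e) {
            if (isFatalException(e)) {
                throw e;
            }
            System.err.println("Commit failed, will retry on next cycle: " + e);
        }
    }
}
```

With this shape, a fenced producer surfaces as an exception from `commit`, while an ordinary failure leaves the caller running.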
   
   ## Changes
   
   - `Coordinator.java`: Add `isFatalException()` check in `commit()` to 
propagate fatal Kafka producer errors instead of swallowing them
   - `Channel.java`: Add close timeouts (30s) and individual try-catch for 
producer, consumer, and admin close operations
   - `TestCoordinator.java`: Add `testCommitFatalProducerError` verifying that 
`ProducerFencedException` propagates from `process()`
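
The `Channel.java` change can be pictured with the following sketch of closing several clients with a bounded timeout, each in its own try-catch. This is an illustration under assumptions, not the PR's code: `CloseSketch`, `TimedCloseable`, and `stopAll` are hypothetical names, and `TimedCloseable` stands in for the `close(Duration)` method that the Kafka producer, consumer, and admin clients each expose.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CloseSketch {

    // Hypothetical abstraction over a client that supports close(Duration).
    @FunctionalInterface
    public interface TimedCloseable {
        void close(Duration timeout) throws Exception;
    }

    // The PR description states a 30-second close timeout.
    public static final Duration CLOSE_TIMEOUT = Duration.ofSeconds(30);

    // Closes every client in iteration order. A failure in one close is
    // logged and recorded, and the remaining clients are still closed.
    // Returns the names of the clients whose close threw.
    public static List<String> stopAll(Map<String, TimedCloseable> clients) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, TimedCloseable> entry : clients.entrySet()) {
            try {
                entry.getValue().close(CLOSE_TIMEOUT);
            } catch (Exception e) {
                System.err.println("Failed to close " + entry.getKey() + ": " + e);
                failed.add(entry.getKey());
            }
        }
        return failed;
    }
}
```

A caller could pass a `LinkedHashMap` to control the order, for example `clients.put("producer", producer::close)`, assuming `producer` has a `close(Duration)` overload.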
   
   ## Test plan
   
   - [x] `testCommitFatalProducerError` - Verifies that fencing the producer 
causes `process()` to throw `ProducerFencedException`
   - [x] All existing `TestCoordinator` tests pass


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

