Kowshik Prakasam created KAFKA-10723:
----------------------------------------

             Summary: LogManager shutdown error handler doesn't shutdown all 
internal thread pools
                 Key: KAFKA-10723
                 URL: https://issues.apache.org/jira/browse/KAFKA-10723
             Project: Kafka
          Issue Type: Bug
            Reporter: Kowshik Prakasam


*TL;DR:*

The asynchronous shutdown in LogManager has the shortcoming that if during 
shutdown any of the internal futures fail, then we do not always ensure that 
all futures are completed before LogManager.shutdown returns. As a result, 
despite the shut down completed message from KafkaServer is seen in the error 
logs, some futures continue to run from inside LogManager attempting to close 
the logs.



*Description:*

When LogManager is shutting down, exceptions in log closure are handled 
[here|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogManager.scala#L497-L501].
 However, this 
[line|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogManager.scala#L502]
 in the finally clause may still throw an exception once again (for ex: if any 
of the tasks fail during thread pool shutdown):

threadPools.foreach(_.shutdown())

Note that ExecutorService.shutdown() initiates an orderly shutdown in which 
previously submitted tasks are executed, but no new tasks will be accepted (see 
javadoc link 
[here|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html#shutdown()]).
 It is possible that any of the other running threads that asynchronously close 
logs may throw an exception. When this happens, ExecutoreService.shutdown() can 
throw an exception. This situation is not handled inside the finally clause 
which means that some of the thread pools which are closing logs could be 
leaked and continue to run in the background, after the control returns to the 
caller (i.e. KafkaServer).

As a result, even after the `shut down completed` message is seen in the error 
logs, log closures continue to happen in the background. 
 

*Proposed fix:*

It seems useful that we maintain the contract with KafkaServer that after 
LogManager.shutdown is called once, all thread pools that close the logs are 
guaranteed to be attempted shutdown once before the control returns to the 
caller (either via an exception case or a success case). Supposing when 
multiple exceptions are raised when closing the logs (from different thread 
pools), probably we can just raise the first exception among these to the 
caller of LogManager.shutdown (i.e. KafkaServer ) while optionally logging the 
rest of the exceptions to stderr.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to