ritesh adval created AMQ-9747:
---------------------------------
Summary: Kahadb checkpoint runner thread dies without catching
exception
Key: AMQ-9747
URL: https://issues.apache.org/jira/browse/AMQ-9747
Project: ActiveMQ Classic
Issue Type: Bug
Components: KahaDB
Affects Versions: 6.1.7, 5.18.7, 6.1.6, 6.1.4
Reporter: ritesh adval
Attachments: image-2025-07-21-11-14-20-510.png
This bug affects all versions of ActiveMQ; I have added a few, starting with
5.18.4. This is a critical bug: the checkpoint runner thread is SILENTLY
KILLED (the exception is swallowed), so the KahaDB journal files KEEP GROWING
until you restart ActiveMQ. The bug is in the handling of IOException; see the
screenshot below. The catch block calls brokerService.handleIOException().
!image-2025-07-21-11-14-20-510.png!
If you look at the default IO exception handler, which is the one used here,
it throws SuppressReplyException at
[https://github.com/apache/activemq/blob/main/activemq-broker/src/main/java/org/apache/activemq/util/DefaultIOExceptionHandler.java#L165]
If stopStartConnectors is true (which is the case when you use
LeaseLockerIOExceptionHandler), it also throws SuppressReplyException at
[https://github.com/apache/activemq/blob/main/activemq-broker/src/main/java/org/apache/activemq/util/DefaultIOExceptionHandler.java#L155]
Because of this, the checkpoint runner thread shown in the screenshot above
dies silently, even though the broker is still running.
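The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual KahaDB code: it assumes the checkpoint task is run periodically by a ScheduledExecutorService (an assumption about MessageDatabase that this report does not confirm), and the names CheckpointDeathDemo, handleIOException and runDemo are hypothetical stand-ins. It shows that when a handler called from a catch block throws an unchecked exception, the periodic task is never run again and nothing is logged.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CheckpointDeathDemo {

    // Hypothetical stand-in for brokerService.handleIOException(): the handler
    // rethrows as an unchecked exception, as SuppressReplyException would be.
    static void handleIOException(IOException e) {
        throw new IllegalStateException("stand-in for SuppressReplyException", e);
    }

    // Schedules a periodic task whose second run hits a simulated IOException,
    // waits a while, and returns how many runs were started in total.
    static int runDemo() {
        AtomicInteger runs = new AtomicInteger();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            scheduler.scheduleWithFixedDelay(() -> {
                int n = runs.incrementAndGet();
                try {
                    if (n == 2) {
                        throw new IOException("simulated storage blip");
                    }
                } catch (IOException ioe) {
                    // The unchecked exception thrown here escapes run().
                    // The executor then suppresses all further executions.
                    handleIOException(ioe);
                }
            }, 0, 20, TimeUnit.MILLISECONDS);
            Thread.sleep(300); // long enough for many runs if the task survived
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("runs started: " + runDemo());
    }
}
```

In this sketch the task stops after its second run even though the JVM keeps going, which mirrors the symptom reported here: the broker stays up while checkpoints stop.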
The checkpoint runner should not be dying like this. We were using EFS as our
storage for KahaDB, and a blip in the connection to EFS caused an IOException
to be thrown in page.flush in MessageDatabase (we were using
LeaseLockerIOExceptionHandler). That triggered the DefaultIOExceptionHandler
logic that stops and restarts the connectors and throws SuppressReplyException,
as mentioned above, which silently killed the checkpoint runner while the
broker was still running.
We hit this in production. Our fix was to extend
LeaseLockerIOExceptionHandler, catch the exception thrown from the
handle(IOException ex) method, log it at WARN level, and not propagate it up to
the checkpoint runner thread. This is only a temporary fix, though. I am not
even sure CheckpointRunner needs to use DefaultIOExceptionHandler; either way,
it should not die silently.
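The workaround described above could look roughly like the sketch below. To keep it runnable without ActiveMQ on the classpath, it uses stand-ins: SuppressReplyStub and HandlerStub are hypothetical placeholders for SuppressReplyException and LeaseLockerIOExceptionHandler, and GuardedHandler, the swallowed counter and swallowedAfter exist only for illustration. In a real broker the subclass would extend org.apache.activemq.util.LeaseLockerIOExceptionHandler instead.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedHandlerDemo {

    // Hypothetical stand-in for SuppressReplyException (an unchecked exception).
    static class SuppressReplyStub extends RuntimeException {
        SuppressReplyStub(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical stand-in for LeaseLockerIOExceptionHandler: after doing its
    // work it always throws, as the report describes.
    static class HandlerStub {
        public void handle(IOException e) {
            throw new SuppressReplyStub("connectors stopped, restart pending", e);
        }
    }

    // The workaround: let the parent do its work, then log the unchecked
    // exception instead of letting it reach the calling thread.
    static class GuardedHandler extends HandlerStub {
        private static final Logger LOG = Logger.getLogger(GuardedHandler.class.getName());
        int swallowed; // illustration only: counts exceptions that were contained

        @Override
        public void handle(IOException e) {
            try {
                super.handle(e);
            } catch (RuntimeException ex) {
                swallowed++;
                LOG.log(Level.WARNING, "IOException handler threw; not propagating", ex);
            }
        }
    }

    // Feeds the guarded handler a number of simulated failures and returns how
    // many exceptions it contained.
    static int swallowedAfter(int failures) {
        GuardedHandler handler = new GuardedHandler();
        for (int i = 0; i < failures; i++) {
            handler.handle(new IOException("simulated storage blip " + i));
        }
        return handler.swallowed;
    }

    public static void main(String[] args) {
        System.out.println("exceptions contained: " + swallowedAfter(3));
    }
}
```

Note that this swallows the exception for every caller of the handler, not just the checkpoint runner, which is one reason to treat it as a stopgap rather than a proper fix.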