[ https://issues.apache.org/jira/browse/ARTEMIS-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007595#comment-17007595 ]
Justin Bertram commented on ARTEMIS-2586:
-----------------------------------------

Also, you've used the {{AMQP}} component on this JIRA but from the stack-traces it appears you're using the core protocol. Please clarify what protocol you're using.

> Inifinite Block in AMQ212054 after transient DB-error
> -----------------------------------------------------
>
>                 Key: ARTEMIS-2586
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2586
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: AMQP
>    Affects Versions: 2.10.1
>         Environment: This is Ubuntu 18.04 and Oracle DB, but don't think it's that relevant for the issue.
>            Reporter: Rico Neubauer
>            Priority: Major
>         Attachments: 2019-11-28_threaddump_01.txt, 2019-12-04_threaddump_01.txt, Message-Counts.png, initial-error.txt, log-extract.txt, writerIndex-Credits.PNG
>
>
> Hi,
> Would like to describe a quite severe situation which was experienced in a long-running test with 2 out of 3 instances/machines.
> We are running Karaf with Artemis 2.10.1.
> After some time (see screenshot), first one, then after a while a 2nd instance came to a complete stop.
> Looking into the logs and thread-dumps revealed the following (same for both instances):
> # There was a temporary problem connecting to the DB ({{connection reset by peer}} and {{Closed Connection}})
> # This resulted (due to handling on our side) in an {{IllegalStateException}}/{{Error during two phase commit}} being thrown back to Artemis.
> # After this, there is no messaging possible anymore at all and the following log repeats:
> {noformat}
> AMQ212054: Destination address=DLQ is blocked. If the system is configured to block make sure you consume messages on this configuration.{noformat}
> (system is not configured to block, see attached config)
> which comes from threads like these, trying to obtain credits for sending:
>
> {noformat}
> "Thread-93 (ActiveMQ-client-global-threads)" Id=2001 in TIMED_WAITING on lock=java.util.concurrent.Semaphore$NonfairSync@1f9a57e0
>     at sun.misc.Unsafe.park(Native Method)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1332)
>     at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:582)
>     at org.apache.activemq.artemis.core.client.impl.ClientProducerCreditsImpl.actualAcquire(ClientProducerCreditsImpl.java:73)
>     at org.apache.activemq.artemis.core.client.impl.AbstractProducerCreditsImpl.acquireCredits(AbstractProducerCreditsImpl.java:77)
>     at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.sendRegularMessage(ClientProducerImpl.java:301)
>     at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.doSend(ClientProducerImpl.java:275)
>     at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:128)
>     at org.apache.activemq.artemis.jms.client.ActiveMQMessageProducer.doSendx(ActiveMQMessageProducer.java:485)
>     at org.apache.activemq.artemis.jms.client.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:195)
>     at com.seeburger.engine.jms.MessageReceiverBase.sendToDLQ(MessageReceiverBase.java:571)
>     at com.seeburger.engine.jms.MessageReceiverBase.handleException(MessageReceiverBase.java:493)
>     at com.seeburger.engine.jms.MessageReceiverBase.onMessage(MessageReceiverBase.java:387)
>     at org.apache.activemq.artemis.jms.client.JMSMessageListenerWrapper.onMessage(JMSMessageListenerWrapper.java:110)
>     at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1031)
>     at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:50)
>     at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1154)
>     at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
>     at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
>     at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
>     at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$431/1769898766.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
>     Locked synchronizers: count = 1
>     - java.util.concurrent.ThreadPoolExecutor$Worker@bc49fcf
> {noformat}
> which will never succeed, since the credits seem to not suffice (see heap-dump screenshot)
> From my point of view, the thrown IllegalStateException should not lead to the system going into this non-recoverable state. What do you think, is there something that can be enhanced?
>
> [Fastthread-Link|https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjAvMDEvMy8tLTIwMTktMTItMDRfdGhyZWFkZHVtcF8wMS50eHQtLTEzLTM4LTE1OzstLTIwMTktMTEtMjhfdGhyZWFkZHVtcF8wMS50eHQtLTEzLTM4LTE1]
> In case it helps: The 2 instances are still in this state (since September) and I can fetch additional information or debug them on request.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)