[ https://issues.apache.org/jira/browse/AMQ-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507000#comment-14507000 ]
Christopher L. Shannon commented on AMQ-5712:
---------------------------------------------

Thanks, Timothy. I have tested this patch and it looks good. I will go ahead and apply it to our broker so we can use it for now.

> Broker can deadlock when using queues while producers wait on disk space
> ------------------------------------------------------------------------
>
>                 Key: AMQ-5712
>                 URL: https://issues.apache.org/jira/browse/AMQ-5712
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.11.1
>            Reporter: Christopher L. Shannon
>         Attachments: queue-cursor.patch
>
>
> I am experiencing a deadlock when using a Queue with non-persistent messages.
> The queue has a cursor high memory water mark set (right now at 70%). When
> a producer is producing messages quickly to the queue and that limit gets
> hit, the broker can deadlock. I have tried setting producerWindowSize and
> alwaysSyncSend, which did not seem to help. When the broker hits that limit,
> I am unable to do things like purge the queue. Consumers can deadlock as
> well.
> Note that this appears to be the same issue as described in AMQ-2475. The
> difference is that I am using a Queue and not a Topic, and the fix for that
> ticket appears to have been for Topics only.
> The problem appears to be in the Queue class on line 1852 inside the
> {{cursorAdd}} method. The method being called is
> {{return messages.addMessageLast(msg);}} which will block indefinitely if
> there is no space available, which in turn ties up the {{messagesLock}} so
> that no other thread can use it. We have seen a deadlock where consumers
> can't consume because they are waiting on this lock. It looks like in
> AMQ-2475 part of the fix was to replace {{messages.addMessageLast(msg)}} with
> {{messages.tryAddMessageLast(msg, 10)}}. I also noticed that not all of the
> message cursors support {{tryAddMessageLast}}, which could be a problem.
> {{FilePendingMessageCursor}} implements it but the rest of the cursors
> (notably {{StoreQueueCursor}}) simply delegate back to {{addMessageLast}} in
> the parent class. So part of this fix may require implementing
> {{tryAddMessageLast}} across more cursors.
> Here is part of the thread dump showing the stuck producer:
> {code}
> "ActiveMQ Transport: ssl:///192.168.3.142:38589" daemon prio=10 tid=0x00007fb46c006000 nid=0x3b1a runnable [0x00007fb4b8a0d000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for <0x00000000cfb13cd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
>       at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:103)
>       at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:90)
>       at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:80)
>       at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:235)
>       - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
>       at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:207)
>       - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
>       at org.apache.activemq.broker.region.cursors.StoreQueueCursor.addMessageLast(StoreQueueCursor.java:97)
>       - locked <0x00000000d1f20908> (a org.apache.activemq.broker.region.cursors.StoreQueueCursor)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
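
For readers following the description, here is a rough sketch of the pattern it discusses: waiting for space with a timeout and releasing the outer lock between attempts, instead of blocking indefinitely while holding it. This is not the attached queue-cursor.patch and not ActiveMQ code. The classes {{BoundedCursorSketch}} and {{QueueSketch}}, the {{maxAttempts}} parameter, and the string messages are hypothetical stand-ins. Only the names {{cursorAdd}}, {{messagesLock}}, and {{tryAddMessageLast}} and the 10 ms wait are borrowed from the description above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a pending-message cursor with a fixed capacity. */
class BoundedCursorSketch {
    private final Deque<String> pending = new ArrayDeque<>();
    private final int limit;
    private final ReentrantLock usageLock = new ReentrantLock();
    private final Condition spaceAvailable = usageLock.newCondition();

    BoundedCursorSketch(int limit) {
        this.limit = limit;
    }

    /** Waits at most timeoutMillis for space; returns false if the cursor is still full. */
    boolean tryAddMessageLast(String msg, long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        usageLock.lock();
        try {
            while (pending.size() >= limit) {
                if (nanos <= 0L) {
                    return false; // timed out, give the caller a chance to release its locks
                }
                nanos = spaceAvailable.awaitNanos(nanos);
            }
            pending.addLast(msg);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        } finally {
            usageLock.unlock();
        }
    }

    /** Removes the oldest message, if any, and wakes producers waiting for space. */
    String removeFirst() {
        usageLock.lock();
        try {
            String msg = pending.pollFirst();
            if (msg != null) {
                spaceAvailable.signalAll();
            }
            return msg;
        } finally {
            usageLock.unlock();
        }
    }
}

/** Hypothetical stand-in for a queue that guards its cursor with messagesLock. */
class QueueSketch {
    private final ReentrantLock messagesLock = new ReentrantLock();
    private final BoundedCursorSketch messages;

    QueueSketch(int limit) {
        this.messages = new BoundedCursorSketch(limit);
    }

    /** Retries the timed add, dropping messagesLock between attempts so consumers can run. */
    boolean cursorAdd(String msg, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            messagesLock.lock();
            try {
                if (messages.tryAddMessageLast(msg, 10)) {
                    return true;
                }
            } finally {
                messagesLock.unlock();
            }
        }
        return false;
    }

    /** Consumer path: needs the same messagesLock that the producer path takes. */
    String consume() {
        messagesLock.lock();
        try {
            return messages.removeFirst();
        } finally {
            messagesLock.unlock();
        }
    }
}
```

With a capacity of one, a second {{cursorAdd}} call gives up after its timed attempts rather than hanging, and succeeds once {{consume}} has freed space. Because the lock is released between attempts, a consumer or a purge is never shut out for longer than one timeout.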