[
https://issues.apache.org/jira/browse/QPID-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240716#comment-13240716
]
[email protected] commented on QPID-3911:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4546/
-----------------------------------------------------------
Review request for qpid, Robbie Gemmell, rajith attapattu, Keith Wall, and Rob
Godfrey.
Summary
-------
Invoking MessageConsumer#close() and Session#rollback() from a consumer
listener results in a deadlock on the 0-10 path and a timeout exception on the
0-9 path.
On the 0-10 path the deadlock is triggered by an ExecutionException that the
broker sends in response to the message.stop and message.flow commands (sent
from Session#rollback() to suspend the channel) when either command reaches
the broker after the message.cancel command sent by MessageConsumer#close().
The message.cancel command deletes the subscription on the broker side, so the
subscription cannot be found for the following message.stop or message.flow
command and an ExecutionException is reported:
org.apache.qpid.AMQException: ch=0 id=2 ExecutionException(errorCode=NOT_FOUND,
commandId=20, classCode=4, commandCode=12, fieldIndex=0, description=not-found:
Unknown destination 1 [email protected]
(qpid/broker/SemanticState.cpp:569), errorInfo={}) [error code 404: not found]
The ExecutionException is sent by both the Java and C++ brokers.
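The ordering problem described above can be sketched with a toy model. This is
not Qpid code: ToyBroker, its methods and the reply strings are hypothetical
names used only for illustration, and the model assumes nothing more than what
the summary states (message.cancel deletes the subscription, and a later
message.stop or message.flow for that destination is answered with not-found).

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of broker-side subscription handling; not Qpid code. */
public class SubscriptionOrderingSketch {

    /** Hypothetical stand-in for the broker's per-session subscription table. */
    static final class ToyBroker {
        private final Set<String> destinations = new HashSet<>();

        void subscribe(String destination) {
            destinations.add(destination);
        }

        // Models message.cancel: the subscription is deleted.
        void cancel(String destination) {
            destinations.remove(destination);
        }

        // Models message.stop / message.flow: both need a live subscription.
        String flowOrStop(String destination) {
            if (!destinations.contains(destination)) {
                return "not-found: Unknown destination " + destination;
            }
            return "ok";
        }
    }

    /** Replays the commands and returns the reply to the flow/stop command. */
    static String replay(boolean cancelArrivesFirst) {
        ToyBroker broker = new ToyBroker();
        broker.subscribe("1");
        if (cancelArrivesFirst) {
            broker.cancel("1");
        }
        return broker.flowOrStop("1");
    }

    public static void main(String[] args) {
        System.out.println(replay(false));
        System.out.println(replay(true));
    }
}
```

In this model the error appears only when the cancel is processed first, which
is the interleaving produced by close() racing with rollback().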
On receiving an ExecutionException, the connection tries to acquire the
FailoverMutex in order to close itself in the
AMQConnection#exceptionReceived method.
However, the FailoverMutex is already held by MessageConsumer#close(), which
is waiting for the session dispatcher lock to be released. That lock is held
by the dispatcher thread. Because the connection is being closed on the
dispatcher thread, the two threads wait on each other and deadlock.
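The lock-order inversion described above can be reproduced in miniature. The
sketch below is not Qpid code: the two locks merely borrow the names
failoverMutex and dispatcherLock, and lockInOrder and blockedThreads are
hypothetical helpers. The client in the thread dump uses plain monitors, which
wait forever; this sketch assumes ReentrantLock with a timed tryLock instead so
that it terminates and can report that neither thread obtained its second
lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Toy reproduction of the lock-order inversion; not Qpid code. */
public class LockInversionSketch {

    /** Returns how many of the two threads failed to get their second lock. */
    static int blockedThreads() {
        ReentrantLock failoverMutex = new ReentrantLock();
        ReentrantLock dispatcherLock = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Plays the role of "main" in close(): failover mutex, then dispatcher lock.
        Thread closer = new Thread(() -> lockInOrder(
                failoverMutex, dispatcherLock, bothHoldFirst, bothTried, blocked));
        // Plays the role of the dispatcher: dispatcher lock, then failover mutex.
        Thread dispatcher = new Thread(() -> lockInOrder(
                dispatcherLock, failoverMutex, bothHoldFirst, bothTried, blocked));

        closer.start();
        dispatcher.start();
        try {
            closer.join();
            dispatcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked.get();
    }

    private static void lockInOrder(ReentrantLock first, ReentrantLock second,
                                    CountDownLatch bothHoldFirst,
                                    CountDownLatch bothTried,
                                    AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until each thread holds its first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // A monitor would block here forever; the timed attempt gives up.
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep holding the first lock until the other thread has tried too.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads blocked on second lock: " + blockedThreads());
    }
}
```

Both threads time out because each holds the lock the other one needs, which
is the same cycle the JVM reports in the attached thread dump.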
The suggested patch introduces the following changes:
* Adds synchronization on AMQSession#_messageDeliveryLock to
  MessageConsumer#close() in order to block until any message listener in
  progress has completed (as required by the JMS javadoc for
  MessageConsumer#close()).
* Changes the session dispatcher to stop delivering messages into the
  consumer's local message queue if the consumer is in the process of closing.
  This eliminates the need to stop the dispatcher while rejecting pending
  messages for a closing consumer.
* Removes the synchronization on the session dispatcher lock from
  AMQSession.Dispatcher#rejectPending, along with the code that stops the
  dispatcher, since close() now synchronizes on the delivery lock and incoming
  messages are no longer dispatched to a closing consumer.
* Adds a system test to reproduce the deadlock.
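The first two changes listed above can be illustrated roughly as follows. This
is a sketch under stated assumptions, not the patch itself: ToyDispatcher,
ToyConsumer, newConsumer, dispatch and pending are hypothetical names, and the
single shared deliveryLock only stands in for the role the summary gives to
the message delivery lock. See the diff linked below for the actual change.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model of "skip closing consumers"; not the actual patch. */
public class ClosingConsumerSketch {

    static final class ToyConsumer {
        private final Object deliveryLock;
        private final AtomicBoolean closing = new AtomicBoolean();
        private final Queue<String> localQueue = new ArrayDeque<>();

        ToyConsumer(Object deliveryLock) {
            this.deliveryLock = deliveryLock;
        }

        void close() {
            // Mark first, so the dispatcher stops queueing for this consumer.
            closing.set(true);
            // Then wait for any delivery in progress before dropping messages.
            synchronized (deliveryLock) {
                localQueue.clear();
            }
        }

        int pending() {
            synchronized (deliveryLock) {
                return localQueue.size();
            }
        }
    }

    static final class ToyDispatcher {
        private final Object deliveryLock = new Object();

        ToyConsumer newConsumer() {
            return new ToyConsumer(deliveryLock);
        }

        /** Returns false when the message is skipped because of closing. */
        boolean dispatch(ToyConsumer consumer, String message) {
            synchronized (deliveryLock) {
                if (consumer.closing.get()) {
                    return false;
                }
                consumer.localQueue.add(message);
                return true;
            }
        }
    }

    public static void main(String[] args) {
        ToyDispatcher dispatcher = new ToyDispatcher();
        ToyConsumer consumer = dispatcher.newConsumer();
        System.out.println(dispatcher.dispatch(consumer, "m1"));
        consumer.close();
        System.out.println(dispatcher.dispatch(consumer, "m2"));
    }
}
```

Because Java monitors are reentrant, close() in this model can also be called
from code that already holds the delivery lock on the same thread without
blocking itself.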
This addresses bug QPID-3911.
https://issues.apache.org/jira/browse/QPID-3911
Diffs
-----
/trunk/qpid/java/client/src/main/java/org/apache/qpid/client/AMQSession.java
1306567
/trunk/qpid/java/client/src/main/java/org/apache/qpid/client/BasicMessageConsumer.java
1306567
/trunk/qpid/java/systests/src/main/java/org/apache/qpid/test/unit/close/MessageConsumerCloseTest.java
PRE-CREATION
Diff: https://reviews.apache.org/r/4546/diff
Testing
-------
Thanks,
Oleksandr
> Consumer.close() and session.rollback() deadlocks
> --------------------------------------------------
>
> Key: QPID-3911
> URL: https://issues.apache.org/jira/browse/QPID-3911
> Project: Qpid
> Issue Type: Bug
> Components: Java Client
> Affects Versions: 0.16
> Environment: 0.16 Java Client and Java Broker
> Reporter: Praveen Murugesan
> Assignee: Alex Rudyy
> Labels: java, qpidclient
> Attachments: DeadLockStackTraces.txt,
> QPID-3911-Fix-deadlock-on-concurrent-invocation-of-MessageConsumer-close-and-Session-rollback.patch,
> QpidConsumerCloseRollbackDeadlock.java
>
>
> Found one Java-level deadlock:
> =============================
> "Dispatcher-Channel-0":
> waiting to lock monitor 0x0000000001e65ec8 (object
> 0x00000007c180bd58, a java.lang.Object),
> which is held by "main"
> "main":
> waiting to lock monitor 0x0000000001cffbc8 (object
> 0x00000007c2e10c08, a java.lang.Object),
> which is held by "Dispatcher-Channel-0"
> Java stack information for the threads listed above:
> ===================================================
> "Dispatcher-Channel-0":
> at
> org.apache.qpid.client.AMQConnection.exceptionReceived(AMQConnection.java:1255)
> - waiting to lock <0x00000007c180bd58> (a java.lang.Object)
> at
> org.apache.qpid.client.AMQSession_0_10.setCurrentException(AMQSession_0_10.java:1057)
> at
> org.apache.qpid.client.AMQSession_0_10.sync(AMQSession_0_10.java:1034)
> at
> org.apache.qpid.client.AMQSession_0_10.sendSuspendChannel(AMQSession_0_10.java:851)
> at
> org.apache.qpid.client.AMQSession.suspendChannel(AMQSession.java:3075)
> - locked <0x00000007c2c3d330> (a java.lang.Object)
> at org.apache.qpid.client.AMQSession.rollback(AMQSession.java:1854)
> - locked <0x00000007c2c3d330> (a java.lang.Object)
> at
> QpidConsumerCloseRollbackDeadlock$QpidMqHandler.onMessage(QpidConsumerCloseRollbackDeadlock.java:208)
> at
> org.apache.qpid.client.BasicMessageConsumer.notifyMessage(BasicMessageConsumer.java:745)
> at
> org.apache.qpid.client.BasicMessageConsumer_0_10.notifyMessage(BasicMessageConsumer_0_10.java:141)
> at
> org.apache.qpid.client.BasicMessageConsumer.notifyMessage(BasicMessageConsumer.java:719)
> at
> org.apache.qpid.client.BasicMessageConsumer_0_10.notifyMessage(BasicMessageConsumer_0_10.java:186)
> at
> org.apache.qpid.client.BasicMessageConsumer_0_10.notifyMessage(BasicMessageConsumer_0_10.java:54)
> at
> org.apache.qpid.client.AMQSession$Dispatcher.notifyConsumer(AMQSession.java:3467)
> at
> org.apache.qpid.client.AMQSession$Dispatcher.dispatchMessage(AMQSession.java:3406)
> - locked <0x00000007c2c3d350> (a java.lang.Object)
> - locked <0x00000007c2e10c08> (a java.lang.Object)
> at
> org.apache.qpid.client.AMQSession$Dispatcher.access$1000(AMQSession.java:3180)
> at org.apache.qpid.client.AMQSession.dispatch(AMQSession.java:3173)
> at
> org.apache.qpid.client.message.UnprocessedMessage.dispatch(UnprocessedMessage.java:54)
> at
> org.apache.qpid.client.AMQSession$Dispatcher.run(AMQSession.java:3329)
> at java.lang.Thread.run(Thread.java:636)
> "main":
> at
> org.apache.qpid.client.AMQSession$Dispatcher.rejectPending(AMQSession.java:3211)
> - waiting to lock <0x00000007c2e10c08> (a java.lang.Object)
> at
> org.apache.qpid.client.AMQSession.confirmConsumerCancelled(AMQSession.java:903)
> at
> org.apache.qpid.client.BasicMessageConsumer_0_10.sendCancel(BasicMessageConsumer_0_10.java:170)
> at
> org.apache.qpid.client.BasicMessageConsumer.close(BasicMessageConsumer.java:593)
> - locked <0x00000007c180bd58> (a java.lang.Object)
> at
> org.apache.qpid.client.BasicMessageConsumer.close(BasicMessageConsumer.java:555)
> at
> QpidConsumerCloseRollbackDeadlock.main(QpidConsumerCloseRollbackDeadlock.java:77)
> Found 1 deadlock.