Alex Rudyy created QPID-6051:
--------------------------------
Summary: [Java Broker] When client socket is get closed after
transaction commit but before sending TxCommitOk back, a message might be
deleted from a store but left on queue causing later on unhandled
"AMQStoreException: Metadata not found" and broker crush
Key: QPID-6051
URL: https://issues.apache.org/jira/browse/QPID-6051
Project: Qpid
Issue Type: Bug
Components: Java Broker
Affects Versions: 0.28, 0.26, 0.24, 0.22, 0.20, 0.18, 0.30
Reporter: Alex Rudyy
Assignee: Alex Rudyy
If socket is get closed on sending back to a client command "TxCommitOk", the
broker tries to run rollback actions for already committed transaction. This
might cause a deletion of the message from a message store leaving the message
enqueued on a queue.
If later on, a consumer tries to consume such message, the broker cannot find
message metadata (as it was previously deleted) and crushes.
Here are the exception stack traces demonstrating the issue:
{noformat}
2014-08-08 12:16:00,843 ERROR [IoSender - /127.0.0.1:36625] util.Logger
(Logger.java:78) - error in write thread
java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at org.apache.qpid.transport.network.io.IoSender.run(IoSender.java:313)
at java.lang.Thread.run(Thread.java:722)
2014-08-08 12:16:00,848 ERROR [IoReceiver - /127.0.0.1:36625]
txn.LocalTransaction$1 (LocalTransaction.java:338) - Failed to commit
transaction
org.apache.qpid.transport.SenderClosedException: sender is closed
at org.apache.qpid.transport.network.io.IoSender.send(IoSender.java:114)
at org.apache.qpid.transport.network.io.IoSender.send(IoSender.java:40)
at
org.apache.qpid.server.protocol.AMQProtocolEngine.writeFrame(AMQProtocolEngine.java:570)
at
org.apache.qpid.server.handler.TxCommitHandler$1.run(TxCommitHandler.java:73)
at org.apache.qpid.server.AMQChannel$3.run(AMQChannel.java:975)
at
org.apache.qpid.server.txn.LocalTransaction$1.completeDeferredWork(LocalTransaction.java:333)
at
org.apache.qpid.server.txn.LocalTransaction$1.checkUnderlyingCompletion(LocalTransaction.java:321)
at
org.apache.qpid.server.txn.LocalTransaction$1.waitForCompletion(LocalTransaction.java:313)
at
org.apache.qpid.server.txn.LocalTransaction.sync(LocalTransaction.java:435)
at org.apache.qpid.server.AMQChannel.sync(AMQChannel.java:1654)
at
org.apache.qpid.server.AMQChannel.receivedComplete(AMQChannel.java:221)
at
org.apache.qpid.server.protocol.AMQProtocolEngine.receiveComplete(AMQProtocolEngine.java:273)
at
org.apache.qpid.server.protocol.AMQProtocolEngine.received(AMQProtocolEngine.java:256)
at
org.apache.qpid.server.protocol.AMQProtocolEngine.received(AMQProtocolEngine.java:85)
at
org.apache.qpid.server.protocol.MultiVersionProtocolEngine.received(MultiVersionProtocolEngine.java:118)
at
org.apache.qpid.server.protocol.MultiVersionProtocolEngine.received(MultiVersionProtocolEngine.java:37)
at
org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:152)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at org.apache.qpid.transport.network.io.IoSender.run(IoSender.java:313)
... 1 more
2014-08-08 12:16:00,849 ERROR [IoReceiver - /127.0.0.1:36625]
protocol.AMQProtocolEngine (AMQProtocolEngine.java:260) - Unexpected exception
when processing datablock
java.lang.RuntimeException: Failed to commit transaction
at
org.apache.qpid.server.txn.LocalTransaction$1.completeDeferredWork(LocalTransaction.java:341)
at
org.apache.qpid.server.txn.LocalTransaction$1.checkUnderlyingCompletion(LocalTransaction.java:321)
at
org.apache.qpid.server.txn.LocalTransaction$1.waitForCompletion(LocalTransaction.java:313)
at
org.apache.qpid.server.txn.LocalTransaction.sync(LocalTransaction.java:435)
at org.apache.qpid.server.AMQChannel.sync(AMQChannel.java:1654)
at
org.apache.qpid.server.AMQChannel.receivedComplete(AMQChannel.java:221)
at
org.apache.qpid.server.protocol.AMQProtocolEngine.receiveComplete(AMQProtocolEngine.java:273)
at
org.apache.qpid.server.protocol.AMQProtocolEngine.received(AMQProtocolEngine.java:256)
at
org.apache.qpid.server.protocol.AMQProtocolEngine.received(AMQProtocolEngine.java:85)
at
org.apache.qpid.server.protocol.MultiVersionProtocolEngine.received(MultiVersionProtocolEngine.java:118)
at
org.apache.qpid.server.protocol.MultiVersionProtocolEngine.received(MultiVersionProtocolEngine.java:37)
at
org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:152)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.qpid.transport.SenderClosedException: sender is closed
at org.apache.qpid.transport.network.io.IoSender.send(IoSender.java:114)
at org.apache.qpid.transport.network.io.IoSender.send(IoSender.java:40)
at
org.apache.qpid.server.protocol.AMQProtocolEngine.writeFrame(AMQProtocolEngine.java:570)
at
org.apache.qpid.server.handler.TxCommitHandler$1.run(TxCommitHandler.java:73)
at org.apache.qpid.server.AMQChannel$3.run(AMQChannel.java:975)
at
org.apache.qpid.server.txn.LocalTransaction$1.completeDeferredWork(LocalTransaction.java:333)
... 12 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at org.apache.qpid.transport.network.io.IoSender.run(IoSender.java:313)
... 1 more
2014-08-08 12:45:14,173 ERROR [pool-4-thread-3] server.Main$1 (Main.java:367) -
Uncaught exception, shutting down.
java.lang.RuntimeException: org.apache.qpid.AMQStoreException: Metadata not
found for message with id 4204052254 [error code 541: internal error]
at
org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore$StoredBDBMessage.getMetaData(AbstractBDBMessageStore.java:1617)
at
org.apache.qpid.server.message.AMQMessage.getMessageMetaData(AMQMessage.java:93)
at
org.apache.qpid.server.message.AMQMessage.getMessagePublishInfo(AMQMessage.java:155)
at
org.apache.qpid.server.output.ProtocolOutputConverterImpl.createEncodedDeliverBody(ProtocolOutputConverterImpl.java:207)
at
org.apache.qpid.server.output.ProtocolOutputConverterImpl.writeDeliver(ProtocolOutputConverterImpl.java:70)
at
org.apache.qpid.server.protocol.AMQProtocolEngine$WriteDeliverMethod.deliverToClient(AMQProtocolEngine.java:1512)
at
org.apache.qpid.server.subscription.SubscriptionImpl.sendToClient(SubscriptionImpl.java:683)
at
org.apache.qpid.server.subscription.SubscriptionImpl$AckSubscription.send(SubscriptionImpl.java:293)
at
org.apache.qpid.server.queue.SimpleAMQQueue.deliverMessage(SimpleAMQQueue.java:877)
at
org.apache.qpid.server.queue.SimpleAMQQueue.attemptDelivery(SimpleAMQQueue.java:1710)
at
org.apache.qpid.server.queue.SimpleAMQQueue.processQueue(SimpleAMQQueue.java:1877)
at org.apache.qpid.server.queue.QueueRunner.run(QueueRunner.java:71)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.qpid.AMQStoreException: Metadata not found for message
with id 4204052254 [error code 541: internal error]
at
org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore.getMessageMetaData(AbstractBDBMessageStore.java:1419)
at
org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore$StoredBDBMessage.getMetaData(AbstractBDBMessageStore.java:1613)
... 14 more
{noformat}
It seems we should simply stop trying to rollback transactions if they fail
after they have actually committed and allow the underlying exception to be
re-thrown
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]