[ 
https://issues.apache.org/jira/browse/QPID-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090521#comment-14090521
 ] 

ASF subversion and git services commented on QPID-5974:
-------------------------------------------------------

Commit 1616703 from [~aconway] in branch 'qpid/trunk'
[ https://svn.apache.org/r1616703 ]

QPID-5974: HA qpid-txtest2 can bring down a cluster (JERR_MAP_LOCKED)

Problem: transactional dequeues can be sent via two paths: as part of the
transaction, and via the normal queue replication. If a journal is involved,
this can result in store errors when the normal replication path attempts
the dequeue before the transaction does.

Solution: the same is true of enqueues, and we already have code in place
to skip replication of tx enqueues via the normal route. Copied the same
logic for dequeues.
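
The idea of skipping transactional operations on the normal replication
route can be illustrated with a minimal sketch. This is not the code from
the commit: Op, Event, NormalPathFilter and normalPath are hypothetical
names invented for illustration, and the real logic lives in the qpid HA
replication classes. The sketch only assumes that the primary can tell
which enqueues and dequeues belong to a transaction that is still
replicating them via the transaction route.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model types, not part of the qpid source tree.
enum class Op { Enqueue, Dequeue };

struct Event {
    Op op;
    std::uint64_t messageId;
    bool operator==(const Event& o) const {
        return op == o.op && messageId == o.messageId;
    }
};

// Tracks operations that are being replicated as part of a transaction,
// so the normal queue replication route can skip them.
class NormalPathFilter {
  public:
    // Called when a transaction takes ownership of an operation.
    void beginTxOp(Op op, std::uint64_t id) {
        pending_.insert(std::make_pair(op, id));
    }
    // Called once the transaction route no longer owns the operation.
    void endTxOp(Op op, std::uint64_t id) {
        pending_.erase(std::make_pair(op, id));
    }
    // True if the normal route should forward this event to backups.
    bool shouldReplicate(const Event& e) const {
        return pending_.count(std::make_pair(e.op, e.messageId)) == 0;
    }
  private:
    std::set<std::pair<Op, std::uint64_t> > pending_;
};

// Returns the events the normal route would forward, in order.
std::vector<Event> normalPath(const std::vector<Event>& events,
                              const NormalPathFilter& filter) {
    std::vector<Event> out;
    for (const Event& e : events) {
        if (filter.shouldReplicate(e)) out.push_back(e);
    }
    return out;
}
```

With a filter like this, a dequeue that a transaction has registered is
dropped from the normal route while ordinary enqueues and dequeues pass
through unchanged, which mirrors the treatment the commit message
describes for tx enqueues.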

> HA qpid-txtest2 can bring down a cluster (JERR_MAP_LOCKED)
> ----------------------------------------------------------
>
>                 Key: QPID-5974
>                 URL: https://issues.apache.org/jira/browse/QPID-5974
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Clustering
>    Affects Versions: 0.28
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>
> Description of problem:
> A qpid-txtest2 AMQP 0-10 transactional and durable transfer operation can
> bring down the whole qpid HA cluster.  Note no brokers were killed, just the
> txtest was run.
> To reproduce:
> 3 node cluster 
> while qpid-txtest2 -b 20.0.20.200 --tx-count 500 --queues 10 
> --messages-per-tx 10 --total-messages 1000 --durable 1
> Result: 
> Test fails. Broker logs show critical and error messages like this:
> {noformat}
> [root@dhcp-lab-A ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:33 [Protocol] error Connection 
> qpid.192.168.6.246:5672-192.168.6.247:34210 timed out: closing
> [root@dhcp-lab-B ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1: 
> Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a 
> pending transaction. (drid=0x6da3) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:23 [Protocol] error Connection 
> qpid.ha.link.09e80392-0c79-4239-a1d0-ea5b53c71bd9 closed by error: Backup of 
> tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: 
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID 
> locked by a pending transaction. (drid=0x6da3) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-1: 
> Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a 
> pending transaction. (drid=0x7a42) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:38 [Protocol] error Connection 
> qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of 
> tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: 
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID 
> locked by a pending transaction. (drid=0x7a42) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-10: 
> Replication failed: Queue tx-test2-10: async_dequeue() failed: jexception 
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a 
> pending transaction. (drid=0x7a43) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:38 [Protocol] error Connection 
> qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of 
> tx-test2-10: Replication failed: Queue tx-test2-10: async_dequeue() failed: 
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID 
> locked by a pending transaction. (drid=0x7a43) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:40 [HA] critical Shutting down: Backup of tx-test2-7: 
> Replication failed: Queue tx-test2-7: async_dequeue() failed: jexception 
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a 
> pending transaction. (drid=0x7a49) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:40 [Protocol] error Connection 
> qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835 closed by error: Backup of 
> tx-test2-7: Replication failed: Queue tx-test2-7: async_dequeue() failed: 
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID 
> locked by a pending transaction. (drid=0x7a49) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:40 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:11:10 [HA] error Backup: Joining active cluster, cannot be 
> promoted.
> [root@dhcp-lab-C ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1: 
> Replication failed: Queue tx-test2-1: async_dequeue() failed: jexception 
> 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by a 
> pending transaction. (drid=0x53a3) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:23 [Protocol] error Connection 
> qpid.ha.link.1bb57f0a-48db-460c-9260-0f5b353e4bd1 closed by error: Backup of 
> tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue() failed: 
> jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID 
> locked by a pending transaction. (drid=0x53a3) 
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
>  
> (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 243b4279: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 4f4a25df: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 80cbe9af: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction a3ed917a: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction b7a4b9a0: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction b9ba9995: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction cbd0d6bf: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction e127288a: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction eb43e683: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction f29196c1: Destroyed 
> prematurely, rollback
> 2014-07-24 14:10:53 [HA] error Backup: Still catching up, cannot be promoted.
> {noformat}


