[ 
https://issues.apache.org/jira/browse/ARTEMIS-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Bertram resolved ARTEMIS-480.
------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0

> [Artemis Testsuite] 
> BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure fails due to 
> racing condition
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-480
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-480
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 1.1.0, 1.2.0, 1.3.0
>            Reporter: Ingo Weiss
>             Fix For: 1.3.0
>
>
> {code}
> java.lang.AssertionError: Delivering count of a source queue should be zero on connection failure expected:<0> but was:<1>
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.failNotEquals(Assert.java:743)
>       at org.junit.Assert.assertEquals(Assert.java:118)
>       at org.junit.Assert.assertEquals(Assert.java:555)
>       at org.apache.activemq.artemis.tests.integration.cluster.bridge.BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure(BridgeReconnectTest.java:688)
> {code}
> {code}
> 18:25:43,722 WARN  [org.apache.activemq.artemis.core.server] AMQ222094: Bridge unable to send message Reference[22]:NON-RELIABLE:ServerMessage[messageID=22,durable=false,userID=null,priority=4, bodySize=79, timestamp=Tue Feb 02 18:25:43 EST 2016,expiration=0, durable=false, address=testAddress,properties=TypedProperties[propkey=18,_AMQ_BRIDGE_DUP=[47A6 779A CA04 11E5 9C91 A169 FCCB 5522 0000 0000 0000 0016)]]@1861263739, will try again once bridge reconnects: ActiveMQObjectClosedException[errorType=OBJECT_CLOSED message=AMQ119018: Producer is closed]
>       at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.checkClosed(ClientProducerImpl.java:298) [artemis-core-client-1.1.0.wildfly-012.jar:]
>       at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:122) [artemis-core-client-1.1.0.wildfly-012.jar:]
>       at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.deliverStandardMessage(BridgeImpl.java:698) [artemis-server-1.1.0.wildfly-012.jar:]
>       at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.handle(BridgeImpl.java:574) [artemis-server-1.1.0.wildfly-012.jar:]
>       at org.apache.activemq.artemis.core.server.impl.QueueImpl.handle(QueueImpl.java:2410) [artemis-server-1.1.0.wildfly-012.jar:]
>       at org.apache.activemq.artemis.core.server.impl.QueueImpl.deliver(QueueImpl.java:1813) [artemis-server-1.1.0.wildfly-012.jar:]
>       at org.apache.activemq.artemis.core.server.impl.QueueImpl.access$1400(QueueImpl.java:97) [artemis-server-1.1.0.wildfly-012.jar:]
>       at org.apache.activemq.artemis.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:2581) [artemis-server-1.1.0.wildfly-012.jar:]
>       at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:100) [artemis-core-client-1.1.0.wildfly-012.jar:]
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153) [rt.jar:1.8.0-internal]
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [rt.jar:1.8.0-internal]
>       at java.lang.Thread.run(Thread.java:785) [vm.jar:1.8.0-internal]
> {code}
> I investigated this issue and found the race condition that causes the 
> failure. The problem lies in \[1\]. When the bridge detects a problem, it 
> calls the {{connectionFailed}} method, which calls 
> {{Queue.cancel(ref, timeBase)}} for every message in {{refs}}. 
> {{Queue.cancel}} decreases {{deliveryCount}} for the canceled message. 
> However, before this step we remove the reference to the current message 
> from {{refs}} on line 20, so {{deliveryCount}} is not decreased for this 
> message. This is correct behavior, because we return {{HandleStatus.BUSY}} 
> for this message. I think the problem is in the {{QueueImpl#deliver}} 
> method: if the bridge returns {{HandleStatus.BUSY}}, we should decrease 
> {{deliveryCount}}. So I think that instead of \[2\] there should be \[3\].
> \[1\]
> {code:language=java|linenumbers=true}
> private HandleStatus deliverStandardMessage(SimpleString dest, final MessageReference ref, ServerMessage message) { // in BridgeImpl
>       // if we failover during send then there is a chance
>       // that this will throw a disconnect, we need to remove the message
>       // from the acks so it will get resent, duplicate detection will cope
>       // with any messages resent
>       if (ActiveMQServerLogger.LOGGER.isTraceEnabled()) {
>          ActiveMQServerLogger.LOGGER.trace("going to send message: " + message + " from " + this.getQueue());
>       }
>       try {
>          producer.send(dest, message);
>       }
>       catch (final ActiveMQException e) {
>          ActiveMQServerLogger.LOGGER.bridgeUnableToSendMessage(e, ref);
>          synchronized (refs) {
>             // We remove this reference as we are returning busy which means the reference will never leave the Queue.
>             // because of this we have to remove the reference here
>             refs.remove(message.getMessageID());
>          }
>          connectionFailed(e, false);
>          return HandleStatus.BUSY;
>       }
>       return HandleStatus.HANDLED;
>    }
> {code}
> \[2\]
> {code}
> else if (status == HandleStatus.BUSY) {
>     holder.iter.repeat();
>     noDelivery++;
> }
> {code}
> \[3\]
> {code}
> else if (status == HandleStatus.BUSY) {
>     decDelivering();
>     holder.iter.repeat();
>     noDelivery++;
> }
> {code}
> Steps to reproduce:
> 1. {{cd tests}}
> 2. {{while true; do mvn -Dtest=BridgeReconnectTest#testDeliveringCountOnBridgeConnectionFailure -Ptests -DfailIfNoTests=false test; done}}
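
The accounting problem described in the report can be illustrated with a small, single-threaded toy model. This is only a sketch, not Artemis code: ToyQueue, ToyBridge, the decrementOnBusy flag and the point at which the counter is incremented are all assumptions made for illustration. It replays the reported interleaving deterministically: one message is in flight, the producer closes, and the next send fails and returns BUSY after its reference has already been dropped from refs.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model (hypothetical names, not the Artemis classes) of the
// delivering-count leak described in the report.
public class DeliveringCountSketch {

    enum HandleStatus { HANDLED, BUSY }

    static final class ToyQueue {
        final AtomicInteger delivering = new AtomicInteger();
        final boolean decrementOnBusy; // true models proposal [3], false models [2]

        ToyQueue(boolean decrementOnBusy) {
            this.decrementOnBusy = decrementOnBusy;
        }

        // Assumption: a cancel gives back one unit of the delivering count.
        void cancel(long id) {
            delivering.decrementAndGet();
        }

        void deliver(long id, ToyBridge bridge) {
            // Assumption: the message is counted as "delivering" once handed over.
            delivering.incrementAndGet();
            HandleStatus status = bridge.handle(id, this);
            if (status == HandleStatus.BUSY && decrementOnBusy) {
                delivering.decrementAndGet(); // the extra decrement proposed in [3]
            }
        }
    }

    static final class ToyBridge {
        final Set<Long> refs = new HashSet<>();
        boolean producerClosed;

        HandleStatus handle(long id, ToyQueue queue) {
            refs.add(id);
            if (producerClosed) {
                refs.remove(id);         // failing reference is dropped first
                connectionFailed(queue); // so the cancel loop never sees it
                return HandleStatus.BUSY;
            }
            return HandleStatus.HANDLED;
        }

        void connectionFailed(ToyQueue queue) {
            for (long pending : refs) {
                queue.cancel(pending);
            }
            refs.clear();
        }
    }

    // Replays the scenario and returns the final delivering count.
    static int run(boolean decrementOnBusy) {
        ToyQueue queue = new ToyQueue(decrementOnBusy);
        ToyBridge bridge = new ToyBridge();
        queue.deliver(1L, bridge);    // sent, still in flight
        bridge.producerClosed = true; // connection goes away
        queue.deliver(2L, bridge);    // send fails, bridge returns BUSY
        return queue.delivering.get();
    }

    public static void main(String[] args) {
        System.out.println("without decrement on BUSY: " + run(false)); // prints 1
        System.out.println("with decrement on BUSY:    " + run(true));  // prints 0
    }
}
```

In this model the leftover count of 1 without the extra decrement matches the "expected:<0> but was:<1>" assertion above, and the count returns to 0 once BUSY is treated as "not delivering". Whether the real QueueImpl should decrement in exactly this place is the reporter's proposal, not something the sketch can confirm.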



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
