[ https://issues.apache.org/jira/browse/ARTEMIS-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Justin Bertram resolved ARTEMIS-480. ------------------------------------ Resolution: Fixed Fix Version/s: 1.3.0 > [Artemis Testsuite] > BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure fails due to > racing condition > ------------------------------------------------------------------------------------------------------------------ > > Key: ARTEMIS-480 > URL: https://issues.apache.org/jira/browse/ARTEMIS-480 > Project: ActiveMQ Artemis > Issue Type: Bug > Components: Broker > Affects Versions: 1.1.0, 1.2.0, 1.3.0 > Reporter: Ingo Weiss > Fix For: 1.3.0 > > > {code} > java.lang.AssertionError: Delivering count of a source queue should be zero > on connection failure expected:<0> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.activemq.artemis.tests.integration.cluster.bridge.BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure(BridgeReconnectTest.java:688) > {code} > {code} > 18:25:43,722 WARN [org.apache.activemq.artemis.core.server] AMQ222094: > Bridge unable to send message > Reference[22]:NON-RELIABLE:ServerMessage[messageID=22,durable=false,userID=null,priority=4, > bodySize=79, timestamp=Tue Feb 02 18:25:43 EST 2016,expiration=0, > durable=false, > address=testAddress,properties=TypedProperties[propkey=18,_AMQ_BRIDGE_DUP=[47A6 > 779A CA04 11E5 9C91 A169 FCCB 5522 0000 0000 0000 0016)]]@1861263739, will > try again once bridge reconnects: > ActiveMQObjectClosedException[errorType=OBJECT_CLOSED message=AMQ119018: > Producer is closed] > at > org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.checkClosed(ClientProducerImpl.java:298) > [artemis-core-client-1.1.0.wildfly-012.jar:] > at > org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:122) > [artemis-core-client-1.1.0.wildfly-012.jar:] > at > org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.deliverStandardMessage(BridgeImpl.java:698) > [artemis-server-1.1.0.wildfly-012.jar:] > at > org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.handle(BridgeImpl.java:574) > [artemis-server-1.1.0.wildfly-012.jar:] > at > org.apache.activemq.artemis.core.server.impl.QueueImpl.handle(QueueImpl.java:2410) > [artemis-server-1.1.0.wildfly-012.jar:] > at > org.apache.activemq.artemis.core.server.impl.QueueImpl.deliver(QueueImpl.java:1813) > [artemis-server-1.1.0.wildfly-012.jar:] > at > org.apache.activemq.artemis.core.server.impl.QueueImpl.access$1400(QueueImpl.java:97) > [artemis-server-1.1.0.wildfly-012.jar:] > at > org.apache.activemq.artemis.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:2581) > [artemis-server-1.1.0.wildfly-012.jar:] > at > org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:100) > [artemis-core-client-1.1.0.wildfly-012.jar:] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153) > [rt.jar:1.8.0-internal] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [rt.jar:1.8.0-internal] > at java.lang.Thread.run(Thread.java:785) [vm.jar:1.8.0-internal] > {code} > I've investigated this issue and I found the race condition which causes > mentioned fail. Problem lies in \[1\]. When bridge detects some problem, it > calls {{connectionFailed}} method which call for every message in {{refs}} > the {{Queue.cancel(ref, timeBase)}}. {{Queue.cancel}} decreases > {{deliveryCount}} for canceled message. However before this step, we remove > reference on actual message from {{refs}} on line 20, so for this message the > {{deliveryCount}} is not decreased. This is correct behavior, because for > this message we return {{HandleStatus.BUSY}}. I think that problem is in > {{QueueImpl#deliver}} method. If bridge returns {{HandleStatus.BUSY}} we > should decrease {{deliveryCount}}. So I think that instead of \[2\], there > should be \[3\]. > \[1\] > {code:language=java|linenumbers=true} > private BridgeImpl#HandleStatus deliverStandardMessage(SimpleString dest, > final MessageReference ref, ServerMessage message) { > // if we failover during send then there is a chance that the > // that this will throw a disconnect, we need to remove the message > // from the acks so it will get resent, duplicate detection will cope > // with any messages resent > if (ActiveMQServerLogger.LOGGER.isTraceEnabled()) { > ActiveMQServerLogger.LOGGER.trace("going to send message: " + > message + " from " + this.getQueue()); > } > try { > producer.send(dest, message); > } > catch (final ActiveMQException e) { > ActiveMQServerLogger.LOGGER.bridgeUnableToSendMessage(e, ref); > synchronized (refs) { > // We remove this reference as we are returning busy which means > the reference will never leave the Queue. > // because of this we have to remove the reference here > refs.remove(message.getMessageID()); > } > connectionFailed(e, false); > return HandleStatus.BUSY; > } > return HandleStatus.HANDLED; > } > {code} > \[2\] > {code} > else if (status == HandleStatus.BUSY) { > holder.iter.repeat(); > noDelivery++; > } > {code} > \[3\] > {code} > else if (status == HandleStatus.BUSY) { > decDelivering(); > holder.iter.repeat(); > noDelivery++; > } > {code} > Steps to reproduce: > 1. {{cd tests}} > 2. {{while true; do mvn > -Dtest=BridgeReconnectTest#testDeliveringCountOnBridgeConnectionFailure > -Ptests -DfailIfNoTests=false test; done}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)