[ https://issues.apache.org/jira/browse/BEAM-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122193#comment-16122193 ]
Alex Filatov commented on BEAM-2684:
------------------------------------

I can reproduce the issue by commenting out Thread.sleep on line #74 or by adding one on line #105.

{code}
org.apache.qpid.proton.messenger.impl.MessengerImpl processAllConnectors
SEVERE: Error processing connection
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.qpid.proton.driver.impl.ConnectorImpl.process(ConnectorImpl.java:89)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.processAllConnectors(MessengerImpl.java:687)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:863)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:417)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:394)
    at org.apache.beam.sdk.io.amqp.AmqpIOTest$1.run(AmqpIOTest.java:82)
{code}

Relevant bits of thread dump:

{code:java}
"direct-runner-worker" #15 prio=5 os_prio=31 tid=0x00007fd2c8270000 nid=0x5303 runnable [0x000070000aa00000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
    at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
    at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000007404c0dd0> (a sun.nio.ch.Util$3)
    - locked <0x00000007404c0dc0> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000007404c0c20> (a sun.nio.ch.KQueueSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
    at org.apache.qpid.proton.driver.impl.DriverImpl.doWait(DriverImpl.java:81)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:894)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.recv(MessengerImpl.java:446)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.recv(MessengerImpl.java:451)
    at org.apache.beam.sdk.io.amqp.AmqpIO$UnboundedAmqpReader.advance(AmqpIO.java:330)

"Thread-1" #11 prio=5 os_prio=31 tid=0x00007fd2c7a4f000 nid=0x1407 runnable [0x000070000a5f5000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
    at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
    at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x000000074000aa98> (a sun.nio.ch.Util$3)
    - locked <0x000000074000aaa8> (a java.util.Collections$UnmodifiableSet)
    - locked <0x000000074000aa48> (a sun.nio.ch.KQueueSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
    at org.apache.qpid.proton.driver.impl.DriverImpl.doWait(DriverImpl.java:81)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:894)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:417)
    at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:394)
    at org.apache.beam.sdk.io.amqp.AmqpIOTest$1.run(AmqpIOTest.java:82)
{code}

I think the root cause is a race between the sender and the receiver. In peer-to-peer mode (without a broker), the receiver must be running before the sender starts sending messages. When that is not the case, the sender seems to get stuck: it suppresses the 'Connection refused' exception and then waits indefinitely for a connection event on an invalid socket channel.
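To illustrate the ordering requirement, here is a minimal sketch that uses plain java.net sockets as a stand-in for the Proton Messenger. It is not the AmqpIOTest code, and the class and method names (ReceiverFirstSketch, connectFailsWithoutReceiver, sendAfterReceiverReady) are hypothetical. It assumes the race could be closed by having the receiver signal readiness explicitly (here with a CountDownLatch) instead of relying on timing, and by bounding every wait so that a failure surfaces as an error rather than a hang.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a peer-to-peer sender/receiver pair; not Proton code.
class ReceiverFirstSketch {

  /** Tries to connect while nothing listens on the port; returns true if that fails. */
  public static boolean connectFailsWithoutReceiver() {
    int port;
    try (ServerSocket probe = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
      port = probe.getLocalPort(); // reserve a free port; it is released on close
    } catch (IOException e) {
      throw new IllegalStateException("could not reserve a port", e);
    }
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 2000);
      return false; // something else grabbed the port in the meantime
    } catch (IOException e) {
      return true; // typically java.net.ConnectException: Connection refused
    }
  }

  /** Starts the receiver, waits until it is listening, then sends one line to it. */
  public static String sendAfterReceiverReady(String payload) {
    CountDownLatch listening = new CountDownLatch(1);
    CountDownLatch done = new CountDownLatch(1);
    AtomicInteger port = new AtomicInteger();
    AtomicReference<String> received = new AtomicReference<>();

    Thread receiver = new Thread(() -> {
      try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
        port.set(server.getLocalPort());
        listening.countDown(); // the socket is bound and listening from here on
        try (Socket peer = server.accept();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8))) {
          received.set(in.readLine());
        }
      } catch (IOException e) {
        e.printStackTrace();
      } finally {
        done.countDown();
      }
    });
    receiver.setDaemon(true);
    receiver.start();

    try {
      // Bounded wait on an explicit readiness signal: fail loudly rather than hang.
      if (!listening.await(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("receiver did not start listening");
      }
      try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port.get());
          PrintWriter out = new PrintWriter(
              new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
        out.println(payload);
      }
      if (!done.await(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("receiver did not finish");
      }
      return received.get();
    } catch (IOException e) {
      throw new IllegalStateException("send failed", e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("connect without receiver fails: " + connectFailsWithoutReceiver());
    System.out.println("received: " + sendAfterReceiverReady("hello"));
  }
}
```

The first method mirrors the failing order (sender first), where a plain socket reports the refusal immediately instead of swallowing it. The second mirrors the order that works (receiver first). Whether the test can hook a comparable readiness signal into the reader is an open question; this only shows the shape of the idea.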
> AmqpIOTest is flaky
> -------------------
>
>                 Key: BEAM-2684
>                 URL: https://issues.apache.org/jira/browse/BEAM-2684
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>            Reporter: Eugene Kirpichov
>            Assignee: Jean-Baptiste Onofré
>
> This test is often timing out, and has been doing that for a while, causing
> unrelated PRs to fail randomly. I've gotten into the habit of excluding
> sdks/java/io/amqp when running "mvn verify" and I suppose it's not a good
> habit :)
> Example failure:
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/13424/console