[ https://issues.apache.org/jira/browse/BEAM-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122193#comment-16122193 ]

Alex Filatov commented on BEAM-2684:
------------------------------------

I can reproduce the issue by commenting out Thread.sleep on line #74 or by 
adding one on line #105.

{code}
org.apache.qpid.proton.messenger.impl.MessengerImpl processAllConnectors
SEVERE: Error processing connection
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.qpid.proton.driver.impl.ConnectorImpl.process(ConnectorImpl.java:89)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.processAllConnectors(MessengerImpl.java:687)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:863)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:417)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:394)
        at org.apache.beam.sdk.io.amqp.AmqpIOTest$1.run(AmqpIOTest.java:82)
{code}

Relevant bits of thread dump:

{code:java}
"direct-runner-worker" #15 prio=5 os_prio=31 tid=0x00007fd2c8270000 nid=0x5303 runnable [0x000070000aa00000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
        at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
        at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000007404c0dd0> (a sun.nio.ch.Util$3)
        - locked <0x00000007404c0dc0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000007404c0c20> (a sun.nio.ch.KQueueSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
        at org.apache.qpid.proton.driver.impl.DriverImpl.doWait(DriverImpl.java:81)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:894)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.recv(MessengerImpl.java:446)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.recv(MessengerImpl.java:451)
        at org.apache.beam.sdk.io.amqp.AmqpIO$UnboundedAmqpReader.advance(AmqpIO.java:330)

"Thread-1" #11 prio=5 os_prio=31 tid=0x00007fd2c7a4f000 nid=0x1407 runnable [0x000070000a5f5000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
        at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
        at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x000000074000aa98> (a sun.nio.ch.Util$3)
        - locked <0x000000074000aaa8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x000000074000aa48> (a sun.nio.ch.KQueueSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
        at org.apache.qpid.proton.driver.impl.DriverImpl.doWait(DriverImpl.java:81)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:894)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:417)
        at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:394)
        at org.apache.beam.sdk.io.amqp.AmqpIOTest$1.run(AmqpIOTest.java:82)
{code}

I think the root cause is a race between the sender and the receiver. In 
peer-to-peer mode (without a broker), the receiver must be running before the 
sender starts sending messages. When that is not the case, the sender seems to 
get stuck: it suppresses the 'Connection refused' exception and then waits 
indefinitely for a connection event on an invalid socket channel.
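
One way to remove the dependency on Thread.sleep would be to have the sender 
thread wait until the receiver's port actually accepts TCP connections. The 
following is only a rough sketch of that idea, not a proposed patch: 
PortWaiter and waitForPort are hypothetical names that do not exist in Beam 
or Proton, and I am assuming that the Proton listener tolerates a plain TCP 
probe that connects and disconnects without sending any AMQP frames.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Beam or Proton.
final class PortWaiter {

    private PortWaiter() {}

    /**
     * Polls host:port until a TCP connection succeeds or the timeout elapses.
     * Returns true if the port accepted a connection, false on timeout or
     * if the calling thread was interrupted.
     */
    static boolean waitForPort(String host, int port, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (deadline - System.nanoTime() > 0) {
            try (Socket probe = new Socket()) {
                // Short connect timeout so a single attempt cannot hang.
                probe.connect(new InetSocketAddress(host, port), 200);
                return true;
            } catch (IOException e) {
                // Typically "Connection refused": the receiver is not
                // listening yet, so fall through and retry.
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

The sender thread in the test could call something like 
PortWaiter.waitForPort("localhost", port, 10000) before the first send and 
fail fast if it returns false, so a missing receiver shows up as a clear 
test failure rather than a hang.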

> AmqpIOTest is flaky
> -------------------
>
>                 Key: BEAM-2684
>                 URL: https://issues.apache.org/jira/browse/BEAM-2684
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>            Reporter: Eugene Kirpichov
>            Assignee: Jean-Baptiste Onofré
>
> This test is often timing out, and has been doing that for a while, causing 
> unrelated PRs to fail randomly. I've gotten into the habit of excluding 
> sdks/java/io/amqp when running "mvn verify" and I suppose it's not a good 
> habit :) 
> Example failure: 
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/13424/console


