[jira] [Commented] (BEAM-2684) AmqpIOTest is flaky

2017-08-10 Thread Alex Filatov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122193#comment-16122193
 ] 

Alex Filatov commented on BEAM-2684:


I can reproduce the issue by commenting out Thread.sleep on line #74 or by 
adding one on line #105.

{code}
org.apache.qpid.proton.messenger.impl.MessengerImpl processAllConnectors
SEVERE: Error processing connection
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.qpid.proton.driver.impl.ConnectorImpl.process(ConnectorImpl.java:89)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.processAllConnectors(MessengerImpl.java:687)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:863)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:417)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:394)
at org.apache.beam.sdk.io.amqp.AmqpIOTest$1.run(AmqpIOTest.java:82)
{code}

Relevant bits of thread dump:

{code:java}
"direct-runner-worker" #15 prio=5 os_prio=31 tid=0x7fd2c827 nid=0x5303 runnable [0x7aa0]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x0007404c0dd0> (a sun.nio.ch.Util$3)
- locked <0x0007404c0dc0> (a java.util.Collections$UnmodifiableSet)
- locked <0x0007404c0c20> (a sun.nio.ch.KQueueSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
at org.apache.qpid.proton.driver.impl.DriverImpl.doWait(DriverImpl.java:81)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:894)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.recv(MessengerImpl.java:446)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.recv(MessengerImpl.java:451)
at org.apache.beam.sdk.io.amqp.AmqpIO$UnboundedAmqpReader.advance(AmqpIO.java:330)

"Thread-1" #11 prio=5 os_prio=31 tid=0x7fd2c7a4f000 nid=0x1407 runnable [0x7a5f5000]
   java.lang.Thread.State: RUNNABLE
at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00074000aa98> (a sun.nio.ch.Util$3)
- locked <0x00074000aaa8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00074000aa48> (a sun.nio.ch.KQueueSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
at org.apache.qpid.proton.driver.impl.DriverImpl.doWait(DriverImpl.java:81)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:894)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.waitUntil(MessengerImpl.java:844)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:417)
at org.apache.qpid.proton.messenger.impl.MessengerImpl.send(MessengerImpl.java:394)
at org.apache.beam.sdk.io.amqp.AmqpIOTest$1.run(AmqpIOTest.java:82)
{code}

I think the root cause is a race between the sender and the receiver. In 
peer-to-peer mode (without a broker), the receiver must be running before the 
sender starts sending messages. When that is not the case, the sender seems to 
get stuck: it suppresses the 'Connection refused' exception and then waits 
indefinitely for a connection event on an invalid socket channel.
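
A minimal sketch of one way a test could avoid this race, assuming the 
receiver listens on a known TCP port: the sender thread polls until a plain 
TCP connection succeeds, and only then starts sending. The class and method 
names (ReceiverReadiness, waitForListener) are hypothetical and are not part of 
Beam or Proton. The probe itself opens and closes a connection, so this assumes 
the receiver tolerates a connection that sends nothing.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeUnit;

/** Hypothetical test helper; not part of Beam or Proton. */
final class ReceiverReadiness {

    private ReceiverReadiness() {}

    /**
     * Polls host:port until a TCP connection succeeds or the timeout elapses.
     *
     * @return true if something accepted a connection before the deadline,
     *         false if every attempt failed.
     */
    static boolean waitForListener(String host, int port, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() - deadline < 0) {
            try (Socket probe = new Socket()) {
                // Short per-attempt timeout so one slow attempt cannot eat the whole budget.
                probe.connect(new InetSocketAddress(host, port), 200);
                return true;
            } catch (IOException notListeningYet) {
                // Typically "Connection refused": the receiver is not up yet, so retry.
                Thread.sleep(50);
            }
        }
        return false;
    }
}
```

In the sender thread, a call such as waitForListener("localhost", port, 10000) 
placed before the first send would replace the fixed Thread.sleep, and a false 
result can fail the test immediately instead of letting it hang.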

> AmqpIOTest is flaky
> ---
>
> Key: BEAM-2684
> URL: https://issues.apache.org/jira/browse/BEAM-2684
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-extensions
> Reporter: Eugene Kirpichov
> Assignee: Jean-Baptiste Onofré
>
> This test is often timing out, and has been doing that for a while, causing 
> unrelated PRs to fail randomly. I've gotten into the habit of excluding 
> sdks/java/io/amqp when running "mvn verify" and I suppose it's not a good 
> habit :) 
> Example failure: 
> 

[jira] [Created] (BEAM-2544) AvroIOTest is flaky

2017-06-29 Thread Alex Filatov (JIRA)
Alex Filatov created BEAM-2544:
--

 Summary: AvroIOTest is flaky
 Key: BEAM-2544
 URL: https://issues.apache.org/jira/browse/BEAM-2544
 Project: Beam
 Issue Type: Bug
 Components: sdk-java-core
 Reporter: Alex Filatov
 Assignee: Davor Bonaci
 Priority: Minor


"Write then read" tests randomly fail.

Steps to reproduce:

cd /runners/direct-java
mvn clean compile
mvn surefire:test@validates-runner-tests -Dtest=AvroIOTest

Repeat the last step until it fails (on my machine the failure rate is roughly 1 in 3).

Example:

[ERROR] testAvroIOWriteAndReadSchemaUpgrade(org.apache.beam.sdk.io.AvroIOTest)  Time elapsed: 0.198 s  <<< ERROR!
java.lang.RuntimeException: java.io.FileNotFoundException: /var/folders/1c/sl733g5s1g7_4mq61_qmbjx4gn/T/junit3332447750239941326/output.avro (No such file or directory)
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:340)
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:201)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:283)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:340)
at org.apache.beam.sdk.io.AvroIOTest.testAvroIOWriteAndReadSchemaUpgrade(AvroIOTest.java:275)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:321)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:321)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:157)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:143)
Caused by: java.io.FileNotFoundException: