I suspect many of these BindExceptions are caused by the test itself trying to use the same port twice, rather than something else on the box grabbing a port.
I do think the long-term solution is to get rid of AvailablePort. But maybe in the short term we could change it to not pick a random number, and instead always start from the same point? That way, if there are bugs in the tests, they would fail every time.

-Dan

On Wed, Sep 7, 2016 at 6:17 AM, Jens Deppe <jde...@pivotal.io> wrote:

> We're already using that plugin to run the distributedTest task. We have a
> story to also implement that for flaky tests.
>
> --Jens
>
> On Tue, Sep 6, 2016 at 9:16 PM, Sai Boorlagadda <sai.boorlaga...@gmail.com>
> wrote:
>
> > +1 for dockerized tests.
> >
> > Most of the CI failures due to state left over are not easily reproducible.
> > I prefer spending time eliminating these failures, and maybe dockerized
> > tests would be the way to go.
> >
> > Sai
> >
> > On Tue, Sep 6, 2016 at 5:44 PM, Swapnil Bawaskar <sbawas...@pivotal.io>
> > wrote:
> >
> > > To make sure this is not a problem again, how about running the tests in
> > > their own container using something like gradle-dockerized-test-plugin [1]?
> > > If each of our tests is run in its own container, we will be able to address
> > > the BindAddress as well as the "state left by previous test" issue. Sure, this
> > > will take longer to complete the tests, but we only do a nightly build. We
> > > could also run our tests in parallel in different containers to speed up
> > > our build. We could also go one step further in getting a clean slate on
> > > "CI failure" issues. My main arguments for doing this are:
> > > 1. We have 190 issues that are marked as "ci" failures [2].
> > > 2. A lot of CI failures are due to state left behind by previous
> > >    tests. (26 are just bind exceptions [3].)
> > >
> > > Fixing 190 test issues is definitely going to slow us down from adding
> > > features to Geode, so getting a clean slate will allow us to narrow down CI
> > > failures to race conditions in the tests (or product).
> > >
> > > If we think this is a good idea, then we could check with ASF infra to see
> > > if Docker can be set up on the Jenkins slaves.
> > >
> > > [1] https://github.com/pedjak/gradle-dockerized-test-plugin
> > > [2] https://issues.apache.org/jira/browse/GEODE-1778?jql=project%20%3D%20GEODE%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20ci
> > > [3] https://issues.apache.org/jira/browse/GEODE-973?jql=project%20%3D%20GEODE%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20text%20~%20%22BindException%22
> > >
> > > On Tue, Sep 6, 2016 at 3:01 PM, Kirk Lund <kl...@pivotal.io> wrote:
> > >
> > > > No wonder that test is intermittently failing then. I didn't think we had
> > > > any tests with hard-coded ports. I filed GEODE-1863 and Darrel picked it
> > > > up.
> > > >
> > > > -Kirk
> > > >
> > > > On Tue, Sep 6, 2016 at 9:30 AM, Bruce Schuchardt <bschucha...@pivotal.io>
> > > > wrote:
> > > >
> > > > > This test is not using AvailablePort. There are two test cases in this
> > > > > class that always use port 5555.
> > > > >
> > > > > On 9/6/2016 at 8:00 AM, Anthony Baker wrote:
> > > > >
> > > > >> How could we fix AvailablePort so we don’t try to use in-use ports?
> > > > >>
> > > > >> Anthony
> > > > >>
> > > > >> On Sep 3, 2016, at 10:29 PM, Kirk Lund <kl...@apache.org> wrote:
> > > > >>>
> > > > >>> We're still hitting BindExceptions in the nightly build, so I'll go ahead
> > > > >>> and propose this again: any test that uses AvailablePort to find a random
> > > > >>> port could be altered to automatically Retry if it fails because of
> > > > >>> java.net.BindException. Opinions?
> > > > >>>
> > > > >>> -Kirk
> > > > >>>
> > > > >>> :geode-core:integrationTest
> > > > >>>
> > > > >>> com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest > testBridgeServerRunningInSynchPersistOnlyForIOExceptionCase FAILED
> > > > >>>     java.net.BindException: Failed to create server socket on null[5,555]
> > > > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:814)
> > > > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:774)
> > > > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:738)
> > > > >>>         at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:470)
> > > > >>>         at com.gemstone.gemfire.internal.cache.CacheServerImpl.start(CacheServerImpl.java:323)
> > > > >>>         at com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest.testBridgeServerRunningInSynchPersistOnlyForIOExceptionCase(DiskRegionJUnitTest.java:2215)
> > > > >>>
> > > > >>>         Caused by:
> > > > >>>         java.net.BindException: Address already in use
> > > > >>>             at java.net.PlainSocketImpl.socketBind(Native Method)
> > > > >>>             at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
> > > > >>>             at java.net.ServerSocket.bind(ServerSocket.java:375)
> > > > >>>             at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:811)
> > > > >>>             ... 5 more
> > > > >>>
> > > > >>> com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest > testBridgeServerStoppingInSynchPersistOnlyForIOExceptionCase FAILED
> > > > >>>     java.net.BindException: Failed to create server socket on null[5,555]
> > > > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:814)
> > > > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:774)
> > > > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:738)
> > > > >>>         at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:470)
> > > > >>>         at com.gemstone.gemfire.internal.cache.CacheServerImpl.start(CacheServerImpl.java:323)
> > > > >>>         at com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest.testBridgeServerStoppingInSynchPersistOnlyForIOExceptionCase(DiskRegionJUnitTest.java:2103)
> > > > >>>
> > > > >>>         Caused by:
> > > > >>>         java.net.BindException: Address already in use
> > > > >>>             at java.net.PlainSocketImpl.socketBind(Native Method)
> > > > >>>             at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
> > > > >>>             at java.net.ServerSocket.bind(ServerSocket.java:375)
> > > > >>>             at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:811)
> > > > >>>             ... 5 more
> > > > >>>
> > > > >>> 3247 tests completed, 2 failed, 175 skipped
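
The short-term idea at the top of this thread (have the port picker always start from the same point instead of a random one) could be sketched roughly as below. This is only an illustration under stated assumptions: SequentialPortFinder, its method names, and the 20001-29999 range are all hypothetical and are not Geode's AvailablePort API. It uses only java.net.ServerSocket from the JDK.

```java
import java.io.IOException;
import java.net.ServerSocket;

/**
 * Hypothetical helper (NOT Geode's AvailablePort): hands out ports in a fixed,
 * repeatable order instead of picking a random starting point.
 */
final class SequentialPortFinder {
    // Assumed range, chosen for illustration only.
    static final int FIRST_PORT = 20001;
    static final int LAST_PORT = 29999;

    // Every new finder starts from the same point, so runs are repeatable.
    private int next = FIRST_PORT;

    /** Returns the next port in sequence that can be bound right now. */
    synchronized int nextAvailablePort() {
        int rangeSize = LAST_PORT - FIRST_PORT + 1;
        for (int i = 0; i < rangeSize; i++) {
            int candidate = next;
            // Advance before probing so the same port is never handed out
            // twice until the whole range has been walked.
            next = (candidate == LAST_PORT) ? FIRST_PORT : candidate + 1;
            if (canBind(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException(
            "No free port between " + FIRST_PORT + " and " + LAST_PORT);
    }

    /**
     * Probes a port by binding and immediately closing a server socket.
     * Another process can still grab the port after the probe closes, so this
     * narrows the race but does not remove it.
     */
    private static boolean canBind(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

With something like this, a test that asks for two ports gets two different ones, while a test that reuses or hard-codes a port still collides, and on an otherwise idle box it collides at the same place on every run rather than intermittently.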