To make sure this does not become a problem again, how about running the tests in their own containers using something like gradle-dockerized-test-plugin [1]? If each of our tests runs in its own container, we can address both the bind-address failures and the "state left by previous test" issue. Sure, the tests will take longer to complete, but we only do a nightly build. We could also run the tests in parallel in different containers to speed up the build.

We could also go one step further and get a clean slate on the "CI failure" issues. My main arguments for doing this are:

1. We have 190 issues that are marked as "ci" failures [2].
2. A lot of CI failures are due to state left behind by previous tests (26 are just bind exceptions [3]).

Fixing 190 test issues is definitely going to slow us down from adding features to Geode, so getting a clean slate will allow us to narrow the CI failures down to race conditions in the tests (or the product).

If we think this is a good idea, then we could check with ASF infra to see if Docker can be set up on the Jenkins slaves.

[1] https://github.com/pedjak/gradle-dockerized-test-plugin
[2] https://issues.apache.org/jira/browse/GEODE-1778?jql=project%20%3D%20GEODE%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20ci
[3] https://issues.apache.org/jira/browse/GEODE-973?jql=project%20%3D%20GEODE%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20text%20~%20%22BindException%22

On Tue, Sep 6, 2016 at 3:01 PM, Kirk Lund <kl...@pivotal.io> wrote:

> No wonder that test is intermittently failing then. I didn't think we had
> any tests with hard-coded ports. I filed GEODE-1863 and Darrel picked it
> up.
>
> -Kirk
>
> On Tue, Sep 6, 2016 at 9:30 AM, Bruce Schuchardt <bschucha...@pivotal.io>
> wrote:
>
> > This test is not using AvailablePort. There are two test cases in this
> > class that always use port 5555.
> >
> >
> > On 9/6/2016 at 8:00 AM, Anthony Baker wrote:
> >
> >> How could we fix AvailablePort so we don’t try to use in-use ports?
> >>
> >> Anthony
> >>
> >> On Sep 3, 2016, at 10:29 PM, Kirk Lund <kl...@apache.org> wrote:
> >>>
> >>> We're still hitting BindExceptions in the nightly build, so I'll go ahead
> >>> and propose this again: any test that uses AvailablePort to find a random
> >>> port could be altered to automatically Retry if it encounters and fails
> >>> because of java.net.BindException. Opinions?
> >>>
> >>> -Kirk
> >>>
> >>> :geode-core:integrationTest
> >>>
> >>> com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest > testBridgeServerRunningInSynchPersistOnlyForIOExceptionCase FAILED
> >>>     java.net.BindException: Failed to create server socket on null[5,555]
> >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:814)
> >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:774)
> >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:738)
> >>>         at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:470)
> >>>         at com.gemstone.gemfire.internal.cache.CacheServerImpl.start(CacheServerImpl.java:323)
> >>>         at com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest.testBridgeServerRunningInSynchPersistOnlyForIOExceptionCase(DiskRegionJUnitTest.java:2215)
> >>>
> >>>         Caused by:
> >>>         java.net.BindException: Address already in use
> >>>             at java.net.PlainSocketImpl.socketBind(Native Method)
> >>>             at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
> >>>             at java.net.ServerSocket.bind(ServerSocket.java:375)
> >>>             at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:811)
> >>>             ... 5 more
> >>>
> >>> com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest > testBridgeServerStoppingInSynchPersistOnlyForIOExceptionCase FAILED
> >>>     java.net.BindException: Failed to create server socket on null[5,555]
> >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:814)
> >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:774)
> >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:738)
> >>>         at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:470)
> >>>         at com.gemstone.gemfire.internal.cache.CacheServerImpl.start(CacheServerImpl.java:323)
> >>>         at com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest.testBridgeServerStoppingInSynchPersistOnlyForIOExceptionCase(DiskRegionJUnitTest.java:2103)
> >>>
> >>>         Caused by:
> >>>         java.net.BindException: Address already in use
> >>>             at java.net.PlainSocketImpl.socketBind(Native Method)
> >>>             at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
> >>>             at java.net.ServerSocket.bind(ServerSocket.java:375)
> >>>             at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:811)
> >>>             ... 5 more
> >>>
> >>> 3247 tests completed, 2 failed, 175 skipped
> >>>
> >>
> >
>
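
For reference, the retry-on-BindException idea from Kirk's Sep 3 message quoted above could look roughly like the sketch below. This is only an illustration under assumptions, not Geode code: the class name BindRetrySketch, the method bindWithRetry, and the use of an IntSupplier as a stand-in for whatever AvailablePort returns are all hypothetical. It only shows the shape of "pick a port, try to bind, pick another port if the bind fails".

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.function.IntSupplier;

// Hypothetical helper, not part of Geode.
final class BindRetrySketch {

    /**
     * Tries to bind a ServerSocket on ports handed out by {@code ports}
     * (a stand-in for a port picker such as AvailablePort). If a bind
     * fails with BindException, a fresh port is requested and the bind
     * is retried, up to {@code maxAttempts} times.
     */
    static ServerSocket bindWithRetry(IntSupplier ports, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int port = ports.getAsInt();
            try {
                return new ServerSocket(port);
            } catch (BindException e) {
                // The port was taken between selection and bind; try another.
                last = e;
            }
        }
        // Every attempt failed: surface the last BindException to the caller.
        throw last;
    }
}
```

A caller would own the returned socket and close it when done. Note that a retry only helps when each attempt asks for a new port; a test that always binds a hard-coded port such as 5555 would simply fail again on every attempt.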