+1 for dockerized tests. Most of the CI failures caused by leftover state are not easily reproducible. I would rather spend time eliminating these failures, and maybe dockerized tests are the way to go.

Sai

On Tue, Sep 6, 2016 at 5:44 PM, Swapnil Bawaskar <sbawas...@pivotal.io> wrote:

> To make sure this is not a problem again, how about running the tests in
> their own container using something like gradle-dockerized-test-plugin[1]?
> If each of our tests is run in its own container, we will be able to address
> the BindAddress as well as "state left by previous test" issue. Sure, this
> will take longer to complete the tests, but we only do a nightly build. We
> could also run our tests in parallel in different containers to speed up
> our build. We could also go one step further in getting a clean slate on
> "CI failure" issues. My main arguments for doing this are:
> 1. We have 190 issues that are marked as "ci" failures [2].
> 2. A lot of CI failures are due to state left behind by previous
>    tests. (26 are just bind exceptions [3])
>
> Fixing 190 test issues is definitely going to slow us down from adding
> features to Geode, so getting a clean slate will allow us to narrow down CI
> failures to race conditions in tests (or product).
>
> If we think this is a good idea, then we could check with ASF infra to see
> if docker can be set up on jenkins slaves.
>
> [1] https://github.com/pedjak/gradle-dockerized-test-plugin
> [2] https://issues.apache.org/jira/browse/GEODE-1778?jql=project%20%3D%20GEODE%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20ci
> [3] https://issues.apache.org/jira/browse/GEODE-973?jql=project%20%3D%20GEODE%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20text%20~%20%22BindException%22
>
> On Tue, Sep 6, 2016 at 3:01 PM, Kirk Lund <kl...@pivotal.io> wrote:
>
> > No wonder that test is intermittently failing then. I didn't think we had
> > any tests with hard-coded ports. I filed GEODE-1863 and Darrel picked it
> > up.
> >
> > -Kirk
> >
> > On Tue, Sep 6, 2016 at 9:30 AM, Bruce Schuchardt <bschucha...@pivotal.io> wrote:
> >
> > > This test is not using AvailablePort. There are two test cases in this
> > > class that always use port 5555.
> > >
> > > On 9/6/2016 at 8:00 AM, Anthony Baker wrote:
> > >
> > >> How could we fix AvailablePort so we don't try to use in-use ports?
> > >>
> > >> Anthony
> > >>
> > >> On Sep 3, 2016, at 10:29 PM, Kirk Lund <kl...@apache.org> wrote:
> > >>>
> > >>> We're still hitting BindExceptions in the nightly build, so I'll go ahead
> > >>> and propose this again: any test that uses AvailablePort to find a random
> > >>> port could be altered to automatically Retry if it encounters and fails
> > >>> because of java.net.BindException. Opinions?
> > >>>
> > >>> -Kirk
> > >>>
> > >>> :geode-core:integrationTest
> > >>>
> > >>> com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest > testBridgeServerRunningInSynchPersistOnlyForIOExceptionCase FAILED
> > >>>     java.net.BindException: Failed to create server socket on null[5,555]
> > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:814)
> > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:774)
> > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:738)
> > >>>         at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:470)
> > >>>         at com.gemstone.gemfire.internal.cache.CacheServerImpl.start(CacheServerImpl.java:323)
> > >>>         at com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest.testBridgeServerRunningInSynchPersistOnlyForIOExceptionCase(DiskRegionJUnitTest.java:2215)
> > >>>
> > >>>         Caused by:
> > >>>         java.net.BindException: Address already in use
> > >>>             at java.net.PlainSocketImpl.socketBind(Native Method)
> > >>>             at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
> > >>>             at java.net.ServerSocket.bind(ServerSocket.java:375)
> > >>>             at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:811)
> > >>>             ... 5 more
> > >>>
> > >>> com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest > testBridgeServerStoppingInSynchPersistOnlyForIOExceptionCase FAILED
> > >>>     java.net.BindException: Failed to create server socket on null[5,555]
> > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:814)
> > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:774)
> > >>>         at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:738)
> > >>>         at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:470)
> > >>>         at com.gemstone.gemfire.internal.cache.CacheServerImpl.start(CacheServerImpl.java:323)
> > >>>         at com.gemstone.gemfire.internal.cache.DiskRegionJUnitTest.testBridgeServerStoppingInSynchPersistOnlyForIOExceptionCase(DiskRegionJUnitTest.java:2103)
> > >>>
> > >>>         Caused by:
> > >>>         java.net.BindException: Address already in use
> > >>>             at java.net.PlainSocketImpl.socketBind(Native Method)
> > >>>             at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
> > >>>             at java.net.ServerSocket.bind(ServerSocket.java:375)
> > >>>             at com.gemstone.gemfire.internal.SocketCreator.createServerSocket(SocketCreator.java:811)
> > >>>             ... 5 more
> > >>>
> > >>> 3247 tests completed, 2 failed, 175 skipped
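
For reference, the retry idea from the bottom of the thread (try another port when a bind fails with java.net.BindException) could look roughly like the sketch below. This is only an illustration using plain java.net classes. It does not use or reflect Geode's AvailablePort API, and the names PortRetrySketch, bindWithRetry and candidatePorts are hypothetical. Passing port 0 asks the operating system for a free ephemeral port, which is one way to avoid a hard-coded port such as 5555.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.util.function.IntSupplier;

// Hypothetical helper, not part of Geode: retries a bind on a new candidate port
// whenever the previous candidate is already in use.
public class PortRetrySketch {

    /**
     * Tries up to maxAttempts candidate ports and returns the first socket that binds.
     * A candidate of 0 lets the OS pick a free ephemeral port.
     */
    static ServerSocket bindWithRetry(IntSupplier candidatePorts, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int port = candidatePorts.getAsInt();
            ServerSocket socket = new ServerSocket(); // created unbound
            try {
                socket.bind(new InetSocketAddress(port));
                return socket; // caller owns the socket and must close it
            } catch (BindException e) {
                socket.close(); // port was taken: release and try the next candidate
                last = e;
            } catch (IOException e) {
                socket.close(); // unrelated failure: do not retry
                throw e;
            }
        }
        throw last; // every candidate was in use
    }

    public static void main(String[] args) throws IOException {
        // Always ask the OS for a free port; a single attempt is normally enough.
        try (ServerSocket server = bindWithRetry(() -> 0, 3)) {
            System.out.println("bound to port " + server.getLocalPort());
        }
    }
}
```

A retry only hides the race between finding a free port and binding it. It would not help the two DiskRegionJUnitTest cases above while they keep using a fixed port, since every retry would hit the same port.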