Diego, Thanks for sharing your fix. I think it is the best way to go.
Would you like to try to implement it in the Zookeeper code base? Enrico Il Lun 8 Feb 2021, 12:35 Diego Salvi - Diennea <diego.sa...@diennea.com> ha scritto: > Hi Stig, > > > > Yes you can receive from maven the fork instance and the maximum fork > count and then using such values to compute custom ports to run parallel > tests. > > The boring part is to use configure every test to use computed ports and > not default one. > > > > You have to configure surefire to setup some system properties: > > > > <properties> > > <forkCount>4</forkCount> > > </properties> > > > > <plugins> > > <plugin> > > <groupId>org.apache.maven.plugins</groupId> > > <artifactId>maven-surefire-plugin</artifactId> > > <version>2.21.0</version> > > <configuration> > > <forkCount>${forkCount}</forkCount> > > <runOrder>balanced</runOrder> > > <systemPropertyVariables> > > <!-- Due to a surefire bug we cannot get directly the fork number > but we need a prefix (“current:”) or it will be null --> > > <my.fork.num>current:${surefire.forkNumber}</my.fork.num> > > <my.fork.max>${forkCount}</my.fork.max> > > </systemPropertyVariables> > > </configuration> > > </plugin> > > </plugins> > > > > Then you can create a base test class o a test rule to read such system > properties and evaluate a custom port when requested. > > I gone for the base class but the test rule is possible too: > > > > public abstract class BaseTest { > > > > public static final String NUM_PROPERTY_PREFIX = "current:"; > > > > /** > > * Fork "position" in [0,len) > > */ > > protected final int pos; > > > > /** > > * Max fork count > > */ > > protected final int len; > > > > public BaseTestUtils() { > > System.out.println("[TESTENV] my.fork.num: " + > System.getProperty("my.fork.num")); > > System.out.println("[TESTENV] my.fork.max: " + > System.getProperty("my.fork.max")); > > > > final String num = System.getProperty("my.fork.num", > NUM_PROPERTY_PREFIX + "1"); > > > > pos = Integer.valueOf(num.substring(NUM_PROPERTY_PREFIX.length(), > num.length())) - 1; > > len = Integer.getInteger("my.fork.max", 1); > > > > } > > > > public int evaluatePort(int base) { > > return evaluatePort(pos, len, base); > > } > > > > private static int evaluatePort(int pos, int len, int base) { > > > > /* max: 65535, we use just first 16 bits */ > > > > if (len <= pos || len <= 0 || pos < 0) { > > throw new IllegalArgumentException("Len or pos invalid"); > > } > > > > int max = len - 1; > > int prefixBitsNeeded = Integer.SIZE - > Integer.numberOfLeadingZeros(max); > > > > int bitsRemaining = 16 - prefixBitsNeeded; > > > > int suffixBitNeeded = Integer.SIZE - > Integer.numberOfLeadingZeros(base); > > > > if (suffixBitNeeded > bitsRemaining) { > > throw new IllegalArgumentException("Base too large: pos " + > pos + " len " + len + " base " + base > > + "; needed bits " + suffixBitNeeded + " remaining " + > bitsRemaining); > > } > > > > int port = (pos << bitsRemaining) + base; > > > > return port; > > } > > > > } > > > > In EmailSuccess we use a tricky “shift” logic to compute a new port but > smarter or simpler ones could be written too (something like port + pos * > 5000). > > > > Diego Salvi > > > > > > *Da: *Enrico Olivelli <eolive...@gmail.com> > *Data: *lunedì 8 febbraio 2021 11:49 > *A: *DevZooKeeper <dev@zookeeper.apache.org> > *Cc: *Diego Salvi - Diennea <diego.sa...@diennea.com> > *Oggetto: *Re: Java 11 tests are very flaky on GitHub Actions > > > > Stig, > > thanks for your information. > > Unfortunately this is not doable in our test suite, because we have to > create server side configuration for an ensemble of servers before starting > them. > > I am afraid that in order to implement a safe port-assignment we will have > to instrument all of the code that is opening server side sockets in order > to use mocks or pre-created sockets. > But this is not easy because we sometimes have tests that close and reopen > the ServerSocket, but doing so we will "lose" the port. > > I saw in other projects (non OSS, so I cannot send pointers) that you can > receive from the Maven environment the if of the "forked JVM" and then you > can compute all of the ports used by the tests from that value. > For instance if you start 4 parallel tests you will have that JVM id > spanning from 0 to 3 and you receive that value as a System Property. > Then the test can use that I in order to compute static ports, like JVMID > * 1000 + X . > > I am cc'ing one of my former colleagues at EmailSuccess.com, Diego, that > probably can share his experience in EmailSuccess > > Enrico > > > > Il giorno sab 6 feb 2021 alle ore 16:32 Stig Rohde Døssing < > s...@apache.org> ha scritto: > > I'm not sure if this is easy to solve for Zookeeper, but going to make a > suggestion here I've seen work on other projects. > > The "find a free port" code is inherently racy. I've seen that solution > used in many other projects, and it never works that well. > > A much safer way to do it is to pass port 0 directly to the server you're > trying to start. As you know from the port reservation code, this will > cause the OS to use a free port. The port selected by the OS can be > retrieved by the test by calling e.g. > > https://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html#getLocalPort() > . > > > You go from this flow > Open ServerSocket on 0 > Get port via ServerSocket.getLocalPort > Close socket <- This is the dangerous bit, as after this, the port is again > up for grabs for other tests to acquire > Pass chosen port to server > Server opens ServerSocket on port > > To > Pass 0 to server > Server opens ServerSocket on 0, getting a random port > Test calls some method on the server that ends up in a call to > ServerSocket.getLocalPort > > This won't work for cases where you're testing e.g. the client starting > before the server, and connecting once the server has booted, but for the > common use case of starting a server, and then starting a client connected > to that server, this should work. > > On 2021/02/05 15:58:00, Christopher <c...@apache.org> wrote: > > FWIW, the maven-surefire-plugin should be able to retry temporarily> > > failing tests, but it isn't working with JUnit5 until the plugin is> > > updated to a newer version. However, the newer version of the plugin> > > didn't work with the versions of JUnit that ZK is using. When I tried> > > to update maven-surefire-plugin, things got worse due to JUnit4/JUnit5> > > stuff that I don't understand.> > > > > I have created a PR to trigger the tests to run on JDK8 instead of> > > JDK11, to demonstrate that the tests are flaky there (or to prove me> > > wrong), but it doesn't need to be merged, as it's just a test:> > > https://github.com/apache/zookeeper/pull/1595> > > > > On Fri, Feb 5, 2021 at 10:48 AM Christopher <ct...@apache.org> wrote:> > > >> > > > These tests are flaky on JDK8, too, when I tried. It's my> > > > understanding that's why they were not being run on Travis previously> > > > (still aren't). Most of the tests that I see failing are due to the> > > > "Address already in use" bind error. This may be more likely in a> > > > virtualized environment like GitHub Actions (vs. non-vritualized> > > > Jenkins), but I don't think it is unique to that environment... just> > > > maybe more likely for whatever reason.> > > >> > > > I have spent quite a bit of time looking into this, and I think the> > > > main cause is that the port reservation stuff isn't working the way it> > > > should. It tries to bind to a port, and then it closes the> > > > ServerSocket that was used to find an available port. What it should> > > > do instead is return the ServerSocket itself for use in the calling> > > > code, after it has successfully bound, rather than returning an> > > > integer. But, that might be a big change, and there might be a simpler> > > > fix.> > > >> > > > On Fri, Feb 5, 2021 at 10:25 AM Enrico Olivelli <eo...@gmail.com> > wrote:> > > > >> > > > > Hi,> > > > > I see that the new test workflow with Java 11 is very flaky.> > > > >> > > > > This is an example> > > > > > > https://github.com/apache/zookeeper/pull/1592/checks?check_run_id=1830428694 > > > > > > >> > > > > I would like to not consider it blocker for merging pull requests.> > > > >> > > > > That said, we should investigate further, the tests that are failing > were> > > > > not flaky on JDK8> > > > >> > > > > Thoughts ?> > > > > Enrico> > > > > > ------------------------------ > > CONFIDENTIALITY & PRIVACY NOTICE > This e-mail (including any attachments) is strictly confidential and may > also contain privileged information. If you are not the intended recipient > you are not authorised to read, print, save, process or disclose this > message. If you have received this message by mistake, please inform the > sender immediately and destroy this e-mail, its attachments and any copies. > Any use, distribution, reproduction or disclosure by any person other than > the intended recipient is strictly prohibited and the person responsible > may incur in penalties. > The use of this e-mail is only for professional purposes; there is no > guarantee that the correspondence towards this e-mail will be read only by > the recipient, because, under certain circumstances, there may be a need to > access this email by third subjects belonging to the Company. >