On Thu, Sep 26, 2024 at 11:50 AM Rémy Maucherat <r...@apache.org> wrote:
>
> On Thu, Sep 26, 2024 at 11:18 AM <build...@apache.org> wrote:
> >
> > Build status: BUILD FAILED: failed compile (failure)
> > Worker used: bb_worker2_ubuntu
> > URL: https://ci2.apache.org/#builders/120/builds/81
> > Blamelist: remm <r...@apache.org>
> > Build Text: failed compile (failure)
> > Status Detected: new failure
> > Build Source Stamp: [branch main] 36c226f2141573a841b6cd4d7ad8e8b582080510
>
> Certificate reloading, plus the use of the FFM version of the
> implementation in parallel, causes a reliable crash on shutdown for
> tomcat-native/JNI. I haven't been able to find out yet what is wrong
> (certainly this kind of use is inappropriate, you would use either
> tomcat-native or FFM, not both).

The crash scenario seems to be like this:
- The OpenSSL/JNI test runs, and everything is fine.
- During the test, a certificate reload is done, leaving a dangling
  SSL_CTX from OpenSSL, associated with an unreferenced OpenSSL context
  object. SSL_CTX_free is run through a cleaner whenever GC collects
  that object.
- Library.terminate() is called, since Tomcat is destroyed at the end of
  the test.
- GC is independent of that and the JVM is still running, so the
  OpenSSL context object is still there. No crash (yet), but the JVM
  crash is now unavoidable.
- *In the same JVM*, the OpenSSL/FFM test now starts.
- The test runs fine and does its own certificate reloading, leaving
  another unreferenced OpenSSL context object.
- The JVM sees some objects to GC, and GC finally runs in parallel with
  everything (it would not be fun otherwise).
- GC cleans up the OpenSSL context from the OpenSSL/JNI test run.
  However, since APR has been released in terminateAPR, this crashes the
  JVM in some random location (OpenSSL, APR, etc.), but always while
  running the org.apache.tomcat.jni.SSLContext.free method.

It seems to be fixed if I add a System.gc() in
AprLifecycleListener.terminateAPR(), which would cause the unreferenced
OpenSSL contexts to go through their SSL_CTX_free immediately. Fingers
crossed ... This seems to be an acceptable compromise to be able to run
the testsuite efficiently.

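To make the ordering problem in the scenario above concrete, here is a minimal sketch of it in plain Java. It is not Tomcat code: FakeLibrary, FakeContext, FreeAction and runScenario are hypothetical names, and the "native library" is only a boolean flag. It also calls Cleanable.clean() explicitly instead of waiting for GC, purely so that the order of events is deterministic; in the real case the cleaner runs whenever GC gets around to it, and System.gc() is only a request to the JVM.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical model of the ordering issue; none of these types exist in Tomcat. */
final class CleanerOrderingSketch {

    /** Stands in for the native library lifecycle (assumption: a simple flag). */
    static final class FakeLibrary {
        private final AtomicBoolean initialized = new AtomicBoolean(true);

        boolean isInitialized() {
            return initialized.get();
        }

        void terminate() {
            initialized.set(false);
        }
    }

    /** Cleanup action; it must not hold a reference to the owning context. */
    static final class FreeAction implements Runnable {
        private final FakeLibrary library;
        private final AtomicBoolean freedSafely;

        FreeAction(FakeLibrary library, AtomicBoolean freedSafely) {
            this.library = library;
            this.freedSafely = freedSafely;
        }

        @Override
        public void run() {
            // A real native free would crash here if the library were already gone;
            // this model only records whether the library was still initialized.
            freedSafely.set(library.isInitialized());
        }
    }

    /** Stands in for a context object whose native side is released by a cleaner. */
    static final class FakeContext {
        private final Cleaner.Cleanable cleanable;

        FakeContext(Cleaner cleaner, FakeLibrary library, AtomicBoolean freedSafely) {
            this.cleanable = cleaner.register(this, new FreeAction(library, freedSafely));
        }

        /** Runs the cleanup action now, on the calling thread, at most once. */
        void free() {
            cleanable.clean();
        }
    }

    /** Returns true if the cleanup ran while the library was still initialized. */
    static boolean runScenario(boolean freeBeforeTerminate) {
        Cleaner cleaner = Cleaner.create();
        FakeLibrary library = new FakeLibrary();
        AtomicBoolean freedSafely = new AtomicBoolean(false);
        FakeContext context = new FakeContext(cleaner, library, freedSafely);

        if (freeBeforeTerminate) {
            context.free();       // cleanup first, then library shutdown
            library.terminate();
        } else {
            library.terminate();  // library shutdown first, cleanup arrives late
            context.free();
        }
        return freedSafely.get();
    }

    public static void main(String[] args) {
        System.out.println("free before terminate, safe: " + runScenario(true));
        System.out.println("terminate before free, safe: " + runScenario(false));
    }
}
```

Running main prints true for the first ordering and false for the second. The second ordering is the one the test suite hits when GC only collects the old context after the library has been terminated.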
Rémy

> Rémy
>
> > Steps:
> >
> > worker_preparation: 0
> > git: 0
> > shell: 0
> > shell_1: 0
> > shell_2: 0
> > shell_3: 0
> > shell_4: 0
> > shell_5: 0
> > shell_6: 0
> > compile: 1
> > shell_7: 0
> > shell_8: 0
> > shell_9: 0
> > shell_10: 0
> > Rsync docs to nightlies.apache.org: 0
> > shell_11: 0
> > Rsync RAT to nightlies.apache.org: 0
> > compile_1: 2
> > shell_12: 0
> > Rsync Logs to nightlies.apache.org: 0
> >
> > -- ASF Buildbot
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> > For additional commands, e-mail: dev-h...@tomcat.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org