Tue, 7 Dec 2021 at 16:22, Mark Thomas <ma...@apache.org>:
>
> Hi all,
>
> I've been investigating some recent build failures and it appears that
> some builds are failing because Linux is closing tasks because of memory
> pressure.
>
> My proposed solution is based on the following facts / observations:
>
> - The test node we are using has 4 cores and 16GB of RAM
>
> - The unit tests are currently configured to run with 6 threads
>
> - The test node is currently configured to run two tests concurrently
>
> - In local testing test thread count == core count gave the best
>   performance
>
> - In local testing increasing / decreasing test threads by 10% had a
>   marginal impact on test duration
>
>
> My proposed solution is therefore:
>
> - reduce test thread count from 6 to 4
>
> - investigate whether we can reduce the concurrent tests from 2 to 1

I wonder whether there is any consistency in when that happens (which
tests are being executed, or at least the time elapsed since launch).
What I mean is whether there are particular tests that require a
noticeable amount of memory.

I have encountered such a test once:
https://bz.apache.org/bugzilla/show_bug.cgi?id=65177
org.apache.tomcat.util.net.TestSsl

IIRC, a fix reduced the memory requirements for that test from 256 MB
down to 144 MB (128 + 16) of byte arrays. Though in such a case I would
expect an OutOfMemoryError in Java.

I think the Linux OOM killer can be triggered for outside reasons that
are out of our control.

Also, it looks like several builds run in parallel.

Tomcat 10.1.x
https://ci2.apache.org/#/builders/44
Worker bb2_worker2_ubuntu
A build started at 03:28 PM (visible if I hover the mouse over the
"started at" time for build 113) and was running for an hour and
8 minutes (visible if I hover over the build number).

Tomcat 10.0.x
https://ci2.apache.org/#/builders/43
Worker bb2_worker2_ubuntu
A build started at 03:46 PM and is currently running (for more than an
hour).

Tomcat 9
https://ci2.apache.org/#/builders/37
Worker bb2_worker2_ubuntu
A build started at 04:37 PM and is currently running.

Even though they did not start at the same time, it looks like they
overlap. I saw both 10.0.x and 9 being tested at the same time (10.0.x
finished a few seconds ago).

Best regards,
Konstantin Kolinko
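
For reference, the overlap check described above can be sketched in a few lines of Python. This is only an illustration, not something read from the CI system: the start times are the ones quoted above, but the end times are assumptions. The 10.1.x build is treated as having finished after 1 h 08 min, and the durations given for 10.0.x and 9 are hypothetical placeholders, since those builds were still running or had only just finished. The names `overlaps`, `overlapping_pairs` and `builds` are made up for this sketch.

```python
from datetime import datetime, timedelta
from itertools import combinations


def at(hhmm):
    """Parse a 24-hour HH:MM string into a datetime (the date part is irrelevant here)."""
    return datetime.strptime(hhmm, "%H:%M")


def overlaps(start_a, end_a, start_b, end_b):
    """Return True if the half-open intervals [start_a, end_a) and [start_b, end_b) intersect."""
    return start_a < end_b and start_b < end_a


def overlapping_pairs(builds):
    """Return the pairs of build names whose time windows intersect."""
    return [
        (name_a, name_b)
        for (name_a, (s_a, e_a)), (name_b, (s_b, e_b)) in combinations(builds.items(), 2)
        if overlaps(s_a, e_a, s_b, e_b)
    ]


# Start times are taken from the message above; every end time is an assumption.
builds = {
    # Assumed to have finished after the reported 1 h 08 min.
    "10.1.x": (at("15:28"), at("15:28") + timedelta(hours=1, minutes=8)),
    # "More than an hour": 1 h 05 min is a hypothetical value.
    "10.0.x": (at("15:46"), at("15:46") + timedelta(hours=1, minutes=5)),
    # Still running: 1 h is a hypothetical value.
    "9": (at("16:37"), at("16:37") + timedelta(hours=1)),
}

for pair in overlapping_pairs(builds):
    print("overlap:", pair)
```

Under these assumed end times, 10.0.x overlaps both of the other builds, while 10.1.x and 9 do not overlap each other. That would be consistent with the worker running at most two builds at once, but the real durations would have to be taken from the CI pages.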