On Tue, Aug 29, 2023 at 07:05:05PM +0000, Holger Levsen wrote:
> fwiw, at 18:48:18 UTC jenkins.d.n had a load of 3.79 (with 23 cores),
> then mosh lost connection for 12min, then at 19:00:03 the load was 155.37,
> then at 19:01:09 the load was 60.10, a min later the load was 25.
>
> Last time we looked it was the diffoscope process causing this load,
> though I'm now surprised how something built this quickly can cause diffoscope
> to do this... so maybe another red herring...
from my local logs:

2023-04-29 08:46 UTC, jenkins, powercycle, no ping
2023-05-08 18:54 UTC, jenkins, powercycle, no ping
2023-07-22 22:40 UTC, jenkins, powercycle, no ping
2023-07-23 12:48 UTC, jenkins, powercycle, no ping
2023-07-23 16:12 UTC, jenkins, powercycle, no ping
2023-07-24 11:28 UTC, jenkins, powercycle, no ping
2023-07-25 15:46 UTC, jenkins, powercycle, no ping
2023-07-26 15:15 UTC, jenkins, powercycle, no ping
2023-07-28 12:20 UTC, jenkins, powercycle, no ping
2023-07-29 00:28 UTC, jenkins, powercycle, no ping
2023-07-29 08:35 UTC, jenkins, powercycle, no ping
2023-08-07 09:05 UTC, jenkins, powercycle, no ping
2023-08-11 16:54 UTC, jenkins, powercycle, no ping
2023-08-27 10:02 UTC, jenkins, powercycle, no ping
2023-08-29 07:38 UTC, jenkins, powercycle, no ping
2023-08-29 08:40 UTC, jenkins, powercycle, no ping
2023-08-29 23:08 UTC, jenkins, powercycle, no ping
2023-08-30 22:53 UTC, jenkins, powercycle, no ping
2023-08-31 08:39 UTC, jenkins, powercycle, no ping

(what's not visible here is the cleanup work required after each of these
useless powercycles...)

so I've disabled the Debian r-b CI builds again, to see if this makes this
issue (the machine is so loaded it doesn't even respond to pings anymore)
go away. Sadly, it's rather hard to see if this "helps", so it will be some
days until I reenable them.

what has changed in July is that this host was upgraded to bookworm. what
also has changed is that diffoscope was upgraded (constantly to the sid
version), though I don't see any relevant changes in the changelog.

Investigating this is also really difficult, as you might imagine. help much
welcome. So far we could see that the load is getting really high and that
it's probably the diffoscope process, not java.

one idea would be to run diffoscope from stable, though as this is both less
than ideal and means some work to implement, I've refrained from trying
this so far. I guess this will be my next step^wpoke.

help much welcome.
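fwiw, to make the clustering since July easier to see, the log above can be tallied per month. This is only a rough sketch in Python: the function name powercycles_per_month and the LOG variable are made up for illustration, and it assumes every entry has the same four comma-separated fields as the lines above.

```python
from collections import Counter

# The powercycle log from this mail, pasted verbatim.
LOG = """\
2023-04-29 08:46 UTC, jenkins, powercycle, no ping
2023-05-08 18:54 UTC, jenkins, powercycle, no ping
2023-07-22 22:40 UTC, jenkins, powercycle, no ping
2023-07-23 12:48 UTC, jenkins, powercycle, no ping
2023-07-23 16:12 UTC, jenkins, powercycle, no ping
2023-07-24 11:28 UTC, jenkins, powercycle, no ping
2023-07-25 15:46 UTC, jenkins, powercycle, no ping
2023-07-26 15:15 UTC, jenkins, powercycle, no ping
2023-07-28 12:20 UTC, jenkins, powercycle, no ping
2023-07-29 00:28 UTC, jenkins, powercycle, no ping
2023-07-29 08:35 UTC, jenkins, powercycle, no ping
2023-08-07 09:05 UTC, jenkins, powercycle, no ping
2023-08-11 16:54 UTC, jenkins, powercycle, no ping
2023-08-27 10:02 UTC, jenkins, powercycle, no ping
2023-08-29 07:38 UTC, jenkins, powercycle, no ping
2023-08-29 08:40 UTC, jenkins, powercycle, no ping
2023-08-29 23:08 UTC, jenkins, powercycle, no ping
2023-08-30 22:53 UTC, jenkins, powercycle, no ping
2023-08-31 08:39 UTC, jenkins, powercycle, no ping
"""


def powercycles_per_month(log_text):
    """Count powercycle entries per YYYY-MM (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumed format: "<timestamp> UTC, <host>, <action>, <reason>"
        timestamp, host, action, reason = [f.strip() for f in line.split(",")]
        if action == "powercycle":
            counts[timestamp[:7]] += 1  # first 7 chars are YYYY-MM
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for month, n in powercycles_per_month(LOG).items():
        print(month, n)
```

That gives 1 powercycle each in April and May, 9 in July and 8 in August, which fits the timing of the bookworm upgrade but of course doesn't prove anything.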
-- 
cheers,
        Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

20230709: Today was the warmest day on earth in 125,000 years. Today was also
the day with the most planes in the air at one time ever in history. By the
time you read this both of these records have probably been broken.
_______________________________________________
Reproducible-builds mailing list
Reproducible-builds@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/reproducible-builds