Hi All,
On 01/01/2026 14:52, Samuel Thibault wrote:
> Michael Kelly, on Thu, 01 Jan 2026 14:48:19 +0000, wrote:
>>> It's hard to say. The 4 buildds keep building packages all day long, and
>>> I notice such "stray" errors on one of them like every one day or two.
>> That's possibly as rare as the stress-ng bit errors then given that my machine
>> is almost certainly slower than those supporting the buildds.
> It may then be simpler to just reproduce it with stress-ng, since then
> you'd know exactly what it was doing, while package installation etc. is
> a mess of things that happen :)
> Samuel
I'm posting an update on this investigation because, like others, I'm
likely to have less time to look into it from tomorrow.
I have managed to adjust the stress-ng parameters so that the likelihood
of 'bit error' reports is close to 100%. A test like the following fails
for me almost every time, both on a 4GB hurd-amd64 virtual machine and
on 4GB real hardware:
# stress-ng -t 20s --metrics --vm 64 --vm-bytes 1800M --vm-method incdec
With errors like:
stress-ng: fail: [3947] vm: detected 141733920769 bit errors while
stressing memory
stress-ng: fail: [3984] vm: detected 2 bit errors while stressing memory
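For anyone else trying to reproduce this, a small wrapper along these
lines (my sketch, not part of the original runs; adjust --vm-bytes to
suit your RAM) counts how many iterations report bit errors:

```shell
#!/bin/sh
# Run the failing stress-ng test several times and count the runs
# whose output mentions bit errors.
runs=5
fails=0
for i in $(seq "$runs"); do
    if stress-ng -t 20s --metrics --vm 64 --vm-bytes 1800M \
                 --vm-method incdec 2>&1 | grep -q 'bit errors'; then
        fails=$((fails + 1))
    fi
done
echo "$fails of $runs runs reported bit errors"
```

On the affected hurd-amd64 setups described above I would expect the
count to be at or near $runs; on hurd-i386 it should stay at 0.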
That's the good news. The bad news is that I suspect the cause is
related to the handling of the signals used to terminate the stress-ng
worker (the oomable child). The first error reported above has a value
which is nonsense given the size of the memory region being worked on. I
added some debugging to the stress-ng code and saw some extraordinary
things which made no sense at all, with stack variables seemingly
changing 'randomly'. It seems suspicious to me that these things only
start occurring after the first signal is delivered to the process. This
all needs a thorough investigation when time permits.
In any case, the same failure does not present when running on
hurd-i386: there the test completes perfectly over many tens of
iterations. This suggests that the stress-ng bit errors are not related
to the buildd issues. I've had no luck recreating that issue, but will
return to it when time permits.
Regards,
Mike.