Bug#1062356: adios2: flaky autopkgtest (host dependent): times out on big host

2024-02-06 Thread Drew Parsons
Source: adios2
Followup-For: Bug #1062356

The flaky test is adios2-mpi-examples. debian/tests builds and runs
the examples manually, on only 3 processes (mpirun -n 3), so the
problem can't be overload of the machine.

I'll just skip insituGlobalArraysReaderNxN_mpi.
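
As a rough illustration of that kind of skip, here is a minimal sketch of a test driver in Python. The actual debian/tests script is not shown in this thread, so the function names, the skip-list mechanism and the 600-second per-test timeout are all assumptions; only the test name and the "mpirun -n 3" invocation come from the discussion above.

```python
import subprocess

# Hypothetical skip list; only the test name itself comes from this thread.
SKIP = {"insituGlobalArraysReaderNxN_mpi"}


def select_tests(candidates, skip=SKIP):
    """Return the tests to run, preserving order and dropping skipped ones."""
    return [name for name in candidates if name not in skip]


def run_test(path, nprocs=3, timeout=600):
    """Run one MPI example under mpirun and return True on success.

    The per-test timeout (an assumed value) turns a hang into a quick
    failure instead of letting the whole run hit its global time limit.
    """
    try:
        result = subprocess.run(
            ["mpirun", "-n", str(nprocs), path],
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

With something like this, a hanging example would fail after minutes rather than hours, which might also make it easier to see which test is actually stuck.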

For reference, upstream is making some changes to make it more
reliable to run tests against the installed library,
https://github.com/ornladios/ADIOS2/pull/3906
also
https://github.com/ornladios/ADIOS2/pull/3820
I'm not certain that either change directly makes
insituGlobalArraysReaderNxN_mpi more reliable, though.



Bug#1062356: adios2: flaky autopkgtest (host dependent): times out on big host

2024-02-06 Thread Drew Parsons
Source: adios2
Followup-For: Bug #1062356

Can't be quite as simple as just the host machine.

https://ci.debian.net/data/autopkgtest/unstable/amd64/a/adios2/41403641/log.gz
completed in 9 minutes,
while
https://ci.debian.net/data/autopkgtest/unstable/amd64/a/adios2/41496866/log.gz
failed with timeout.

But that was ci-worker13 in both cases.
Maybe it's a race condition.

Might be simplest to just skip insituGlobalArraysReaderNxN_mpi,
though I can also review how many CPUs are invoked by the test.
It's usually safer not to run tests on every available CPU (all 64,
for instance, if there are that many on the machine).
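
To make the point about CPU counts concrete, here is a minimal sketch of capping the number of MPI processes rather than using every core the host offers. The function name and the default cap of 3 are assumptions for illustration, not what the adios2 tests actually do.

```python
import os


def mpi_process_count(cap=3, available=None):
    """Pick a process count that never exceeds `cap`.

    `available` defaults to the number of CPUs the host reports, so a
    64-core worker and a 2-core worker both get a small, bounded count.
    """
    if available is None:
        available = os.cpu_count() or 1
    # Never go below one process, never above the cap.
    return max(1, min(cap, available))
```

The result could then be passed to "mpirun -n", so that the test behaves the same way on small and large workers.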



Bug#1062356: adios2: flaky autopkgtest (host dependent): times out on big host

2024-01-31 Thread Paul Gevers

Source: adios2
Version: 2.9.2+dfsg1-8
Severity: serious
User: debian...@lists.debian.org
Usertags: flaky

Dear maintainer(s),

I looked at the results of the autopkgtest of your package. I noticed
that it regularly fails. The failures seem related to the host that runs
the test. ci-worker13 is a beefy machine [1], while the other amd64
workers are much more moderate [2]. The tests that time out after 2:50
hours all seem to run on ci-worker13 (so, lots of CPUs and lots of
RAM), while the other runs take less than 10 minutes (and seem to run
on one of the other hosts).


Because the unstable-to-testing migration software now blocks on
regressions in testing, flaky tests, i.e. tests that flip between
passing and failing without changes to the list of installed packages,
are causing people unrelated to your package to spend time on these
tests.

Don't hesitate to reach out if you need help and some more information
from our infrastructure.

Paul

[1] https://metal.equinix.com/product/servers/m3-large/
[2] https://aws.amazon.com/ec2/instance-types/m5/

