Hi Paul,

thanks for getting back to me so quickly.

I'm adding Nilesh and Shengjing, with whom I have been discussing this issue
in a separate, private email conversation.

On Wed, Sep 29, 2021 at 2:45 PM Paul Gevers <elb...@debian.org> wrote:

> Hi Reinhard,
>
> On 29-09-2021 02:56, Reinhard Tartler wrote:
> > Please unblock package golang-github-klauspost-compress
> >
> > The problem is with autopkgtest, on armel and armhf, the test
> > machines frequently run out of memory when executing the extensive
> > test suite.
>
> As our armhf worker has the most memory of all our workers (250GB) and
> most cores (160), that would hint at an very bad bug on armhf/armel, or
> bad parameters for e.g. xz (--threads=0 recently showed up as behaving
> badly with 160 cores).
>

So, after some cleanups in the package and a backported upstream patch that
increases the test timeout from 2 to 4 minutes (which consistently fixes the
failures on i386), we now have two runs in a row where the armhf builder ran
out of memory (OOM):

https://ci.debian.net/data/autopkgtest/testing/armhf/g/golang-github-klauspost-compress/15714893/log.gz
https://ci.debian.net/data/autopkgtest/testing/armhf/g/golang-github-klauspost-compress/15733957/log.gz

Note that the test does not use xz but zstd compression, cf.
https://github.com/klauspost/compress/blob/master/zstd/README.md
so I don't think your comment regarding --threads=0 applies here. I'm not
sure, though, so please let me know what you think.

The consistent OOM is surprising given that you state that the worker has
250GB of RAM. Looking at the logs, I note that the dh-golang helper passes
the option -p 160 to the tests, so up to 160 test executables are built and
run concurrently. That confirms to me that we are indeed running on these
250GB/160-core workers.

Is it possible that the armhf workers set up ulimits that limit the amount
of memory the test may allocate?
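
One way to check this would be to run a small program inside the test
environment that prints the memory-related resource limits. Below is a
minimal sketch, assuming a Linux worker. The helper name describe and the
choice of RLIMIT_AS and RLIMIT_DATA are only illustrative, and other limits
(or cgroup settings, which this does not inspect) may turn out to be the
relevant ones:

```go
package main

import (
	"fmt"
	"syscall"
)

// unlimited is the value Linux reports for "no limit" (RLIM_INFINITY as uint64).
const unlimited = ^uint64(0)

// describe renders a limit value, mapping the "no limit" marker to text.
func describe(v uint64) string {
	if v == unlimited {
		return "unlimited"
	}
	return fmt.Sprintf("%d bytes", v)
}

func main() {
	// Limits chosen for illustration; both can cap how much memory a process gets.
	limits := []struct {
		name     string
		resource int
	}{
		{"RLIMIT_AS (address space)", syscall.RLIMIT_AS},
		{"RLIMIT_DATA (data segment)", syscall.RLIMIT_DATA},
	}
	for _, l := range limits {
		var r syscall.Rlimit
		if err := syscall.Getrlimit(l.resource, &r); err != nil {
			fmt.Printf("%s: error: %v\n", l.name, err)
			continue
		}
		fmt.Printf("%s: soft=%s hard=%s\n", l.name, describe(r.Cur), describe(r.Max))
	}
}
```

If both limits come back as "unlimited", ulimits are probably not the
explanation and the -p 160 concurrency becomes the more likely suspect.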

The thing is, I am able to pass the tests just fine on the porterbox
abel.debian.org, which has significantly fewer cores (4) and less memory
(4GB).

Maybe we need to limit the concurrent builds/tests in debian/rules so that
it never uses more than, let's say, 8 cores?
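
A sketch of what such a cap could look like in debian/rules, using
debhelper's --max-parallel option. This assumes that dh-golang derives its
-p value from debhelper's parallel setting and that the autopkgtest run
honours the override target; I have not verified either on the CI workers,
and the value 8 is arbitrary:

```makefile
# Hypothetical cap: limit dh-golang to at most 8 parallel build/test jobs.
override_dh_auto_test:
	dh_auto_test --max-parallel=8
```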


>
> > I suggest the folling hints:
> >
> > force-badtest golang-github-klauspost-compress/*/armhf
> > force-badtest golang-github-klauspost-compress/*/armel
>
> If it's just to have golang-github-klauspost-compress migrate,
> force-skiptest would be better. But, as you want
> golang-github-klauspost-compress itself to migrate, I rather have it
> that you add an Architecture field to the test declaration and skip
> armel and armhf. Flaky tests are considered RC.
>

As far as I understand, all golang packages use autodep8 to declare the
tests, which doesn't support adding the Architecture field. To get around
this, I guess I could remove the Testsuite field from debian/control and add
a debian/tests/control that looks similar to what autodep8 generates, but
adds the Architecture: !armhf restriction.
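
Roughly, I imagine the debian/tests/control would look like the sketch
below. Only the Architecture line is the actual change. The other three
fields are my assumption of what autodep8 emits for Go packages and should
be copied from the real output of running autodep8 in the source tree.
!armel could be added to the Architecture line in the same way:

```
Test-Command: /usr/bin/dh_golang_autopkgtest
Depends: @builddeps@
Restrictions: allow-stderr
Architecture: !armhf
```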

Is that best practice or am I overlooking something?

-- 
regards,
    Reinhard
