To slightly expand on the above: we do have tests with a fixed random seed (build.yml) and with a variable random seed (ci-meson.yml).

I was not aware that only meson used a random seed. That's good to know and explains why meson seems to fail more often in CI. As a reviewer/developer it's helpful to know which workflows are the most stable, and which could fail for unrelated reasons. This fact should be written down somewhere, probably in the CI documentation if you end up having time to write it.
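For concreteness, here is a minimal Python sketch of the two seeding policies; this is not Sage's actual harness, and the environment variable name is made up. The key point is that a pinned seed makes runs reproducible, while a fresh seed per run widens coverage but produces failures that look random unless the seed is logged. (If I remember correctly, a failure from a variable-seed run can be replayed locally with `sage -t --random-seed=<seed>`.)

```python
import os
import random
import secrets

def pick_seed() -> int:
    """Pick the doctest random seed.

    A fixed-seed workflow (like build.yml) would pin the seed so every
    run is reproducible; a variable-seed workflow (like ci-meson.yml)
    would draw a fresh one each run, which catches more seed-dependent
    bugs but fails "randomly".  DOCTEST_RANDOM_SEED is a hypothetical
    override for replaying a reported failure locally.
    """
    pinned = os.environ.get("DOCTEST_RANDOM_SEED")
    if pinned is not None:
        return int(pinned)
    return secrets.randbits(64)

seed = pick_seed()
print(f"using random seed {seed}")  # always log the seed so failures can be replayed
random.seed(seed)
```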

A related question would be: shall we temporarily disable tests that are known to randomly fail?
Advantage: less noise due to random failures
Disadvantage: less coverage

I don't think tests need to be disabled; rather, the CI should not report a PR as failing if the same failure occurs on develop. In other words, still run the flaky tests, but only report the workflow as failing for failures that do not also occur on develop. I think we already have something like this for the fixed-seed tests, but not for the random-seed ones.

On the other hand, it would be nice if the CI highlighted a test which passes in a PR but fails on develop.
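To make both ideas concrete, here is a rough sketch of comparing a PR's failing tests against a develop baseline; the file names and JSON format are invented for illustration, not our actual CI output. Only failures new in the PR fail the job, and tests that pass in the PR but fail on develop are highlighted:

```python
import json
import sys

def load_failures(path: str) -> set[str]:
    """Load the set of failing test names from a JSON list, e.g. as
    produced by an earlier workflow step.  The format is hypothetical."""
    with open(path) as f:
        return set(json.load(f))

pr_failures = load_failures("pr_failures.json")
develop_failures = load_failures("develop_failures.json")

new_failures = sorted(pr_failures - develop_failures)    # regressions in the PR
fixed_on_pr = sorted(develop_failures - pr_failures)     # pass in PR, fail on develop
known_failures = sorted(pr_failures & develop_failures)  # broken on develop too

for name in fixed_on_pr:
    print(f"FIXED relative to develop: {name}")
for name in known_failures:
    print(f"known failure (also fails on develop, not blocking): {name}")
for name in new_failures:
    print(f"NEW failure: {name}")

# Fail the workflow only for failures that do not also occur on develop.
sys.exit(1 if new_failures else 0)
```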

>  I wonder if the stranger/unreproducible failures might be caused by some faulty caching on the CI server, but I don't know enough about how the CI server is configured and what is cached between builds to say if that might be the case.

From my experience, these issues are almost never specific to CI (i.e. the same error could in principle be reproduced by running the same commands locally on a developer's machine). The only exceptions are the issues related to "docker pull/push" that you sometimes see. Those come from the design decision to run the CI in a new docker container. Fixing those issues by redesigning the corresponding workflows would be desirable (see below).
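Until such a redesign happens, a cheap stopgap for transient "docker pull" failures would be to retry with backoff. A minimal sketch, where the image name, attempt count, and delays are made up for illustration:

```python
import subprocess
import time

def pull_with_retry(image: str, attempts: int = 3, delay: float = 30.0) -> None:
    """Retry `docker pull` to paper over transient registry/network errors."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(["docker", "pull", image])
        if result.returncode == 0:
            return
        if attempt < attempts:
            time.sleep(delay * attempt)  # simple linear backoff between attempts
    raise RuntimeError(f"docker pull {image} failed after {attempts} attempts")

if __name__ == "__main__":
    # Hypothetical image name, for illustration only.
    pull_with_retry("ghcr.io/sagemath/sage/sage-dev:latest")
```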

This failure in code unrelated to the PR is due neither to a random test nor to a docker issue, and I cannot reproduce it: https://github.com/sagemath/sage/actions/runs/17134650987/job/48607483558#step:15:8858

I agree that failures like this are very rare, though. We do run a lot of CI jobs every month, so I would not be surprised to learn that the expected number of monthly random hardware glitches (or solar flares, or whatever your favourite explanation is for strange computer phenomena) for our CI setup is non-negligible.

>  It would be nice to have a GitHub label for these kinds of issues so they can be found more easily. I'm not sure who has permissions to add new labels.

Good idea! This needs to be done by one of the GitHub org admins.
Would whoever has the permissions consider adding two new labels to GitHub? One called "CI" (or something similar) for issues/PRs relating to the CI (we have "CI fix", but that is reserved for CI fixes that should be merged before other PRs). And one called "random seed failure" (or something similar) for issues that report, or PRs that fix, tests that consistently fail for specific random seeds.
