Re: Heads Up: cirrus-ci is shutting down June 1st

Andres Freund Wed, 10 Jun 2026 07:55:39 -0700

Hi,

On 2026-06-04 15:03:39 -0400, Andres Freund wrote:
> Pushed it now. Only a tiny change from the last version, Bilal suggested over
> IM to remove the multi-threading argument from robocopy, for performance.
>
> I did futz around with ccache improvements for a bit. I think we're going to
> need them, but they're complicated enough to do them separately.

The ccache improvements have been committed since, in:
2026-06-08 f52c44ce48a ci: Improve ccache handling

Before we can backpatch the CI support I think we need to resolve a few more
things:

- re-enabling crash reporting for Windows, Bilal sent a patch [1]

I think that's pretty much a must-have, otherwise debugging Windows issues is
really hard.

- With a cold or inapplicable (e.g. due to a core header change) ccache, the
compiler warnings task is very slow (35min). I have a patch, which I still
need to send out, that converts everything but the headercheck in
compilerwarnings to meson. That reduces the worst-case build times
considerably (primarily because the cross build can use precompiled
headers).

I'll try to send that out later today.

There are other issues, but I'm not sure we need to resolve them before
backpatching:

- Coverage for the BSDs - this is complicated enough that I'm not sure it's
worth backpatching.

I'm on the fence.

- It's too much work to see what all failed across all the jobs. I've
experimented with generating a markdown summary across the jobs that ran
(basically a table that shows which steps succeeded and how many tests
failed/skipped/timed out, as well as the name of the first failed test).

It does require not entirely trivial changes. But it does make it faster to
grasp what's going on. It is perhaps also interesting for cfbot / the
commitfest app, because it would basically include a summary of which
steps failed and how many tests passed/failed/... as an output of each job
and then of the workflow.

That's a pretty substantial QOL improvement.
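
A minimal sketch of what such a per-job summary table could look like. The
JobResult record and all of its field names are assumptions made for
illustration, not the format used by the actual experiment:

```python
from dataclasses import dataclass


@dataclass
class JobResult:
    # Hypothetical per-job record; field names are assumptions for this sketch.
    name: str
    steps: dict[str, bool]  # step name -> whether the step succeeded
    failed: int = 0
    skipped: int = 0
    timed_out: int = 0
    first_failed_test: str = ""


def summarize(jobs):
    """Render a markdown table with one row per job."""
    lines = [
        "| Job | Steps | Failed | Skipped | Timed out | First failed test |",
        "| --- | --- | --- | --- | --- | --- |",
    ]
    for job in jobs:
        failed_steps = [step for step, ok in job.steps.items() if not ok]
        steps = "ok" if not failed_steps else "failed: " + ", ".join(failed_steps)
        lines.append(
            f"| {job.name} | {steps} | {job.failed} | {job.skipped} "
            f"| {job.timed_out} | {job.first_failed_test or '-'} |"
        )
    return "\n".join(lines)
```

Each job would emit its own row data, and a final step in the workflow would
combine the rows into a single table.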

- Right now all logs get uploaded. That's quite the waste of space for
artifacts. Bilal has sent a patch: [2]

But this isn't a new problem, so perhaps it's ok to leave this for later?

- In comparison to cirrus-ci it's considerably more painful (and it wasn't
exactly pain-free on cirrus either) to access the logs of failed tasks. One
can't just link to the failure or such.

I have wondered about determining which test failed first, and uploading the
most crucial logs for that test separately, so one could at least look and
link to those without unpacking a .zip.

An argument against making that a hard requirement before backpatching is
that one needs to look at failures on master a lot more often than on the
back branches.
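
A rough sketch of the "which test failed first" step. It assumes each test
result is available as a record with "name", "result" and "starttime" keys,
and that "first" means earliest start time; both the record layout and that
definition are assumptions for illustration:

```python
def first_failed(results):
    """Return the failing test record with the earliest start time, or None."""
    bad = {"FAIL", "TIMEOUT", "ERROR"}  # assumed set of failing result values
    failures = [r for r in results if r["result"] in bad]
    return min(failures, key=lambda r: r["starttime"], default=None)
```

The logs belonging to the returned test could then be uploaded as a separate,
small artifact next to the full log archive.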

- A decent chunk of test time is spent setting up the containers (I've
optimized them a bit to reduce that already). Somehow docker is pretty slow
around container extraction. I had already split the containers into one
for docs and one for the rest. If we split further, we could make startup
of e.g. sanitycheck (which has an outsized impact) a decent bit faster, but
we can't use the same container for e.g. linux-meson-32.

I think it may be smart to just add per-task tags for the containers. Then
we can have them initially be the same (by just pushing the same container
with different tags), which would allow us to adjust the containers' contents
later, without needing to patch the workflow in the postgres repo.

I suspect we should do the tag aliases before backpatching, but I'm very
willing to be convinced otherwise.
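
The tag-alias idea could be sketched as follows. The sketch only builds the
docker command lines that would publish one image under a tag per task; the
registry and image names in the usage example are hypothetical placeholders,
and the task names are just the two mentioned above:

```python
def alias_commands(source_image, repository, tasks):
    """Build 'docker tag' / 'docker push' argv lists, one pair per task."""
    commands = []
    for task in tasks:
        target = f"{repository}:{task}"
        # Initially every per-task tag points at the same source image.
        commands.append(["docker", "tag", source_image, target])
        commands.append(["docker", "push", target])
    return commands


# Hypothetical registry and image names, for illustration only.
for argv in alias_commands(
    "example.org/pg-ci/linux:latest",
    "example.org/pg-ci/linux",
    ["sanitycheck", "linux-meson-32"],
):
    print(" ".join(argv))
```

The workflow would then reference the per-task tag, so a later change to a
single task's container only requires pushing a different image under that
tag.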

Any opinions on the above? Any other points that we need to resolve before
backpatching?

Greetings,

Andres Freund

[1] https://postgr.es/m/CAN55FZ1BgsXSTzOpehnMa4NzWL8Aivsxx-di7-VT6bZ3j2Omow%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAN55FZ07AefTV_D2bCZae5jtQOQD1QByNe3FbXvM9Lq166c4og%40mail.gmail.com
