> On Oct 29, 2020, at 9:21 AM, Joan Touzet <[email protected]> wrote:
>
> (Sidebar about the script's details)
Sure.
> I tried to read the shell script, but I'm not in the headspace to fully parse
> it at the moment. If I'm understanding correctly, this will still catch
> CouchDB's CI docker images if they haven't changed in a week, which happens
> often enough, negating the cache.
Correct. We actually tried something similar for a while and discovered
that in a lot of cases, upstream packages would disappear (or worse, have
security problems), thus making it look like the image is still "good" when
it's not. So a weekly rebuild at least guarantees some level of "yup, still
good" without having too much of a negative impact.
> As a project, we're kind of stuck between a rock and a hard place. We want to
> force a docker pull on the base CI image if it's out of date or the image is
> corrupted. Otherwise we want to cache forever, not just for a week. I can
> probably manage the "do we need to re-pull?" bit with some clever CI
> scripting (check for the latest image hash locally, validate the local image,
> pull if either fails) but I don't understand how the script resolves the
> latter.
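For the "do we need to re-pull?" bit, the decision you describe could be
sketched roughly like this. This is only an illustration in Python, not
anything from the script: needs_pull() and its arguments are made-up names,
and I'm assuming the CI job has already collected the local digests (e.g.
the RepoDigests field from `docker image inspect`), the registry digest, and
the result of whatever validation you run on the local image. Keeping the
cache when the registry can't be reached is also just an assumed policy.

```python
from typing import Optional, Sequence


def needs_pull(local_digests: Sequence[str],
               remote_digest: Optional[str],
               local_image_ok: bool) -> bool:
    """Decide whether the cached CI base image should be re-pulled.

    local_digests:  entries shaped like "repo@sha256:..." (hypothetical input,
                    e.g. taken from RepoDigests of the local image)
    remote_digest:  "sha256:..." reported by the registry, or None if unknown
    local_image_ok: result of the project's own validation of the local image
    """
    # A missing or corrupted local image always forces a pull.
    if not local_image_ok:
        return True
    # Assumed policy: if the registry digest is unknown, keep the cache.
    if remote_digest is None:
        return False
    # Compare only the digest part, ignoring the repository prefix.
    known = {d.rsplit("@", 1)[-1] for d in local_digests}
    return remote_digest not in known
```

Anything that returns True would then be followed by a plain `docker pull`;
everything else keeps using the cached image for as long as it likes.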
Most projects that use Yetus for their actual CI testing build the
image used for the CI as part of the CI. It is a multi-stage, multi-file
docker build in which each run uses a 'base' Dockerfile (provided by the
project) that rarely changes and a per-run file that Yetus generates on the
fly, with both images tagged by either git sha or branch (depending upon
context). Because of how docker reference-counts image layers, the images
effectively act as a "rolling cache" and (beyond a potential weekly cache
removal) full builds are rare. That keeps the image build relatively cheap
(typically <1m runtime) unless the base image had a change far up the chain
(so structure wisely). Of course, this also tests the actual image of the CI
build as part of the CI (the "what tests the testers?" philosophy).
Given that Jenkins tries really hard to have job affinity, re-runs were still
cheap after the initial one. [Ofc, now that the cache is getting nuked every
day....]
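To make the two-image layout a bit more concrete, here is a small sketch in
Python. It is not Yetus code: the "-base"/"-run" image names, the file names,
and the helper functions are all hypothetical, and it only shows the general
shape (pick branch or sha as the tag, build the base, then build the per-run
image on top of it). The only docker details it relies on are the documented
tag character rules and the -t/-f options of `docker build`.

```python
import re
from typing import List, Optional, Tuple


def sanitize_tag(ref: str) -> str:
    """Turn a branch name or git sha into a valid docker tag."""
    # Tags may contain letters, digits, '_', '.', '-' and be up to 128 chars,
    # and may not start with '.' or '-'.
    tag = re.sub(r"[^A-Za-z0-9_.-]", "-", ref).lstrip(".-")[:128]
    return tag or "unknown"


def image_names(project: str, branch: Optional[str] = None,
                git_sha: Optional[str] = None) -> Tuple[str, str]:
    """Return (base_image, run_image), tagged by branch if given, else sha."""
    ref = branch or git_sha
    if not ref:
        raise ValueError("need a branch or a git sha to tag the images")
    tag = sanitize_tag(ref)
    return f"{project}-base:{tag}", f"{project}-run:{tag}"


def build_commands(project: str, context: str, base_dockerfile: str,
                   run_dockerfile: str, **ref: str) -> List[List[str]]:
    """Commands for the two builds; the per-run file is assumed to start
    FROM the base image so unchanged base layers come from the cache."""
    base, run = image_names(project, **ref)
    return [
        ["docker", "build", "-t", base, "-f", base_dockerfile, context],
        ["docker", "build", "-t", run, "-f", run_dockerfile, context],
    ]
```

As long as the base Dockerfile doesn't change, the first command is a cache
hit and only the small per-run layers get rebuilt.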
Actually, looking at some of the ci-hadoop jobs, it looks like Yetus is
managing the cache on them. I'm seeing individual run containers that are at
least days old. So that's a good sign.
> Can an exemption list be passed to the script so that images matching a
> certain regex are excluded? You say the script ignores labels entirely, so
> perhaps not...
Patches accepted. ;)
FWIW, I've been testing on my local machine for unrelated reasons and I
keep blowing away running containers I care about, so I might end up adding it
myself. That said: the code was specifically built for CI systems where the
expectation should be that nothing is permanent.
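If anyone does want to take a crack at it, the exemption logic could look
something like the sketch below. To be clear, this is Python for illustration
and not how the existing shell script works; images_to_remove() and the
(name, created) input shape are invented here, and the one-week cutoff is
just the default discussed above.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def images_to_remove(images: Iterable[Tuple[str, datetime]],
                     exempt_patterns: Iterable[str],
                     now: datetime,
                     max_age: timedelta = timedelta(days=7)) -> List[str]:
    """Return names of images old enough to delete and not exempted.

    images:          (name, created) pairs, however the caller lists them
    exempt_patterns: regexes; any match against the name protects the image
    """
    exempt = [re.compile(p) for p in exempt_patterns]
    doomed = []
    for name, created in images:
        # Exempted images are kept no matter how old they are.
        if any(rx.search(name) for rx in exempt):
            continue
        if now - created >= max_age:
            doomed.append(name)
    return doomed
```

For example, passing a pattern like r"^example/ci-" would keep every image
under that (made-up) prefix while everything else still ages out after a week.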