On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
> Enables caching from the qemu-project repository.
>
> Uses a dedicated "$NAME-cache" tag for caching, to address limitations.
> See issue "when using --cache=true, kaniko fail to push cache layer [...]":
> https://github.com/GoogleContainerTools/kaniko/issues/1459
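For context, the quoted approach amounts to a CI job along these lines (my own condensed sketch of the patch further down, not an official recipe; job and variable names are illustrative):

```yaml
# Sketch of a kaniko-based container build job. The dedicated
# "$NAME-cache" repository works around kaniko issue #1459, where
# pushing cache layers to the destination repository fails.
container-build:
  stage: containers
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - /kaniko/executor
      --dockerfile "$CI_PROJECT_DIR/tests/docker/dockerfiles/$NAME.docker"
      --destination "$CI_REGISTRY_IMAGE/qemu/$NAME:$QEMU_CI_CONTAINER_TAG"
      --cache=true
      --cache-repo "$CI_REGISTRY_IMAGE/qemu/$NAME-cache"
```

Note the contrast with the docker-in-docker job this replaces: there is no `services: docker:dind`, no `docker login`, and the cache lives in a separate repository rather than being inlined into the final image.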
After investigating, this is a result of a different design approach for
caching in kaniko.

In docker, any existing image can be leveraged as a cache source, reusing
whatever individual layers are present. IOW, there is no difference between
a cache and a final image; they are one and the same thing.

In kaniko, the cache is a distinct object type. IIUC, it is not populated
with the individual image layers; instead it has a custom format for storing
the cached content. Therefore the concept of storing the cache at the same
location as the final image is completely inappropriate - you can't store
two completely different kinds of content in the same place. That is also
why you can't just "docker pull" the cache image(s) beforehand, and also why
it doesn't look like you can use multiple cache sources with kaniko.

None of this is inherently a bad thing..... except when it comes to data
storage. By using kaniko we would, at minimum, be doubling the amount of
data storage we consume in the GitLab registry.

This is a potentially significant concern because GitLab does technically
have a limited storage quota, even with our free OSS plan subscription. Due
to technical limitations, they've never been able to actually enforce it
thus far, but one day they probably will. At which point we're doomed,
because even with our current Docker-in-Docker setup I believe we're
exceeding our quota. Thus the idea of doubling our container storage usage
is pretty unappealing.

We can avoid that by running without cache, but that has the cost of
increasing job running time, since all containers would be rebuilt on every
pipeline. This will burn through our Azure compute allowance more quickly
(or our GitLab CI credits if we had to switch away from Azure).

> Does not specify a context since no Dockerfile is using COPY or ADD
> instructions.
>
> Does not enable reproducible builds as
> that results in builds failing with an out of memory error.
> See issue "Using --reproducible loads entire image into memory":
> https://github.com/GoogleContainerTools/kaniko/issues/862
>
> Previous attempts, for the records:
> - Alex Bennée:
> https://lore.kernel.org/qemu-devel/20230330101141.30199-12-alex.ben...@linaro.org/
> - Camilla Conte (me):
> https://lore.kernel.org/qemu-devel/20230531150824.32349-6-cco...@redhat.com/
>
> Signed-off-by: Camilla Conte <cco...@redhat.com>
> ---
>  .gitlab-ci.d/container-template.yml | 25 +++++++++++--------------
>  1 file changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/.gitlab-ci.d/container-template.yml b/.gitlab-ci.d/container-template.yml
> index 4eec72f383..066f253dd5 100644
> --- a/.gitlab-ci.d/container-template.yml
> +++ b/.gitlab-ci.d/container-template.yml
> @@ -1,21 +1,18 @@
>  .container_job_template:
>    extends: .base_job_template
> -  image: docker:latest
>    stage: containers
> -  services:
> -  - docker:dind
> +  image:
> +    name: gcr.io/kaniko-project/executor:debug
> +    entrypoint: [""]
> +  variables:
> +    DOCKERFILE: "$CI_PROJECT_DIR/tests/docker/dockerfiles/$NAME.docker"
> +    CACHE_REPO: "$CI_REGISTRY/qemu-project/qemu/qemu/$NAME-cache"
>    before_script:
>    - export TAG="$CI_REGISTRY_IMAGE/qemu/$NAME:$QEMU_CI_CONTAINER_TAG"
> -  # Always ':latest' because we always use upstream as a common cache source
> -  - export COMMON_TAG="$CI_REGISTRY/qemu-project/qemu/qemu/$NAME:latest"
> -  - docker login $CI_REGISTRY -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD"
> -  - until docker info; do sleep 1; done
>    script:
>    - echo "TAG:$TAG"
> -  - echo "COMMON_TAG:$COMMON_TAG"
> -  - docker build --tag "$TAG" --cache-from "$TAG" --cache-from "$COMMON_TAG"
> -    --build-arg BUILDKIT_INLINE_CACHE=1
> -    -f "tests/docker/dockerfiles/$NAME.docker" "."
> -  - docker push "$TAG"
> -  after_script:
> -  - docker logout
> +  - /kaniko/executor
> +    --dockerfile "$DOCKERFILE"
> +    --destination "$TAG"
> +    --cache=true
> +    --cache-repo="$CACHE_REPO"

I'm surprised there is no need to provide the user/password login
credentials for the registry. None the less I tested this and it succeeded.
I guess gitlab somehow has some magic authorization granted to any CI job,
that avoids the need for a manual login? Wonder why we needed the 'docker
login' step though? Perhaps because D-in-D results in using an externally
running docker daemon which didn't inherit credentials from the job
environment?

Caching of course fails when I'm running jobs in my fork. IOW, if we change
container content in a fork and want to test it, a full build from scratch
will be done every time. This likely isn't the end of the world because
dockerfiles change infrequently, and when they do, paying the price of a
full rebuild is a time-limited problem, lasting only until a pull request is
sent and accepted.

TL;DR: functionally this patch is capable of working. The key downside is
that it doubles our storage usage. I'm not convinced kaniko offers a
compelling enough benefit to justify this penalty.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org          -o-           https://fstop138.berrange.com :|
|: https://entangle-photo.org   -o-    https://www.instagram.com/dberrange :|