On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
> Enables caching from the qemu-project repository.
> 
> Uses a dedicated "$NAME-cache" tag for caching, to address limitations.
> See issue "when using --cache=true, kaniko fail to push cache layer [...]":
> https://github.com/GoogleContainerTools/kaniko/issues/1459

After investigating, this is a result of a different design approach
for caching in kaniko.

In docker, any existing image can be leveraged as a cache source,
reusing the individual layers that are present. IOW, there's no
difference between a cache and a final image; they're one and the
same thing.
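
For illustration, the docker flow amounts to this (a sketch along
the lines of our current CI job, variables as in the template):

  # any previously pushed image can be pulled to seed the layer cache
  docker pull "$COMMON_TAG" || true
  # the build reuses matching layers; its output is itself a normal
  # image, usable as a cache source by the next build
  docker build --tag "$TAG" --cache-from "$TAG" --cache-from "$COMMON_TAG" \
      --build-arg BUILDKIT_INLINE_CACHE=1 \
      -f "tests/docker/dockerfiles/$NAME.docker" .
  docker push "$TAG"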

In kaniko, the cache is a distinct object type. IIUC, it is not
populated with the individual layers; instead it has a custom
format for storing the cached content. Therefore the concept of
storing the cache at the same location as the final image is
completely inappropriate - you can't store two completely different
kinds of content in the same place.
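
Listing the tags on the two repos makes the difference visible (a
sketch assuming skopeo; IIUC each cache entry is pushed as a separate
tag keyed on a hash of the Dockerfile command):

  # destination repo: ordinary tags, each a runnable image
  skopeo list-tags docker://registry.gitlab.com/qemu-project/qemu/qemu/$NAME

  # cache repo: hash-named tags, meaningful only to kaniko itself
  skopeo list-tags docker://registry.gitlab.com/qemu-project/qemu/qemu/$NAME-cache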

That is also why you can't just 'docker pull' the cache image(s)
beforehand, and also why it doesn't look like you can use multiple
cache sources with kaniko.

None of this is inherently a bad thing... except when it comes
to data storage. By using Kaniko we would, at minimum, be doubling
the amount of data storage we consume in the gitlab registry.

This is a potentially significant concern because GitLab does
technically have a limited storage quota, even with our free
OSS plan subscription.

Due to technical limitations, they've never been able to
actually enforce it thus far, but one day they probably will.
At which point we're doomed, because even with our current
Docker-in-Docker setup I believe we're exceeding our quota.
Thus the idea of doubling our container storage usage is pretty
unappealing.
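
(As a back-of-the-envelope check on what one image costs, the
compressed layer sizes can be summed from the manifest - a sketch,
assuming a single-arch schema2 manifest and jq being available:

  skopeo inspect --raw \
      docker://registry.gitlab.com/qemu-project/qemu/qemu/$NAME:latest \
      | jq '[.layers[].size] | add'

and the same again for the corresponding $NAME-cache repo.)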

We can avoid that by running without cache, but that has the
cost of increasing the job running time, since all containers
would be rebuilt on every pipeline. This will burn through
our Azure compute allowance more quickly (or our GitLab CI
credits if we had to switch away from Azure).

> Does not specify a context since no Dockerfile is using COPY or ADD 
> instructions.
> 
> Does not enable reproducible builds as
> that results in builds failing with an out of memory error.
> See issue "Using --reproducible loads entire image into memory":
> https://github.com/GoogleContainerTools/kaniko/issues/862
> 
> Previous attempts, for the records:
>   - Alex Bennée: https://lore.kernel.org/qemu-devel/20230330101141.30199-12-alex.ben...@linaro.org/
>   - Camilla Conte (me): https://lore.kernel.org/qemu-devel/20230531150824.32349-6-cco...@redhat.com/
> 
> Signed-off-by: Camilla Conte <cco...@redhat.com>
> ---
>  .gitlab-ci.d/container-template.yml | 25 +++++++++++--------------
>  1 file changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/.gitlab-ci.d/container-template.yml b/.gitlab-ci.d/container-template.yml
> index 4eec72f383..066f253dd5 100644
> --- a/.gitlab-ci.d/container-template.yml
> +++ b/.gitlab-ci.d/container-template.yml
> @@ -1,21 +1,18 @@
>  .container_job_template:
>    extends: .base_job_template
> -  image: docker:latest
>    stage: containers
> -  services:
> -    - docker:dind
> +  image:
> +    name: gcr.io/kaniko-project/executor:debug
> +    entrypoint: [""]
> +  variables:
> +    DOCKERFILE: "$CI_PROJECT_DIR/tests/docker/dockerfiles/$NAME.docker"
> +    CACHE_REPO: "$CI_REGISTRY/qemu-project/qemu/qemu/$NAME-cache"
>    before_script:
>      - export TAG="$CI_REGISTRY_IMAGE/qemu/$NAME:$QEMU_CI_CONTAINER_TAG"
> -    # Always ':latest' because we always use upstream as a common cache source
> -    - export COMMON_TAG="$CI_REGISTRY/qemu-project/qemu/qemu/$NAME:latest"
> -    - docker login $CI_REGISTRY -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD"
> -    - until docker info; do sleep 1; done
>    script:
>      - echo "TAG:$TAG"
> -    - echo "COMMON_TAG:$COMMON_TAG"
> -    - docker build --tag "$TAG" --cache-from "$TAG" --cache-from "$COMMON_TAG"
> -      --build-arg BUILDKIT_INLINE_CACHE=1
> -      -f "tests/docker/dockerfiles/$NAME.docker" "."
> -    - docker push "$TAG"
> -  after_script:
> -    - docker logout
> +    - /kaniko/executor
> +      --dockerfile "$DOCKERFILE"
> +      --destination "$TAG"
> +      --cache=true
> +      --cache-repo="$CACHE_REPO"

I'm surprised there is no need to provide the user/password
login credentials for the registry. Nonetheless I tested this
and it succeeded.

I guess gitlab somehow has some magic authorization granted to any CI
job, that avoids the need for a manual login? I wonder why we needed
the 'docker login' step though? Perhaps because D-in-D results in
using an externally running docker daemon which didn't inherit
credentials from the job environment?
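
FWIW, the recipe in GitLab's own kaniko docs writes the job's
built-in credentials into a config file before invoking the
executor; if the automatic authorization ever stops working, that
would presumably be the fallback:

  mkdir -p /kaniko/.docker
  echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(printf '%s:%s' \
      "$CI_REGISTRY_USER" "$CI_REGISTRY_PASSWORD" | base64 | tr -d '\n')\"}}}" \
      > /kaniko/.docker/config.json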

Caching of course fails when I'm running jobs in my fork. IOW, if we
change container content in a fork and want to test it, it will be
doing a full build from scratch every time. This likely isn't the end
of the world because dockerfiles change infrequently, and when they
do, paying the price of a full rebuild is a time-limited problem
until a PULL is sent and accepted.
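
If we did want caching in forks, presumably the cache repo would have
to point at the fork's own (writable) registry namespace instead of
the hard-coded upstream path, something like (untested):

  export CACHE_REPO="$CI_REGISTRY_IMAGE/qemu/$NAME-cache"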


TL;DR: functionally this patch is capable of working. The key downside
is that it doubles our storage usage. I'm not convinced Kaniko offers
a compelling enough benefit to justify this penalty.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

