[
https://issues.apache.org/jira/browse/HADOOP-19859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18075764#comment-18075764
]
ASF GitHub Bot commented on HADOOP-19859:
-----------------------------------------
ajfabbri commented on code in PR #8451:
URL: https://github.com/apache/hadoop/pull/8451#discussion_r3132362703
##########
.github/workflows/build_image_cache.yml:
##########
@@ -0,0 +1,53 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+name: Image Cache
+
+# Security: write privileges are safe since this is triggered only by
+# `push` and `workflow_dispatch` (implying user has write access).
+on:
+ # Run jobs when a commit is merged
+ push:
+ branches:
+ - 'trunk'
+ - 'branch-*'
+ paths:
+ - 'dev-support/docker/**'
Review Comment:
Cool. Makes sense to rebuild whenever anything under this path changes. In the
future we could publish an image and use it directly instead of always doing a
cached build. We can iterate on that later, though. 👍
##########
.github/workflows/build_image_cache.yml:
##########
@@ -0,0 +1,51 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+name: Image Cache
+
+on:
+ # Run jobs when a commit is merged
+ push:
+ branches:
+ - 'trunk'
+ - 'branch-*'
+ paths:
+ - 'dev-support/docker/**'
+ workflow_dispatch:
Review Comment:
Agreed. It is a "trusted" action. (We are still careful to follow best
practices below, and our CodeQL scanning will help enforce that going forward.)
##########
.github/workflows/tmpl_build_image_cache.yml:
##########
@@ -0,0 +1,62 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+name: Build Image Cache
+
+on:
+ workflow_call:
+ inputs:
+ os:
+ required: false
+ type: string
+ description: Operating system to create build image cache for.
+ default: ubuntu_24
+
+# Default to minimal permissions for workflow.
+permissions:
+ packages: read
+
+jobs:
+ main:
+ name: build-image-cache-${{ inputs.os }}-${{ github.ref_name }}
+ if: github.repository == 'apache/hadoop'
+ runs-on: ubuntu-24.04
+ permissions:
+ packages: write
+ steps:
+ - name: Checkout Hadoop repository
+ uses: actions/checkout@v6
+ - name: Set up Docker Buildx
+ uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4.0.0
+ - name: Login to DockerHub
+ uses: docker/login-action@b45d80f862d83dbcd57f89517bcf500b2ab88fb2 # v4.0.0
+ with:
+ registry: ghcr.io
+ username: ${{ github.actor }}
+ password: ${{ secrets.GITHUB_TOKEN }}
+ - name: Build image cache for ${{ inputs.os }}-${{ github.ref_name }}
+ id: docker_build
+ uses: docker/build-push-action@d08e5c354a6adb9ed34480a06d141179aa583294 # v7.0.0
+ with:
+ context: ./dev-support/docker/
+ file: ./dev-support/docker/Dockerfile_${{ inputs.os }}
+ push: true
+ tags: ghcr.io/apache/hadoop/gha-build-${{ inputs.os }}-image-cache:${{ github.ref_name }}-static
+ cache-from: type=registry,ref=ghcr.io/apache/hadoop/gha-build-${{ inputs.os }}-image-cache:${{ github.ref_name }}
+ cache-to: type=registry,ref=ghcr.io/apache/hadoop/gha-build-${{ inputs.os }}-image-cache:${{ github.ref_name }},mode=max
Review Comment:
Just getting familiar with this and reading docs. Is this based on the Spark
CI workflows?
`type=registry` ([docs](https://docs.docker.com/build/cache/backends/)):
> registry: embeds the build cache into a separate image, and pushes to a
> dedicated location separate from the main output.
`cache-to:` exports the cache to a particular backend (here, a registry) after
a build. `cache-from:` specifies where to import the cache from at the start of
a build. IIUC the local BuildKit cache is always enabled but does not persist
between runs, so it only helps with multiple builds within the same workflow.
The locations passed in (`ref=`) act as the key for the cache lookup, and we
separate these by OS and branch name.
`mode=max` exports all intermediate layers of the image build, whereas
`mode=min` only exports the layers that end up in the final image. This looks
good to me. 👍
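To make the cache keys concrete, here is a hand-expanded sketch of the two
cache settings. It assumes the default `inputs.os` of `ubuntu_24` and a push to
`trunk`; both values are picked for illustration, and the `with:` wrapper only
mirrors the step above rather than being a complete workflow.

```yaml
# Illustration only: expressions from the build step expanded by hand,
# assuming inputs.os=ubuntu_24 and github.ref_name=trunk.
with:
  # Image pushed by the build (note the "-static" suffix on the tag).
  tags: ghcr.io/apache/hadoop/gha-build-ubuntu_24-image-cache:trunk-static
  # Cache is read from, and written back to, the same ref without the suffix.
  cache-from: type=registry,ref=ghcr.io/apache/hadoop/gha-build-ubuntu_24-image-cache:trunk
  cache-to: type=registry,ref=ghcr.io/apache/hadoop/gha-build-ubuntu_24-image-cache:trunk,mode=max
```

A push to another branch matching `branch-*` (say, a hypothetical
`branch-3.4`) would use the `:branch-3.4` tag instead, so each OS and branch
combination keeps its own cache.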
> Use cache to speed up GHA infra image building
> ----------------------------------------------
>
> Key: HADOOP-19859
> URL: https://issues.apache.org/jira/browse/HADOOP-19859
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Cheng Pan
> Priority: Major
> Labels: pull-request-available
>