pan3793 commented on code in PR #8450:
URL: https://github.com/apache/hadoop/pull/8450#discussion_r3125722408
##########
.github/workflows/tmpl_build_and_test.yml:
##########
@@ -86,6 +90,19 @@ jobs:
name: Build Image ${{ inputs.os }}-${{ inputs.branch }}
runs-on: ubuntu-24.04
needs: [ precondition ]
+ # Security: this does not leak write access for our image repository to
+ # forked repos.
+ #
+ # We have `packages: write` permissions for our GITHUB_TOKEN below.
However:
+ #
+ # - For `pull_request`, GitHub downgrades GITHUB_TOKEN permissions to
+ # read-only.
+ # - For `push` triggers on a fork, the GITHUB_TOKEN retains write
+ # permissions, but the `push` is happening in the context of the fork,
not
+ # the upstream repo.
+ # - For `pull_request_target` (not used here), image repo permissions are
+ # scoped to the repository they run on. This prevents forks from writing
+ # to our Apache Hadoop image repo.
Review Comment:
details explanation of my understanding
suppose Tom opens a PR target to apache/hadoop trunk branch from a forked
repo, which adds a workflow yaml defines `pull_request_target`, then nothing
happens on this PR, because GHA looks at the apache/hadoop trunk branch, and it
does not have such a workflow.
it this PR gets merged, then the next PR targeting the apache/hadoop trunk
branch will trigger this workflow, and it uses the trunk's version and ignores
the PR's change in this workflow yaml. what's dangerous here is - the workflow
might consume the code from the PR change, for example, the workflow use the
Dockerfile to build an image and push to apache/hadoop image repo, the context
only protects the workflow yaml file itself (again, it use the trunk branch
version yaml instead of PR's version), but it still able to read to unstrust
code from PR (because we explicitly checkout the PR code to CI workspace to
verify the PR change). so, why we are safe? because we only use
`pull_request_target` for workflows that do not read files from the source
repo. currently, it's a script that only calls the GitHub API, hackers have no
opportunity to inject their code for execution without getting a PR merged.
our CI requires some workflows to run on PR with write permission, to edit
labels, to create checks, all of which won't consume untrusted code, so we are
safe. without write permission then they stop working.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]