potiuk commented on PR #36339: URL: https://github.com/apache/airflow/pull/36339#issuecomment-1869568986
Cool. Great to know you are researching it and I am happy to share some details you asked for - hoping to get some feedback and audit-like approach for what we do. You can read a.lot about similar attempts and actually discovered security issue in the past in my recent blog post https://medium.com/apache-airflow/unraveling-the-code-navigating-a-ci-release-security-vulnerability-in-apache-airflow-620214a96297 where you will find a number of details and even links to PRs that improved our defenses (and it explains context and assumptions we took when designing our CI workflows). This is a very interesting issue for your research likely as it shows how even things like using docker cache poisoning might influence release management process. Also if you want to see some of the design decisions (mostly about using docker containers to run user-submitted code in forked PRs you can take a look at this ADR - that explains some of the security considerations https://github.com/apache/airflow/blob/main/dev%2Fbreeze%2Fdoc%2Fadr%2F0005-preventing-using-contributed-code-when-building-images.md The workflows that we have in Airflow are nicely described in https://github.com/apache/airflow/blob/main/CI.rst and you will also find diagrams describing the workflows in https://github.com/apache/airflow/blob/main/CI_DIAGRAMS.md showing how they look like. I keep the docs and diagrams up-to-date so you should be able to understand better how it all works. Regarding the protection - we do not apply any special protection for regular pull-request workflow on GitHub Infrastructure / Public Runners and the above issue and details explain all the considerations for 'Pull request target' workflow - which does indeed need more protection due to potential access to secrets and. Write GitHub tokens. But as you indeed figured out correctly - for our self-hosted infrastructure we use a forked Runner where we only allow to run actions on our self-hosted runners when the actor for the runner is one of our maintainers. You can see the forked Runner - which we tried - unsuccessfully - to contribute to the runner - https://github.com/actions/runner/pull/783 - you can look at the original repo where it comes from (@ashb created the fork) and we keep on rebasing it and we have semi-automated framework to release and use a patched runner for our own use in our self-hosted runners. We also follow a well defined ASF policies for those workflows - I am a member of security committee of the Apache Software Foundation and we have this page where we keep some security notes for all projects of the ASF and generally speaking all ASF projects require manual workflow approval for all non-maintainers. This can be relaxed if the ASF project's PMC agrees to follow the rules and watch out described here: https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=173085013#content/view/173085013 - and here is policy describing it: https://infra.apache.org/github-actions-policy.html - this is one of the reasons why we are very watchful about the kind of PRs you made - because being watchful is part of the commitment we did when we applied to the ASF infrastructure to switch the 'approval' to be done only for 'new' users. Generally speaking - if you are researching the subject we would appreciate if you could review all those and let us know (privately) via [email protected] (see our security policy in the airflow repo for details). The report from October that we addressed (and then added extra layers of protection as described in the blog post of mine) had been raised to HackerOne and with the blog post we supported the researcher to apply for HackerOne Bounty for that one and if you find any weaknesses in our approach we would love to hear about it (but hopefully without rIsing the alarm bells as you did). Also if you find anything for any other ASF projects, reporting such issues to [email protected] is most welcome We know how important those kind of issues (basically supply-chain) are and we will be looking at such reports very closely when we get them . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
