potiuk commented on pull request #16904:
URL: https://github.com/apache/airflow/pull/16904#issuecomment-892149704


   The problem is that we can't do it differently until we have better support 
from GitHub for cross-workflow communication. 
   
   All details are described in CI.rst, but here is the gist of it:
   
   * The PR workflows have, by definition, only 'read' access to the GitHub
repo of Airflow. They cannot have 'write' access because that would open up a
security hole (anyone running a PR could modify the Airflow repo).
   
   * On the other hand, we need to build (and share) CI docker images from the
incoming PR so that all jobs can use them. This is needed because we have
different kinds of jobs (tests, building providers, testing images, helm tests,
etc.). Having common CI docker images with all the sources and, more
importantly, the dependencies built in makes it possible to have a common
'execution' environment for all tests. The problem is that with read-only
access in the PR workflow we cannot build such an image just once and share it
with other jobs, because we cannot store the image anywhere. Each build job
runs on a separate machine, and ideally you build the image once, push it
somewhere, and every job uses that image. That could save a LOT of build time,
especially when someone adds a new dependency (which does not happen very
often, but often enough).
   
   * Therefore we build the image in a separate workflow (pull_request_target)
where we have write access, but where the CI workflow and scripts come from the
'main' branch, so there is no risk of someone's PR injecting something into our
repo. Those 'Build image' workflows can build and push the image with write
permissions.
   
   * The 'pull_request_target' workflow is triggered in parallel with the 'ci'
workflow. Currently GitHub Actions does not have a feature to add any
cross-workflow dependency, so we cannot (explicitly) wait in one workflow for
the result of another. As a workaround, the 'wait for image' job from the ci
workflow simply actively polls for the images produced by the 'build image'
workflow. It is a waste: an active job running and checking whether the images
are already pushed. But there is no other way. Actions does not yet have a way
of 'pausing' and 'resuming' workflows. It is planned but not yet there.
   
   * Also, the 'build image' workflow is 'safe', so we run it on our
self-hosted runners, while the 'ci build' is potentially unsafe (GitHub detected
people using Actions to mine Bitcoin(!)), so we need to run it on public
GitHub runners, which get isolation and monitoring from GitHub. There we have
150 slots in the queue for all Apache projects (> 350 of them), which makes
those jobs susceptible to long queues. On the other hand, we also limit our
'build image' job queue, as we have limited funds from Astronomer and AWS to
run those jobs in Amazon infrastructure (GCP also donated credits, but we need
to switch to GCP). That's why those queues run at different capacities and one
is sometimes faster than the other (in either direction).
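
   To make the polling workaround from the 'wait for image' bullet concrete,
here is a minimal sketch in Python. It is an illustration only and not the
script our CI actually runs: the function name `wait_for_image`, the
`image_exists` callback, and the timeout and interval defaults are all
hypothetical. The callback stands in for whatever check the real job performs
against the image registry.

```python
import time
from typing import Callable


def wait_for_image(
    image_exists: Callable[[], bool],  # hypothetical registry check
    timeout_s: float = 3600.0,         # assumed value, not the CI's real limit
    interval_s: float = 30.0,          # assumed value, not the CI's real interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until image_exists() returns True or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if image_exists():
            return True
        if clock() >= deadline:
            return False
        # The job stays alive (and occupies a runner slot) while it sleeps.
        sleep(interval_s)


if __name__ == "__main__":
    # Stand-in check: pretend the image shows up on the third poll.
    attempts = iter([False, False, True])
    found = wait_for_image(lambda: next(attempts), timeout_s=5, interval_s=0)
    print("image available:", found)
```

   The loop holds a runner for the whole wait, which is exactly the waste
described above. It works no matter which of the two workflows starts first,
as long as the image is pushed before the timeout.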
   
   
   In short, 'build' and 'wait' are designed to run in parallel, in any order,
and we currently have no way to enforce the order in which they run.
   
   I hope it makes things clearer :) (or maybe reveals the hidden complexities
we are dealing with)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

