potiuk edited a comment on pull request #10368:
URL: https://github.com/apache/airflow/pull/10368#issuecomment-675177393


   Hello everyone. This is a major overhaul of how we use GitHub Actions - 
something that was enabled by a recently released GitHub Actions feature 
(namely the "workflow_run" feature). 
   
   I worked on it last week and tested it heavily on my fork 
https://github.com/potiuk/airflow/actions. I hope there will be few 
teething problems, and since I am already very familiar with how GitHub 
Actions work, I will be able to fix any problems quickly. I am also going to 
watch it once we merge it to make sure it works as expected. 
   
   See the commit description for what this change achieves. I have just one 
thing to say - this is the "dream" architecture of the CI builds that I had in 
mind at the very beginning of my work on Airflow, one that could only be 
achieved thanks to the most recent changes by GitHub. I really hope this is one 
of the last fundamental changes in the scripting for CI, because I have 
literally run out of ideas for what can be improved (just kidding - there are 
always small things ;).
   
   It has many nice properties; the most important ones are:
   
   *  5-12 minutes saved for each job (images are built only once, not 
for each job). That is not per whole run, but per job (!). This will help both 
to increase the number of parallel PRs that can be run and to decrease the 
feedback time for each build. Builds were sometimes much slower when the Python 
base image was upgraded or the Dockerfile changed - this problem will be gone.
   
   * the jobs/runs are fully consistent - all jobs in the same build use 
exactly the same image prepared only once. 
   
   * full traceability and reproducibility of each run - we keep the images in 
the GitHub registry and you can recreate the exact failed run by running 
`./breeze --github-image-id <RUN_ID>`, or 
`./breeze --github-image-id <COMMIT_ID>` for merged runs.
   
   * I cleaned up outputs of the job so that they only show relevant information
   
   * I cleaned up the initialization code for bash scripts - removed some 
duplicates, organized it better and fully documented it, describing the 
purpose of all options (that was the last script refactoring I planned)
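   
   To make the "per job, not per run" point from the first bullet concrete, 
here is a rough back-of-the-envelope sketch. It takes the quoted 5-12 minutes 
per job at face value; the job count of 20 and the function name are purely 
hypothetical and are not taken from the actual workflows.

```python
def minutes_saved_per_run(num_jobs: int, saved_minutes_per_job: float) -> float:
    """Total job-minutes saved in one CI run when each job skips the image build.

    Assumption: every job in the run saves the same amount of time.
    """
    if num_jobs < 0 or saved_minutes_per_job < 0:
        raise ValueError("inputs must be non-negative")
    return num_jobs * saved_minutes_per_job


# Hypothetical run with 20 jobs; 5 and 12 are the per-job bounds quoted above.
NUM_JOBS = 20
low = minutes_saved_per_run(NUM_JOBS, 5)
high = minutes_saved_per_run(NUM_JOBS, 12)
print(f"{NUM_JOBS} jobs: {low:.0f}-{high:.0f} job-minutes saved per run")
```

   The saving grows linearly with the number of jobs in a run, which is why it 
matters that the figure is per job.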
   
   It's quite a huge change, and I can try to split it into smaller ones (but 
conceptually it is one big overhaul of the way our CI works) 
   
   You can start from the workflows at the end of the documentation: 
https://github.com/PolideaInternal/airflow/blob/prebuild-ci-images-in-github-actions/CI.rst
  - I prepared some sequence diagrams of the CI architecture (using mermaid - 
which is an absolutely cool tool for converting markdownish descriptions of 
diagrams into really nice diagrams). The document explains all the "whys" as 
well as the "hows". 
   
   NOTE! For Review/Merge I needed to disable waiting for images, so the 
speedups are not visible yet - I have to merge it to master in order to enable 
the "Build Image" workflows. I also use the "master" version of my own GitHub 
Cancel Action, which I developed for that purpose - I will release its v2 
version and switch to it once we have had a few days of the builds working in 
Airflow.
   
   I developed a new GitHub Action for "Cancel Workflow Run", 
https://github.com/potiuk/cancel-workflow-runs, that is a Swiss Army knife of 
run cancelling, and I plan to share it with Apache Beam and other Apache 
projects that might need it as well.
   
   I really look forward to review comments and to merging it eventually. This 
will help all contributors and committers to move faster. This is literally 
the completion of 2 years of work on the "dream" architecture for our CI :).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

