How many projects are already using GitHub Actions?
It seems to be fairly new, and I find it concerning that we are already
hitting the limit. If only a few projects are using it currently, then it
may be futile to rely on it because it would inevitably collapse if more
projects were to use it.
Unless there is some project using up most of the allocated minutes,
similarly to what is (was?) happening with Travis.
Alternatively, maybe GitHub Actions should be reserved for quick checks
and not actual CI pipelines.
On 10/27/2020 8:53 PM, Jarek Potiuk wrote:
Hello everyone,
The queues have become unbearable during the last two days. This is not
sustainable long-term. I have lost a bit of hope that any kind of
optimization will help, but we are still trying :)
We are just about to merge and verify the PR that implements this "limited
matrix tests before approval" solution. We implemented it with Tobiasz, who
volunteered to help and once it works we will try to apply it to Apache
Beam as well. When it works we will be happy to share the solution with
everyone.
You can read more on how it works (with screenshot) here:
https://github.com/apache/airflow/pull/11828#issuecomment-717485938
We could not implement an automated workflow run due to limitations of
GitHub Actions (you cannot rerun a successful workflow via the API), but we
came up with something even more flexible:
1) PRs before approval only run one default combination of matrix tests.
This in our case will save 50%-60% of build time for most PRs.
2) Once a PR gets approved, it gets an "okay to test" label and a comment in
the PR: "The PR is ready to run all tests! Please rebase it to latest master
or ask committer to re-run it". It also gets an "in-progress" check in the PR
which turns the green "merge" button into a gray one to avoid accidental
merges. But a committer can still decide to merge at this point (for small,
low-risk changes).
3) Once the PR gets rebased or re-run it runs full-matrix tests and
everything follows as usual
4) We also have special treatment for the case that Allen mentioned
earlier: after approval, the "small" "doc-only" PRs immediately get an
"okay to merge" label, and a "The PR is ready
to be merged. No tests are needed!." comment is added by the bot.
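
The four steps above can be sketched roughly as follows. This is only an
illustration of the decision flow, not the actual bot code: the PullRequest
class, the DOC_SUFFIXES rule for what counts as "doc-only", and the returned
strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PullRequest:
    approved: bool            # a committer has approved the PR
    changed_files: List[str]  # paths touched by the PR


# Assumption: a PR is "doc-only" when every changed file is documentation.
DOC_SUFFIXES = (".md", ".rst")


def is_doc_only(pr: PullRequest) -> bool:
    """Return True when the PR touches documentation files only."""
    return bool(pr.changed_files) and all(
        path.endswith(DOC_SUFFIXES) for path in pr.changed_files
    )


def next_step(pr: PullRequest) -> str:
    """Describe what happens to the PR at its current stage."""
    if not pr.approved:
        # Step 1: before approval, only the default combination runs.
        return "run default matrix combination only"
    if is_doc_only(pr):
        # Step 4: approved doc-only PRs skip the tests entirely.
        return 'label "okay to merge"; no tests needed'
    # Steps 2 and 3: approved PRs wait for a rebase or re-run.
    return 'label "okay to test"; full matrix runs on rebase or re-run'
```

For example, next_step(PullRequest(approved=True, changed_files=["README.md"]))
falls into the doc-only branch, while an unapproved PR always gets the
default combination.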
Again - once we find it working, I am happy to describe how to add it to
your GitHub Actions and share such information with all other projects
using GitHub Actions.
J.
On Fri, Oct 23, 2020 at 5:29 PM Jarek Potiuk <[email protected]>
wrote:
Started working on this mini-solution for limiting non-approved
matrix builds.
I am working on it with a colleague of mine - Tobiasz - who worked on
Apache Beam infrastructure, so we might test it on two projects.
I will let you know the progress
Mini-design doc here:
https://docs.google.com/document/d/16rwyCfyDpKWN-DrLYbhjU0B1D58T1RFYan5ltmw4DQg/edit#
J.
On Thu, Oct 22, 2020 at 10:03 PM Jarek Potiuk <[email protected]>
wrote:
I believe this problem cannot really be handled by one project, but I
have a proposal.
I looked at the common pattern we have in the ASF projects and I think
there is a way that we can help each other.
I think most of the problems come from many PRs submitted that run a
matrix of tests before committers even have time to take a look at them. We
discussed how we can approach it and I think I have a proposal that we can
all adopt in the ASF projects. Something that will be easy to implement and
will not impact the process we have. I would love to hear your thoughts
about it - before I start implementing it :).
My proposal is to create a GitHub Action that allows running only a
subset of the "matrix" tests for PRs that are not yet approved by committers.
This should be possible using the current GitHub Actions workflows and API.
It boils down to:
* If a PR is not approved, only a subset of the matrix (the default value for
each matrix component) is run
* the committers can see the "green" mark of test passing and make a
review
* once the PR gets approved, automatically a new "full matrix" check is
triggered
* all future approved PR pushes run the "full matrix" check
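
To make the difference between the two modes concrete, here is a minimal
sketch. The MATRIX contents are made up for the example, and treating the
first value of each component as its default is an assumption, not something
GitHub Actions does by itself.

```python
from itertools import product
from typing import Dict, List

# Hypothetical matrix; the first value of each component is its default.
MATRIX: Dict[str, List[str]] = {
    "python": ["3.6", "3.7", "3.8"],
    "backend": ["postgres", "mysql", "sqlite"],
}


def select_combinations(
    matrix: Dict[str, List[str]], approved: bool
) -> List[Dict[str, str]]:
    """Return the matrix combinations to run for a PR."""
    if approved:
        # Approved PR: full Cartesian product of all component values.
        return [dict(zip(matrix, values)) for values in product(*matrix.values())]
    # Not yet approved: a single combination made of the defaults.
    return [{name: values[0] for name, values in matrix.items()}]
```

With this made-up matrix, an unapproved PR runs 1 combination instead of 9.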
I think that might significantly reduce the strain on GA jobs we run, and
it should very naturally fit in the typical PR workflow for ASF projects.
But I am only guessing now, so I would love to hear what you think.
I am willing (together with my colleagues) to implement this action and
add it to Apache Airflow to check it. Together with the
"cancel-workflow-action" that I developed and that we deployed at Apache
Airflow and Apache Beam, I think that might help to keep the CI "pressure"
much lower, regardless of whether any of the projects manages to get their
credit sponsors. I think I can have a working Action/implementation done over
the weekend.
More details about the proposal here:
https://lists.apache.org/thread.html/r6f6f1420aa6346c9f81bf9d9fff8816e860e49224eb02e25d856c249%40%3Cdev.airflow.apache.org%3E
J.
On Mon, Oct 19, 2020 at 5:28 PM Jarek Potiuk <[email protected]>
wrote:
Yep. We still continuously optimize it and we are reaching out to get
funding for self-hosted runners. And I think it would be great to see that
happening. I am happy to help anyone who needs some help there - I have
already been helping Apache Beam with their GitHub Actions settings.
On Mon, Oct 19, 2020 at 6:12 AM Greg Stein <[email protected]> wrote:
This is some great news, Jarek.
Given that GitHub build minutes are shared, we need more of this kind of
work from our many communities.
Thanks,
Greg
InfraAdmin, ASF
On Sun, Oct 18, 2020 at 2:32 PM Jarek Potiuk <[email protected]>
wrote:
Hello Allen,
I'd really love to give Yetus a try and see how it can actually make our
approach better.
I just merged the change I planned (finally we got to that), which
implements the final optimisation that you mentioned. In the case of a
single .md file change we got the build time down to about 1 minute, most
of it being GitHub Actions "workflow" overhead.
We got the incremental pre-commit tests down to ~25s.
Build here: https://github.com/potiuk/airflow/pull/128/checks. As you can
see here:
https://github.com/potiuk/airflow/pull/128/checks?check_run_id=1268353637#step:7:98
in this case we run only the relevant static checks:
- "No-tabs checker"
- "Add license for all md files"
- "Add TOC for md files."
- "Check for merge conflicts"
- "Detect Private Key"
- "Fix End of Files"
- "Trim Trailing Whitespace"
- "Check for language that we do not accept as community"
All the other checks, image building, and all the extra checks are skipped
(automatically, as pre-commit determined them irrelevant).
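
The selection idea can be illustrated with a small sketch: each check
declares a file pattern, and a check runs only if at least one changed file
matches it. This is a simplified illustration and not pre-commit's actual
implementation; the check names and patterns below are made up.

```python
import re
from typing import Dict, List

# Hypothetical checks mapped to the regex of files they apply to.
CHECKS: Dict[str, str] = {
    "no-tabs": r".*",
    "license-md": r"\.md$",
    "toc-md": r"\.md$",
    "pylint": r"\.py$",
    "mypy": r"\.py$",
}


def relevant_checks(changed_files: List[str], checks: Dict[str, str]) -> List[str]:
    """Return the names of checks whose pattern matches any changed file."""
    return [
        name
        for name, pattern in checks.items()
        if any(re.search(pattern, path) for path in changed_files)
    ]
```

With these made-up patterns, a change to a single README.md selects only the
generic and .md-specific checks, and the Python-only checks are skipped.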
All this, while we keep really comprehensive tests and optimisation of
image building for all the "serious steps". I tried to explain the
philosophy and some basic assumptions behind our CI in
https://github.com/apache/airflow/blob/master/CI.rst#ci-environment and
I'd love to try to see how this plays together with the Yetus tool.
Would it be possible to work together with the Yetus team on trying to see
how it can help to further optimise and possibly simplify the setup we
have? I'd love to get some cooperation on those. I am nearly done with all
the optimisations I planned, and we have been for years (long before my
tenure) among the top-3 Apache projects when it comes to CI-time use, so
that might be a good one if we can pull together some improvements.
J.
On Wed, Oct 14, 2020 at 4:41 PM Jarek Potiuk <[email protected]> wrote:
Exactly -> dialectic vs. dislectic for example.
On Wed, Oct 14, 2020 at 4:40 PM Jarek Potiuk <[email protected]> wrote:
And really sorry about yatus vs. yetus - I am slightly dialectic and when
things are not in the dictionary, I tend to make many mistakes. I hope it's
not something that people can take as a sign of being "worse", but if you
felt offended by that - apologies.
On Wed, Oct 14, 2020 at 4:34 PM Jarek Potiuk <[email protected]> wrote:
Hey Allen,
I would be super happy if you could help us to do it properly at Airflow
- would you like to work with us and get the yatus configuration that
would work for us? I am super happy to try it. Maybe you could open a PR
with some basic yatus implementation to start with and we could work
together to get it simplified? I would love to learn how to do it.
J
On Wed, Oct 14, 2020 at 3:37 PM Allen Wittenauer <[email protected]> wrote:
On Oct 13, 2020, at 11:04 PM, Jarek Potiuk <[email protected]> wrote:
This is a logic that we have to implement regardless - whether we use
yatus or pre-commit (please correct me if I am wrong).
I'm not sure about yatus, but for yetus, for the most part, yes, one would
likely need to implement custom rules in the personality to exactly
duplicate the overly complicated and over-engineered airflow setup. The big
difference is that one wouldn't be starting from scratch. The difference
engine is already there. The file filter is already there. Full build vs.
PR handling is already there. etc etc etc
For all others, this is not a big issue because in total all other
pre-commits take 2-3 minutes at best. And if we find that we need to
optimize it further we can simply disable the '--all-files' switch for
pre-commit and they will only run on the latest commit-changed files
(pre-commit will only run the tests related to those changed files). But
since they are pretty fast (except pylint/mypy/flake8) we think running
them all, for now, is not a problem.
That's what everyone thinks until they start aggregating the time across
all changes...
--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer
M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>