Re: GitHub Actions Concurrency Limits for Apache projects

Chesnay Schepler Tue, 27 Oct 2020 13:10:23 -0700

How many projects are already using GitHub actions?

It seems to be fairly new, and I find it concerning that we are already hitting the limit. If only a few projects are using it currently, then it may be futile to rely on it because it would inevitably collapse if more projects were to use it. Unless there is some project using up most of the allocated minutes, similarly to what is (was?) happening with Travis.

Alternatively, maybe GitHub actions should be reserved for quick checks and not actual CI pipelines.


On 10/27/2020 8:53 PM, Jarek Potiuk wrote:

Hello everyone,

The queues have become unbearable during the last two days. This is not
sustainable long-term. I have lost a bit of hope that any kind of
optimization will help.

However, we are still trying :)

We are just about to merge and verify the PR that implements this "limited
matrix tests before approval" solution. We implemented it with Tobiasz, who
volunteered to help and once it works we will try to apply it to Apache
Beam as well. When it works we will be happy to share the solution with
everyone.

You can read more on how it works (with screenshot) here:
https://github.com/apache/airflow/pull/11828#issuecomment-717485938

We could not implement an automated workflow re-run due to limitations of
GitHub Actions (you cannot re-run a successful workflow via the API), but we
came up with something even more flexible:

1) PRs before approval only run one default combination of matrix tests.
This in our case will save 50%-60% of build time for most PRs.
2) Once a PR gets approved, it gets the "okay to test" label and a comment in
the PR: "The PR is ready to run all tests! Please rebase it to latest master
or ask committer to re-run it". It also gets an "in-progress" check in the PR
which turns the green "merge" button into a gray one to avoid accidental
merges. But a committer can still decide to merge at this point (for small,
low-risk changes).
3) Once the PR gets rebased or re-run, it runs the full-matrix tests and
everything follows as usual.
4) We also have special treatment for the case that Allen mentioned
earlier: after approval, the "small" "doc-only" PRs immediately get the
"okay to merge" label, and the bot adds the comment "The PR is ready to be
merged. No tests are needed!"
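The four steps above can be read as a small decision function. The following is only a sketch under stated assumptions: the `PullRequest` fields, the `next_action` name and the "tests" values are hypothetical and are not taken from the actual Airflow PR; the label and comment strings are the ones quoted above. It also simplifies by treating every unapproved PR the same way.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    # Hypothetical, simplified view of a PR's state.
    approved: bool
    doc_only: bool
    rebased_or_rerun_after_approval: bool = False


def next_action(pr):
    """Map a PR state to the tests, label and bot comment described in steps 1-4."""
    if not pr.approved:
        # Step 1: only the default combination of the matrix runs.
        return {"tests": "default-matrix", "label": None, "comment": None}
    if pr.doc_only:
        # Step 4: small doc-only PRs can be merged right after approval.
        return {
            "tests": "none",
            "label": "okay to merge",
            "comment": "The PR is ready to be merged. No tests are needed!",
        }
    if pr.rebased_or_rerun_after_approval:
        # Step 3: a rebase or re-run after approval triggers the full matrix.
        return {"tests": "full-matrix", "label": "okay to test", "comment": None}
    # Step 2: approved, waiting for a rebase or a committer re-run.
    return {
        "tests": "pending-full-matrix",
        "label": "okay to test",
        "comment": (
            "The PR is ready to run all tests! Please rebase it to latest "
            "master or ask committer to re-run it"
        ),
    }
```

For example, `next_action(PullRequest(approved=True, doc_only=True))` yields the "okay to merge" label without scheduling any tests.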

Again - once we find it working, I am happy to describe how to add it to
your GitHub actions and share such information with all other projects
using GitHub Actions.

J.


On Fri, Oct 23, 2020 at 5:29 PM Jarek Potiuk <[email protected]>
wrote:

Started working on this mini-solution for limiting non-approved
matrix builds.

I am working on it with a colleague of mine, Tobiasz, who worked on
Apache Beam infrastructure, so we might test it on two projects.

I will let you know the progress.

Mini-design doc here:

https://docs.google.com/document/d/16rwyCfyDpKWN-DrLYbhjU0B1D58T1RFYan5ltmw4DQg/edit#

J.


On Thu, Oct 22, 2020 at 10:03 PM Jarek Potiuk <[email protected]>
wrote:

I believe this problem cannot be really handled by one project, but I
have a proposal.

I looked at the common pattern we have in the ASF projects and I think
there is a way that we can help each other.

I think most of the problems come from many PRs submitted that run a
matrix of tests before committers even have time to take a look at them. We
discussed how we can approach it and I think I have a proposal that we can
all adopt in the ASF projects. Something that will be easy to implement and
will not impact the process we have. I would love to hear your thoughts
about it - before I start implementing it :).

My proposal is to create a GitHub Action that will allow running only a
subset of the "matrix" tests for PRs that are not yet approved by committers.
This should be possible using the current GitHub Actions workflows and API.
It boils down to:
* If PR is not approved, only a subset of the matrix (default value for each
matrix component) is run
* the committers can see the "green" mark of test passing and make a
review
* once the PR gets approved, automatically a new "full matrix" check is
triggered
* all future approved PR pushes run the "full matrix" check
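To make the "default value for each matrix component" idea concrete, here is a minimal sketch. It assumes, purely for illustration, that the first value listed for each component is its default; the matrix contents and the `select_matrix` name are hypothetical and do not come from any project's real workflow.

```python
from itertools import product

# Hypothetical test matrix; component names and values are illustrative only.
FULL_MATRIX = {
    "python": ["3.6", "3.7", "3.8"],
    "backend": ["postgres", "mysql", "sqlite"],
}


def select_matrix(full_matrix, approved):
    """Return the list of job combinations to run for a PR."""
    if approved:
        axes = full_matrix
    else:
        # Assumption: the first value of each component is its default.
        axes = {name: values[:1] for name, values in full_matrix.items()}
    names = list(axes)
    # Cartesian product of the selected values for every component.
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]
```

With this hypothetical matrix an unapproved PR runs a single job, while an approved one runs all nine combinations.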

I think that might significantly reduce the strain on GA jobs we run, and
it should very naturally fit in the typical PR workflow for ASF projects.
But I am only guessing now, so I would love to hear what you think.

I am willing (together with my colleagues) to implement this action and
add it to Apache Airflow to check it. Together with the
"cancel-workflow-action" that I developed and that we deployed at Apache
Airflow and Apache Beam, I think that might help to keep the CI "pressure"
much lower, regardless of whether any of the projects manage to get their
credit sponsors. I think I can have a working Action/implementation done over
the weekend.

More details about the proposal here:
https://lists.apache.org/thread.html/r6f6f1420aa6346c9f81bf9d9fff8816e860e49224eb02e25d856c249%40%3Cdev.airflow.apache.org%3E

On Mon, Oct 19, 2020 at 5:28 PM Jarek Potiuk <[email protected]>
wrote:

Yep. We still continuously optimize it and we are reaching out to get
funding for self-hosted runners. And I think it would be great to see that
happening. I am happy to help anyone who needs some help there - I have
already been helping Apache Beam with their GitHub Actions settings.

On Mon, Oct 19, 2020 at 6:12 AM Greg Stein <[email protected]> wrote:

This is some great news, Jarek.

Given that GitHub build minutes are shared, we need more of this kind of
work from our many communities.

Thanks,
Greg
InfraAdmin, ASF


On Sun, Oct 18, 2020 at 2:32 PM Jarek Potiuk <[email protected]>
wrote:

Hello Allen,

I'd really love to give Yetus a try and see how it can actually make our
approach better.

I just merged the change I planned (we finally got to that), which
implements the final optimisation that you mentioned. In the case of a
single .md file change we got the build time down to about 1 minute, most
of it being GitHub Actions "workflow" overhead.

We got the incremental pre-commit tests down to ~25s.

Build here: https://github.com/potiuk/airflow/pull/128/checks. As you can
see here:
https://github.com/potiuk/airflow/pull/128/checks?check_run_id=1268353637#step:7:98
in this case we run only the relevant static checks:

    - "No-tabs checker"
    - "Add license for all md files"
    - "Add TOC for md files."
    - "Check for merge conflicts"
    - "Detect Private Key"
    - "Fix End of Files"
    - "Trim Trailing Whitespace"
    - "Check for language that we do not accept as community",

All the other checks, image building, and all the extra checks are skipped
(automatically, as pre-commit determined them irrelevant).
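The general idea of running only the checks relevant to the changed files can be sketched as follows. This is not pre-commit's actual implementation: the hook names and file patterns below are hypothetical, and the sketch only illustrates matching each check's file pattern against the list of changed paths.

```python
import re

# Hypothetical checks mapped to the file patterns they care about.
HOOKS = {
    "no-tabs": r".*",
    "license-md": r"\.md$",
    "toc-md": r"\.md$",
    "pylint": r"\.py$",
    "mypy": r"\.py$",
}


def relevant_hooks(changed_files, hooks=HOOKS):
    """Return the checks whose pattern matches at least one changed file."""
    return [
        name
        for name, pattern in hooks.items()
        if any(re.search(pattern, path) for path in changed_files)
    ]
```

In this sketch a change that touches only `README.md` selects the generic and Markdown checks and skips the Python ones.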

All this, while we keep really comprehensive tests and optimisation of
image building for all the "serious steps". I tried to explain the
philosophy and some basic assumptions behind our CI in
https://github.com/apache/airflow/blob/master/CI.rst#ci-environment
- and I'd love to try to see how this plays together with the Yetus tool.

Would it be possible to work together with the Yetus team on trying to see
how it can help to further optimise and possibly simplify the setup we
have? I'd love to get some cooperation on those. I am nearly done with all
the optimisations I planned, and we have been for years (long before my
tenure) among the top-3 Apache projects when it comes to CI-time use, so
that might be a good one if we can pull together some improvements.


J.



On Wed, Oct 14, 2020 at 4:41 PM Jarek Potiuk <[email protected]>
wrote:

Exactly - > dialectic vs. dislectic for example.

On Wed, Oct 14, 2020 at 4:40 PM Jarek Potiuk <[email protected]>
wrote:

And really sorry about yatus vs. yetus - I am slightly dialectic and when
things are not in the dictionary, I tend to make many mistakes. I hope it's
not something that people can take as a sign of being "worse", but if you
felt offended by that - apologies.



On Wed, Oct 14, 2020 at 4:34 PM Jarek Potiuk <[email protected]>
wrote:

Hey Allen,

I would be super happy if you could help us to do it properly at Airflow
- would you like to work with us and get the yatus configuration that
would work for us? I am super happy to try it. Maybe you could open a PR
with some basic yatus implementation to start with and we could work
together to get it simplified? I would love to learn how to do it.

J


On Wed, Oct 14, 2020 at 3:37 PM Allen Wittenauer
<[email protected]> wrote:

On Oct 13, 2020, at 11:04 PM, Jarek Potiuk <[email protected]>
wrote:

This is logic
that we have to implement regardless - whether we use yatus or pre-commit
(please correct me if I am wrong).

         I'm not sure about yatus, but for yetus, for the most part,
yes, one would likely need to implement custom rules in the personality to
exactly duplicate the overly complicated and over engineered airflow
setup.  The big difference is that one wouldn't be starting from scratch.
The difference engine is already there. The file filter is already there.
full build vs. PR handling is already there. etc etc etc

For all others, this is not a big issue because in total all other
pre-commits take 2-3 minutes at best. And if we find that we need to
optimize it further we can simply disable the '--all-files' switch for
pre-commit and they will only run on the latest commit-changed files
(pre-commit will only run the tests related to those changed files). But
since they are pretty fast (except pylint/mypy/flake8) we think running
them all, for now, is not a problem.

         That's what everyone thinks until they start aggregating the
time across all changes...

--

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

