[jira] [Commented] (FLINK-34989) Apache Infra requests to reduce the runner usage for a project

2024-06-05 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852519#comment-17852519
 ] 

Matthias Pohl commented on FLINK-34989:
---

Quote from today's Infra roundtable:
* The job concurrency policies are not enforced for now
* The FT runner policy items are monitored and enforced by Infra

> Apache Infra requests to reduce the runner usage for a project
> --
>
> Key: FLINK-34989
> URL: https://issues.apache.org/jira/browse/FLINK-34989
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System / CI
>Affects Versions: 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: pull-request-available
>
> The GitHub Actions CI utilizes runners that are hosted by Apache Infra right 
> now. These runners are limited. The runner usage can be monitored via the 
> following links:
> * [Flink-specific 
> report|https://infra-reports.apache.org/#ghactions&project=flink&hours=168] 
> (needs ASF committer rights) This project-specific report can only be 
> modified through the HTTP GET parameters of the URL.
> * [Global report|https://infra-reports.apache.org/#ghactions] (needs ASF 
> membership)
> There was a policy change announced recently:
> {quote}
> Policy change on use of GitHub Actions
> Due to misconfigurations in their builds, some projects have been using 
> unsupportable numbers of GitHub Actions. As part of fixing this situation, 
> Infra has added a 'resource use' section to the policy on GitHub Actions. 
> This section of the policy will come into effect on April 20, 2024:
> All workflows MUST have a job concurrency level less than or equal to 20. 
> This means a workflow cannot have more than 20 jobs running at the same time 
> across all matrices.
> All workflows SHOULD have a job concurrency level less than or equal to 15. 
> Just because 20 is the max, doesn't mean you should strive for 20.
> The average number of minutes a project uses per calendar week MUST NOT 
> exceed the equivalent of 25 full-time runners (250,000 minutes, or 4,200 
> hours).
> The average number of minutes a project uses in any consecutive five-day 
> period MUST NOT exceed the equivalent of 30 full-time runners (216,000 
> minutes, or 3,600 hours).
> Projects whose builds consistently cross the maximum use limits will lose 
> their access to GitHub Actions until they fix their build configurations.
> The full policy is at  
> https://infra.apache.org/github-actions-policy.html.
> {quote}
> As of the last week of March 2024, Flink ranked #19 among the projects that 
> used the Apache Infra runner resources the most, which doesn't seem too bad. 
> This ranking covers not only Apache Flink itself but also the Kubernetes 
> operator, connectors and other resources. According to [this 
> source|https://infra.apache.org/github-actions-secrets.html], Apache Infra 
> manages 180 runners right now.
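
The full-time-runner limits quoted in the policy can be reproduced with a bit of arithmetic. The following is only a sketch; the function name is made up for illustration and is not part of any Infra tooling. It assumes a "full-time runner" means one runner busy 24 hours a day for the whole period.

```python
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24


def full_time_runner_budget(runners: int, days: int) -> tuple[int, int]:
    """Return (hours, minutes) used by `runners` running nonstop for `days`.

    Hypothetical helper: assumes one full-time runner is busy 24h per day.
    """
    hours = runners * days * HOURS_PER_DAY
    return hours, hours * MINUTES_PER_HOUR


weekly = full_time_runner_budget(25, 7)    # calendar-week limit
five_day = full_time_runner_budget(30, 5)  # consecutive five-day limit

print(weekly)    # (4200, 252000)
print(five_day)  # (3600, 216000)
```

Under this assumption the weekly budget comes out at 252,000 minutes, so the 250,000 in the policy text is presumably a rounded figure; the hour values match exactly.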



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34989) Apache Infra requests to reduce the runner usage for a project

2024-04-02 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833153#comment-17833153
 ] 

Matthias Pohl commented on FLINK-34989:
---

Here's a summary of the requirements and whether we meet them, based on the 
most recent report:

|| Requirement || Flink CI ||
| Job concurrency level of 20 (or better, 15) or below | (n) |
| Do not exceed 25 full-time runners (FT runners), i.e. 4,200 hours per 7 days | (y) |
| Average usage should not exceed 3,600 hours per 5 days | (y) |
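
The three requirements in the table could be checked mechanically along these lines. This is a rough sketch: the class, the field names and the sample numbers are all hypothetical and are not taken from the Infra report.

```python
from dataclasses import dataclass


@dataclass
class UsageSnapshot:
    """Hypothetical container for the figures one would read off a report."""
    max_concurrent_jobs: int   # peak number of jobs of one workflow at once
    hours_last_7_days: float   # runner hours used in the last week
    hours_last_5_days: float   # runner hours used in the last five days


def check_policy(usage: UsageSnapshot) -> dict[str, bool]:
    """Return True per requirement if the snapshot stays within the limit."""
    return {
        "concurrency_must": usage.max_concurrent_jobs <= 20,
        "concurrency_should": usage.max_concurrent_jobs <= 15,
        "weekly_budget": usage.hours_last_7_days <= 4200,
        "five_day_budget": usage.hours_last_5_days <= 3600,
    }


# Illustrative values only, chosen to mirror the (n)/(y)/(y) pattern above.
snapshot = UsageSnapshot(
    max_concurrent_jobs=55,
    hours_last_7_days=1000.0,
    hours_last_5_days=800.0,
)
result = check_policy(snapshot)
print(result)
```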




[jira] [Commented] (FLINK-34989) Apache Infra requests to reduce the runner usage for a project

2024-04-02 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833154#comment-17833154
 ] 

Matthias Pohl commented on FLINK-34989:
---

This Jira issue is about adding job concurrency support. Ideally, we should 
make it configurable in an easy way and set it to a concurrency level of at 
most 20, as requested by Apache Infra. This affects the nightly builds, which 
run per branch with 5 different test profiles, each occupying 11 runners in 
parallel (10 stages + a short-running license check).
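
To put numbers on this, here is a small sketch of the arithmetic. It assumes that all 11 jobs of a test profile really run at the same time and that the 5 profiles count towards the same workflow; the constant and function names are made up for illustration.

```python
TEST_PROFILES = 5       # test profiles per nightly branch build
JOBS_PER_PROFILE = 11   # 10 stages + 1 short-running license check
MUST_LIMIT = 20         # hard limit from the Infra policy
SHOULD_LIMIT = 15       # recommended limit from the Infra policy

# Peak number of jobs if every profile starts all of its jobs at once.
peak_jobs = TEST_PROFILES * JOBS_PER_PROFILE


def profiles_in_parallel(limit: int, jobs_per_profile: int = JOBS_PER_PROFILE) -> int:
    """How many whole profiles fit under `limit` concurrent jobs."""
    return limit // jobs_per_profile


print(peak_jobs)                           # 55
print(profiles_in_parallel(MUST_LIMIT))    # 1
print(profiles_in_parallel(SHOULD_LIMIT))  # 1
```

Under these assumptions a nightly run peaks at 55 jobs, and only one full profile at a time fits under either limit, unless the concurrency within a profile is reduced as well.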

Generally, we should make CI more selective anyway. Apache Infra constantly 
criticizes projects for running heavy-load CI for things like simple doc changes.
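
One way to think about "more selective" is a filter on the changed files of a commit. The following is a minimal sketch, not how Flink CI works today: the `docs` directory and the `.md` suffix are assumptions about what counts as a doc change, and both function names are hypothetical.

```python
from pathlib import PurePosixPath

# Assumptions for illustration: what is treated as documentation.
DOC_DIRS = ("docs",)
DOC_SUFFIXES = (".md",)


def is_doc_file(path: str) -> bool:
    """True if the path is under a doc directory or has a doc suffix."""
    p = PurePosixPath(path)
    in_doc_dir = bool(p.parts) and p.parts[0] in DOC_DIRS
    return in_doc_dir or p.suffix in DOC_SUFFIXES


def needs_full_ci(changed_files: list[str]) -> bool:
    """Run the heavy CI unless every changed file is documentation.

    An empty change list is treated conservatively and triggers full CI.
    """
    if not changed_files:
        return True
    return any(not is_doc_file(f) for f in changed_files)


print(needs_full_ci(["docs/content/ops.md", "README.md"]))        # False
print(needs_full_ci(["docs/index.md", "flink-core/src/Foo.java"]))  # True
```

In an actual workflow the same effect would more likely be achieved through the CI system's own path filters; the sketch only shows the decision logic.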



[jira] [Commented] (FLINK-34989) Apache Infra requests to reduce the runner usage for a project

2024-04-02 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833155#comment-17833155
 ] 

Matthias Pohl commented on FLINK-34989:
---

For this issue, we should keep in mind that it only affects the 
non-ephemeral runners. FLINK-34331 works on enabling ephemeral runners for 
Apache Flink. Ephemeral runners would allow us to donate project-specific 
runners, i.e. someone could donate hardware to allow Flink to have its own 
runners and not to worry too much about blocking other projects with CI.



[jira] [Commented] (FLINK-34989) Apache Infra requests to reduce the runner usage for a project

2024-04-04 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833910#comment-17833910
 ] 

Matthias Pohl commented on FLINK-34989:
---

[~martijnvisser] pointed out that we might need to fix this in the connector 
repos as well.
