Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-12-07 Thread Matthias Pohl
Thanks for the feedback, everyone.

I realized that there was a bit of a misunderstanding around my suggestion
for the migration plan. My intention was what Yangze Guo and Xintong had in
mind:
Azure CI should be handled as the ground truth. GitHub Actions CI should be
only used in the PRs to allow identifying mismatches between the two CI
infrastructures. The PR should NOT become unmergeable if GHA fails.
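
A minimal sketch of this rule, assuming a hypothetical per-PR record that holds the outcome of both CI systems (the names `CiResults`, `is_mergeable` and `has_mismatch` are illustrative only and not part of Flink's tooling or the CI bot):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CiResults:
    """Outcome of both CI systems for one PR (hypothetical structure)."""
    azure_passed: bool
    gha_passed: Optional[bool]  # None if GHA has not reported a result


def is_mergeable(results: CiResults) -> bool:
    # Azure CI is the ground truth during the trial; GHA never blocks a merge.
    return results.azure_passed


def has_mismatch(results: CiResults) -> bool:
    # A mismatch is only meaningful once GHA has reported a result.
    return (
        results.gha_passed is not None
        and results.gha_passed != results.azure_passed
    )
```

With this shape, a failing GHA run on a PR that is green in Azure CI is still mergeable, but it is flagged as a mismatch that can be reported and investigated.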

It might be worth documenting this in .github/PULL_REQUEST_TEMPLATE.md and
the CI bot's output (here I would need help from a contributor who has
access to the Ververica machines).

I updated the "Migration Plan" section in FLIP-396 [1] to illustrate that
there are three separate tracks which will be executed in parallel. I feel
like this is more or less what everyone else had in mind (feel free to
correct me).

> We could limit the (first) trial run to branches.

We could do that if we feel like it's confusing contributors too much. But
based on the runs I conducted in my fork XComp:flink [2] for the extended
(nightly) CI [3] and the basic (on push) CI [4], I'm quite confident that
we could get a decently stable CI for GHA right from the start.

I'm gonna leave this discussion open for a few days and will start a vote
probably next week if no other concerns come up.

Matthias


[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+during+Flink+1.19+Cycle+to+test+migrating+to+GitHub+Actions
[2] https://github.com/XComp/flink/actions/
[3] https://github.com/XComp/flink/actions/workflows/flink-ci-extended.yml
[4] https://github.com/XComp/flink/actions/workflows/flink-ci-basic.yml

On Mon, Dec 4, 2023 at 3:00 PM Chesnay Schepler  wrote:

> We could limit the (first) trial run to branches.
>
> PRs wouldn't be affected (avoiding a bunch of concerns about maybe
> blocking PRs and misleading people into thinking that CI is green), we'd
> have a better handle on how much capacity we are consuming, but
> contributors would still get the new setup (which for some is better
> than none).
> We'd also side-step any potential security issue for the time being.
>
> On 01/12/2023 05:10, Yangze Guo wrote:
> > Thanks for the efforts, @Matthias. +1 to start a trial on Github
> > Actions and migrate the CI if we can prove its computation capacity
> > and stability.
> >
> > I share the same concern with Xintong that we do not explicitly claim
> > the effect of this trial on the contribution procedure. I think you
> > can elaborate more on this in the migration plan section. Here is my
> > thought about it:
> > I prefer to enable the CI workflow based on GitHub Actions for each PR
> > because it helps us understand its stability and performance under
> > certain pressures. However, I am not inclined to make "passing the CI
> > via GitHub Actions" a necessity in the code contribution process, we
> > can encourage contributors to report unstable cases under a specific
> > ticket umbrella when they encounter them.
> >
> > Best,
> > Yangze Guo
> >
> > On Thu, Nov 30, 2023 at 12:10 AM Matthias Pohl
> >  wrote:
> >> With regards to Alex' concerns on hardware disparity: I did a bit more
> >> digging on that one. I added my findings in a hardware section to
> >> FLIP-396 [1]. It appears that the hardware is more or less the same
> >> between the different hosts. Apache INFRA's runners have more disk space
> >> (1TB in comparison to 14GB), though.
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+during+Flink+1.19+Cycle+to+test+migrating+to+GitHub+Actions#FLIP396:TrialduringFlink1.19CycletotestmigratingtoGitHubActions-HardwareSpecifications
> >>
> >> On Wed, Nov 29, 2023 at 4:01 PM Matthias Pohl 
> >> wrote:
> >>
> >>> Thanks for your feedback Alex. I responded to your comments below:
> >>>
> >>>> This is mentioned in the "Limitations of GitHub Actions in the past"
> >>>> section of the FLIP. Does this also apply to the Apache INFRA setup or
> >>>> can we expect contributors' runs executed there too?
> >>>
> >>> Workflow runs on Flink forks (independent of PRs that would merge to
> >>> Apache Flink's core repo) will be executed with runners provided by
> >>> GitHub with their own limitations. Secrets are not set in these runs
> >>> (similar to what we have right now with PR runs).
> >>>
> >>> If we allow the PR CI to run on Apache INFRA-hosted ephemeral runners
> >>> we might have the same freedom because of their ephemeral nature (the
> >>> VMs are discarded leaving).
> >>>
> >>> We only have to start thinking about self-hosted customized runners if
> >>> we decide/need to have dedicated VMs for Flink's CI (similar to what we
> >>> have right now with Azure CI and Alibaba's VMs). This might happen if
> >>> the waiting times for acquiring a runner are too long. In that case, we
> >>> might give a certain group of people (e.g. committers) or certain types
> >>> of events (for PRs, nightly builds, PR merges) the ability to use the
> >>> self-hosted runners.
> >>>

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-12-04 Thread Chesnay Schepler

We could limit the (first) trial run to branches.

PRs wouldn't be affected (avoiding a bunch of concerns about maybe 
blocking PRs and misleading people into thinking that CI is green), we'd 
have a better handle on how much capacity we are consuming, but 
contributors would still get the new setup (which for some is better 
than none).

We'd also side-step any potential security issue for the time being.

On 01/12/2023 05:10, Yangze Guo wrote:

Thanks for the efforts, @Matthias. +1 to start a trial on Github
Actions and migrate the CI if we can prove its computation capacity
and stability.

I share the same concern with Xintong that we do not explicitly claim
the effect of this trial on the contribution procedure. I think you
can elaborate more on this in the migration plan section. Here is my
thought about it:
I prefer to enable the CI workflow based on GitHub Actions for each PR
because it helps us understand its stability and performance under
certain pressures. However, I am not inclined to make "passing the CI
via GitHub Actions" a necessity in the code contribution process, we
can encourage contributors to report unstable cases under a specific
ticket umbrella when they encounter them.

Best,
Yangze Guo

On Thu, Nov 30, 2023 at 12:10 AM Matthias Pohl
 wrote:

With regards to Alex' concerns on hardware disparity: I did a bit more
digging on that one. I added my findings in a hardware section to FLIP-396
[1]. It appears that the hardware is more or less the same between the
different hosts. Apache INFRA's runners have more disk space (1TB in
comparison to 14GB), though.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+during+Flink+1.19+Cycle+to+test+migrating+to+GitHub+Actions#FLIP396:TrialduringFlink1.19CycletotestmigratingtoGitHubActions-HardwareSpecifications

On Wed, Nov 29, 2023 at 4:01 PM Matthias Pohl 
wrote:


Thanks for your feedback Alex. I responded to your comments below:

This is mentioned in the "Limitations of GitHub Actions in the past"

section of the FLIP. Does this also apply to the Apache INFRA setup or
can we expect contributors' runs executed there too?


Workflow runs on Flink forks (independent of PRs that would merge to
Apache Flink's core repo) will be executed with runners provided by GitHub
with their own limitations. Secrets are not set in these runs (similar to
what we have right now with PR runs).

If we allow the PR CI to run on Apache INFRA-hosted ephemeral runners we
might have the same freedom because of their ephemeral nature (the VMs are
discarded leaving).

We only have to start thinking about self-hosted customized runners if we
decide/need to have dedicated VMs for Flink's CI (similar to what we have
right now with Azure CI and Alibaba's VMs). This might happen if the
waiting times for acquiring a runner are too long. In that case, we might
give a certain group of people (e.g. committers) or certain types of events
(for PRs,  nightly builds, PR merges) the ability to use the self-hosted
runners.

As you mentioned in the FLIP, there are some timeout-related test

discrepancies between different setups. Similar discrepancies could
manifest themselves between the Github runners and the Apache INFRA
runners. It would be great if we should have a uniform setup, where if
tests pass in the individual CI, they also pass in the main runner and vice
versa.


I agree. So far, what we've seen is that the timeout instability is coming
from too optimistic timeout configurations in some tests (they eventually
also fail in Azure CI; but the GitHub-provided runners seem to be more
sensitive in this regard). Fixing the tests if such a flakiness is observed
should bring us to a stage where the test behavior is matching between
different runners.

We had a similar issue in the Azure CI setup: Certain tests were more
stable on the Alibaba machines than on Azure VMs. That is why we introduced
a dedicated stage for Azure CI VMs as part of the nightly runs (see
FLINK-18370 [1]). We could do the same for GitHub Actions if necessary.

Currently we have such memory limits-related issues in individual vs main

Azure CI pipelines.


I'm not sure I understand what you mean by memory limit-related issues.
The GitHub-provided runners do not seem to run into memory-related issues.
We have to see whether this also applies to Apache INFRA-provided runners.
My hope is that they have even better hardware than what GitHub offers. But
GitHub-provided runners seem to be a good fallback to rely on (see the
workflows I shared in my previous response to Xintong's message).

[1] https://issues.apache.org/jira/browse/FLINK-18370

On Wed, Nov 29, 2023 at 3:17 PM Matthias Pohl 
wrote:


Thanks for your comments, Xintong. See my answers below.



I think it would be helpful if we can at the end migrate the CI to an
ASF-managed Github Action, as long as it provides us a similar
computation capacity and stability.


The current test runs in my Flink fork (using the GitHub-provided

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-30 Thread Yangze Guo
Thanks for the efforts, @Matthias. +1 to start a trial on Github
Actions and migrate the CI if we can prove its computation capacity
and stability.

I share the same concern with Xintong that we do not explicitly claim
the effect of this trial on the contribution procedure. I think you
can elaborate more on this in the migration plan section. Here is my
thought about it:
I prefer to enable the CI workflow based on GitHub Actions for each PR
because it helps us understand its stability and performance under
certain pressures. However, I am not inclined to make "passing the CI
via GitHub Actions" a necessity in the code contribution process, we
can encourage contributors to report unstable cases under a specific
ticket umbrella when they encounter them.

Best,
Yangze Guo

On Thu, Nov 30, 2023 at 12:10 AM Matthias Pohl
 wrote:
>
> With regards to Alex' concerns on hardware disparity: I did a bit more
> digging on that one. I added my findings in a hardware section to FLIP-396
> [1]. It appears that the hardware is more or less the same between the
> different hosts. Apache INFRA's runners have more disk space (1TB in
> comparison to 14GB), though.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+during+Flink+1.19+Cycle+to+test+migrating+to+GitHub+Actions#FLIP396:TrialduringFlink1.19CycletotestmigratingtoGitHubActions-HardwareSpecifications
>
> On Wed, Nov 29, 2023 at 4:01 PM Matthias Pohl 
> wrote:
>
> > Thanks for your feedback Alex. I responded to your comments below:
> >
> > This is mentioned in the "Limitations of GitHub Actions in the past"
> >> section of the FLIP. Does this also apply to the Apache INFRA setup or
> >> can we expect contributors' runs executed there too?
> >
> >
> > Workflow runs on Flink forks (independent of PRs that would merge to
> > Apache Flink's core repo) will be executed with runners provided by GitHub
> > with their own limitations. Secrets are not set in these runs (similar to
> > what we have right now with PR runs).
> >
> > If we allow the PR CI to run on Apache INFRA-hosted ephemeral runners we
> > might have the same freedom because of their ephemeral nature (the VMs are
> > discarded leaving).
> >
> > We only have to start thinking about self-hosted customized runners if we
> > decide/need to have dedicated VMs for Flink's CI (similar to what we have
> > right now with Azure CI and Alibaba's VMs). This might happen if the
> > waiting times for acquiring a runner are too long. In that case, we might
> > give a certain group of people (e.g. committers) or certain types of events
> > (for PRs,  nightly builds, PR merges) the ability to use the self-hosted
> > runners.
> >
> > As you mentioned in the FLIP, there are some timeout-related test
> >> discrepancies between different setups. Similar discrepancies could
> >> manifest themselves between the Github runners and the Apache INFRA
> >> runners. It would be great if we should have a uniform setup, where if
> >> tests pass in the individual CI, they also pass in the main runner and vice
> >> versa.
> >
> >
> > I agree. So far, what we've seen is that the timeout instability is coming
> > from too optimistic timeout configurations in some tests (they eventually
> > also fail in Azure CI; but the GitHub-provided runners seem to be more
> > sensitive in this regard). Fixing the tests if such a flakiness is observed
> > should bring us to a stage where the test behavior is matching between
> > different runners.
> >
> > We had a similar issue in the Azure CI setup: Certain tests were more
> > stable on the Alibaba machines than on Azure VMs. That is why we introduced
> > a dedicated stage for Azure CI VMs as part of the nightly runs (see
> > FLINK-18370 [1]). We could do the same for GitHub Actions if necessary.
> >
> > Currently we have such memory limits-related issues in individual vs main
> >> Azure CI pipelines.
> >
> >
> > I'm not sure I understand what you mean by memory limit-related issues.
> > The GitHub-provided runners do not seem to run into memory-related issues.
> > We have to see whether this also applies to Apache INFRA-provided runners.
> > My hope is that they have even better hardware than what GitHub offers. But
> > GitHub-provided runners seem to be a good fallback to rely on (see the
> > workflows I shared in my previous response to Xintong's message).
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-18370
> >
> > On Wed, Nov 29, 2023 at 3:17 PM Matthias Pohl 
> > wrote:
> >
> >> Thanks for your comments, Xintong. See my answers below.
> >>
> >>
> >>> I think it would be helpful if we can at the end migrate the CI to an
> >>> ASF-managed Github Action, as long as it provides us a similar
> >>> computation capacity and stability.
> >>
> >>
> >> The current test runs in my Flink fork (using the GitHub-provided
> >> runners) suggest that even with using generic GitHub runners we get decent
> >> performance and stability. In this way I'm confident that we wouldn't lose
> 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-29 Thread Matthias Pohl
With regards to Alex' concerns on hardware disparity: I did a bit more
digging on that one. I added my findings in a hardware section to FLIP-396
[1]. It appears that the hardware is more or less the same between the
different hosts. Apache INFRA's runners have more disk space (1TB in
comparison to 14GB), though.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+during+Flink+1.19+Cycle+to+test+migrating+to+GitHub+Actions#FLIP396:TrialduringFlink1.19CycletotestmigratingtoGitHubActions-HardwareSpecifications

On Wed, Nov 29, 2023 at 4:01 PM Matthias Pohl 
wrote:

> Thanks for your feedback Alex. I responded to your comments below:
>
> This is mentioned in the "Limitations of GitHub Actions in the past"
>> section of the FLIP. Does this also apply to the Apache INFRA setup or
>> can we expect contributors' runs executed there too?
>
>
> Workflow runs on Flink forks (independent of PRs that would merge to
> Apache Flink's core repo) will be executed with runners provided by GitHub
> with their own limitations. Secrets are not set in these runs (similar to
> what we have right now with PR runs).
>
> If we allow the PR CI to run on Apache INFRA-hosted ephemeral runners we
> might have the same freedom because of their ephemeral nature (the VMs are
> discarded leaving).
>
> We only have to start thinking about self-hosted customized runners if we
> decide/need to have dedicated VMs for Flink's CI (similar to what we have
> right now with Azure CI and Alibaba's VMs). This might happen if the
> waiting times for acquiring a runner are too long. In that case, we might
> give a certain group of people (e.g. committers) or certain types of events
> (for PRs,  nightly builds, PR merges) the ability to use the self-hosted
> runners.
>
> As you mentioned in the FLIP, there are some timeout-related test
>> discrepancies between different setups. Similar discrepancies could
>> manifest themselves between the Github runners and the Apache INFRA
>> runners. It would be great if we should have a uniform setup, where if
>> tests pass in the individual CI, they also pass in the main runner and vice
>> versa.
>
>
> I agree. So far, what we've seen is that the timeout instability is coming
> from too optimistic timeout configurations in some tests (they eventually
> also fail in Azure CI; but the GitHub-provided runners seem to be more
> sensitive in this regard). Fixing the tests if such a flakiness is observed
> should bring us to a stage where the test behavior is matching between
> different runners.
>
> We had a similar issue in the Azure CI setup: Certain tests were more
> stable on the Alibaba machines than on Azure VMs. That is why we introduced
> a dedicated stage for Azure CI VMs as part of the nightly runs (see
> FLINK-18370 [1]). We could do the same for GitHub Actions if necessary.
>
> Currently we have such memory limits-related issues in individual vs main
>> Azure CI pipelines.
>
>
> I'm not sure I understand what you mean by memory limit-related issues.
> The GitHub-provided runners do not seem to run into memory-related issues.
> We have to see whether this also applies to Apache INFRA-provided runners.
> My hope is that they have even better hardware than what GitHub offers. But
> GitHub-provided runners seem to be a good fallback to rely on (see the
> workflows I shared in my previous response to Xintong's message).
>
> [1] https://issues.apache.org/jira/browse/FLINK-18370
>
> On Wed, Nov 29, 2023 at 3:17 PM Matthias Pohl 
> wrote:
>
>> Thanks for your comments, Xintong. See my answers below.
>>
>>
>>> I think it would be helpful if we can at the end migrate the CI to an
>>> ASF-managed Github Action, as long as it provides us a similar
>>> computation capacity and stability.
>>
>>
>> The current test runs in my Flink fork (using the GitHub-provided
>> runners) suggest that even with using generic GitHub runners we get decent
>> performance and stability. In this way I'm confident that we wouldn't lose
>> much.
>>
>> Here's a comparison of the pipelines once more:
>> * Nightly workflow: GitHub Actions [1] vs Azure CI [2]
>> * PR workflow: GitHub Actions [3] vs Azure CI [4]
>>
>> [1]
>> https://github.com/XComp/flink/actions/workflows/flink-ci-extended.yml
>> [2]
>> https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=1&_a=summary
>> [3] https://github.com/XComp/flink/actions/workflows/flink-ci-basic.yml
>> [4] https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=2
>>
>> Regarding the migration plan, I wonder if we should not disable the CIbot
>>> until we fully decide to migrate to Github Actions? In case the nightly
>>> runs don't really work well, it might be debatable whether we should
>>> maintain the CI in two places (i.e. PRs on Github Actions and cron builds
>>> on Azure).
>>
>>
>> The CIbot handles the PR CI. Disabling it would mean that users would
>> fully rely on the GitHub Actions workflow right away. I like the fact that
>> for PRs we actually 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-29 Thread Matthias Pohl
Thanks for your feedback Alex. I responded to your comments below:

> This is mentioned in the "Limitations of GitHub Actions in the past"
> section of the FLIP. Does this also apply to the Apache INFRA setup or can
> we expect contributors' runs executed there too?


Workflow runs on Flink forks (independent of PRs that would merge to Apache
Flink's core repo) will be executed with runners provided by GitHub with
their own limitations. Secrets are not set in these runs (similar to what
we have right now with PR runs).

If we allow the PR CI to run on Apache INFRA-hosted ephemeral runners we
might have the same freedom because of their ephemeral nature (the VMs are
discarded leaving).

We only have to start thinking about self-hosted customized runners if we
decide/need to have dedicated VMs for Flink's CI (similar to what we have
right now with Azure CI and Alibaba's VMs). This might happen if the
waiting times for acquiring a runner are too long. In that case, we might
give a certain group of people (e.g. committers) or certain types of events
(for PRs,  nightly builds, PR merges) the ability to use the self-hosted
runners.
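
The routing idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the role name, the event names and the function `select_runner_pool` are hypothetical, and the thread does not settle which groups or events would actually qualify.

```python
from enum import Enum


class RunnerPool(Enum):
    GITHUB_HOSTED = "github-hosted"
    SELF_HOSTED = "self-hosted"


# Hypothetical allow-lists; the thread only names these as examples.
PRIVILEGED_ROLES = {"committer"}
PRIVILEGED_EVENTS = {"nightly", "pr_merge"}


def select_runner_pool(actor_role: str, event: str) -> RunnerPool:
    """Pick a runner pool for a workflow run based on who or what triggered it."""
    if actor_role in PRIVILEGED_ROLES or event in PRIVILEGED_EVENTS:
        return RunnerPool.SELF_HOSTED
    # Everyone else falls back to the generic GitHub-provided runners.
    return RunnerPool.GITHUB_HOSTED
```

For example, `select_runner_pool("contributor", "pull_request")` falls back to the GitHub-provided runners, while a nightly build is routed to the dedicated machines regardless of who triggered it.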

> As you mentioned in the FLIP, there are some timeout-related test
> discrepancies between different setups. Similar discrepancies could
> manifest themselves between the Github runners and the Apache INFRA
> runners. It would be great if we could have a uniform setup, where if
> tests pass in the individual CI, they also pass in the main runner and vice
> versa.


I agree. So far, what we've seen is that the timeout instability is coming
from overly optimistic timeout configurations in some tests (they eventually
also fail in Azure CI, but the GitHub-provided runners seem to be more
sensitive in this regard). Fixing the tests when such flakiness is observed
should bring us to a stage where the test behavior matches across
different runners.

We had a similar issue in the Azure CI setup: Certain tests were more
stable on the Alibaba machines than on Azure VMs. That is why we introduced
a dedicated stage for Azure CI VMs as part of the nightly runs (see
FLINK-18370 [1]). We could do the same for GitHub Actions if necessary.

> Currently we have such memory limits-related issues in individual vs main
> Azure CI pipelines.


I'm not sure I understand what you mean by memory limit-related issues. The
GitHub-provided runners do not seem to run into memory-related issues. We
have to see whether this also applies to Apache INFRA-provided runners. My
hope is that they have even better hardware than what GitHub offers. But
GitHub-provided runners seem to be a good fallback to rely on (see the
workflows I shared in my previous response to Xintong's message).

[1] https://issues.apache.org/jira/browse/FLINK-18370

On Wed, Nov 29, 2023 at 3:17 PM Matthias Pohl 
wrote:

> Thanks for your comments, Xintong. See my answers below.
>
>
>> I think it would be helpful if we can at the end migrate the CI to an
>> ASF-managed Github Action, as long as it provides us a similar
>> computation capacity and stability.
>
>
> The current test runs in my Flink fork (using the GitHub-provided runners)
> suggest that even with using generic GitHub runners we get decent
> performance and stability. In this way I'm confident that we wouldn't lose
> much.
>
> Here's a comparison of the pipelines once more:
> * Nightly workflow: GitHub Actions [1] vs Azure CI [2]
> * PR workflow: GitHub Actions [3] vs Azure CI [4]
>
> [1] https://github.com/XComp/flink/actions/workflows/flink-ci-extended.yml
> [2]
> https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=1&_a=summary
> [3] https://github.com/XComp/flink/actions/workflows/flink-ci-basic.yml
> [4] https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=2
>
> Regarding the migration plan, I wonder if we should not disable the CIbot
>> until we fully decide to migrate to Github Actions? In case the nightly
>> runs don't really work well, it might be debatable whether we should
>> maintain the CI in two places (i.e. PRs on Github Actions and cron builds
>> on Azure).
>
>
> The CIbot handles the PR CI. Disabling it would mean that users would
> fully rely on the GitHub Actions workflow right away. I like the fact that
> for PRs we actually have both. That makes it more obvious if CI is not on
> par.
> For the nightly builds, I'm not too worried because they are not exposed
> to the contributors that much. That's more a question for the release
> managers who are monitoring the nightly runs how they want to handle it.
> But even there I see benefits of having both CIs running for some time to
> see how much they differ from each other in terms of stability
>
> - What exactly are the changes that would affect contributors during the
>> trial period? Is it only an additional CI report that you can potentially
>> just ignore? Or would there be some larger impacts, e.g. you cannot merge a
>> PR if the Github Action CI is not passed (I don't know, I just 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-29 Thread Matthias Pohl
Thanks for your comments, Xintong. See my answers below.


> I think it would be helpful if we can at the end migrate the CI to an
> ASF-managed Github Action, as long as it provides us a similar computation
> capacity and stability.


The current test runs in my Flink fork (using the GitHub-provided runners)
suggest that even with using generic GitHub runners we get decent
performance and stability. In this way I'm confident that we wouldn't lose
much.

Here's a comparison of the pipelines once more:
* Nightly workflow: GitHub Actions [1] vs Azure CI [2]
* PR workflow: GitHub Actions [3] vs Azure CI [4]

[1] https://github.com/XComp/flink/actions/workflows/flink-ci-extended.yml
[2]
https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=1&_a=summary
[3] https://github.com/XComp/flink/actions/workflows/flink-ci-basic.yml
[4] https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=2

> Regarding the migration plan, I wonder if we should not disable the CIbot
> until we fully decide to migrate to Github Actions? In case the nightly
> runs don't really work well, it might be debatable whether we should
> maintain the CI in two places (i.e. PRs on Github Actions and cron builds
> on Azure).


The CIbot handles the PR CI. Disabling it would mean that users would fully
rely on the GitHub Actions workflow right away. I like the fact that for
PRs we actually have both. That makes it more obvious if CI is not on par.
For the nightly builds, I'm not too worried because they are not exposed to
the contributors that much. That's more a question for the release managers
who are monitoring the nightly runs how they want to handle it. But even
there I see benefits of having both CIs running for some time to see how
much they differ from each other in terms of stability.
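
One way to quantify how much the two CIs differ, sketched under the assumption that each commit or PR yields a pair of pass/fail results (the function name and the input format are hypothetical; no such tooling is mentioned in this thread):

```python
from typing import Dict, Iterable, Tuple


def compare_ci_runs(runs: Iterable[Tuple[bool, bool]]) -> Dict[str, float]:
    """Summarize paired (azure_passed, gha_passed) results for the same commits."""
    pairs = list(runs)
    total = len(pairs)
    if total == 0:
        return {"total": 0, "azure_failure_rate": 0.0,
                "gha_failure_rate": 0.0, "mismatches": 0}
    azure_failures = sum(1 for azure, _ in pairs if not azure)
    gha_failures = sum(1 for _, gha in pairs if not gha)
    # A mismatch is a commit where the two systems disagree on the outcome.
    mismatches = sum(1 for azure, gha in pairs if azure != gha)
    return {
        "total": total,
        "azure_failure_rate": azure_failures / total,
        "gha_failure_rate": gha_failures / total,
        "mismatches": mismatches,
    }
```

A consistently higher failure rate on one side, or a growing number of mismatches, would point at tests or setup differences that need attention before relying on the new CI.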

> - What exactly are the changes that would affect contributors during the
> trial period? Is it only an additional CI report that you can potentially
> just ignore? Or would there be some larger impacts, e.g. you cannot merge a
> PR if the Github Action CI is not passed (I don't know, I just made this
> up)?


My plan would be to enable the PR CI workflow for PRs as well to have the
comparison. For contributors this would mean that they have an additional
CI point (essentially two CI runs for a PR). If that's not what we want, we
could disable it for PRs and only allow the basic CI run for pushes to
master.

On Wed, Nov 29, 2023 at 2:31 PM Alexander Fedulov <
alexander.fedu...@gmail.com> wrote:

> Thanks for driving this Matthias! +1 for joining the INFRA trial.
>
>
> > Apache Infra did some experimenting on self-hosted runners in
> > collaboration with Apache Airflow (see ashb/runner with
> > releases/pr-security-options branch) where they only allow certain groups
> > of users (e.g. committers) to run their workflows on self-hosted machines.
> > Any other group would have to rely on GitHub’s runners.
>
> This is mentioned in the "Limitations of GitHub Actions in the past"
> section of
> the FLIP. Does this also apply to the Apache INFRA setup or can we expect
> contributors' runs executed there too? As you mentioned in the FLIP, there
> are
> some timeout-related test discrepancies between different setups. Similar
> discrepancies could manifest themselves between the Github runners and the
> Apache INFRA runners. It would be great if we should have a uniform setup,
> where if tests pass in the individual CI, they also pass in the main runner
> and
> vice versa.  Currently we have such memory limits-related issues in
> individual
> vs main Azure CI pipelines.
>
> >2. Disable Flink’s CI bot for PRs if step #1 is considered successful
> >3. Join trial program for ephemeral GHA runners
>
> Due to potential new kinds of instabilities manifesting themselves in the
> new setup,
> can we keep both CIs running in parallel and keep relying on the existing
> one until
> we are confident in the tests stability on the new ephemeral GHA infra
> (skip 2.)?
>
> Best,
> Alex
>
> On Wed, 29 Nov 2023 at 13:42, Xintong Song  wrote:
>
> > Thanks for the efforts, Matthias.
> >
> >
> > I think it would be helpful if we can at the end migrate the CI to an
> > ASF-managed Github Action, as long as it provides us a similar
> computation
> > capacity and stability. Given that the proposal is only to start a trial
> > and investigate whether the migration is feasible, I don't see much
> concern
> > in this.
> >
> >
> > I have only one suggestion and one question.
> >
> > - Regarding the migration plan, I wonder if we should not disable the CI
> > bot until we fully decide to migrate to Github Actions? In case the
> nightly
> > runs don't really work well, it might be debatable whether we should
> > maintain the CI in two places (i.e. PRs on Github Actions and cron builds
> > on Azure).
> >
> > - What exactly are the changes that would affect contributors during the
> > trial period? Is it only an additional CI report that you can potentially
> > just 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-29 Thread Alexander Fedulov
Thanks for driving this Matthias! +1 for joining the INFRA trial.


> Apache Infra did some experimenting on self-hosted runners in collaboration
> with Apache Airflow (see ashb/runner with releases/pr-security-options
> branch) where they only allow certain groups of users (e.g. committers) to
> run their workflows on self-hosted machines. Any other group would have to
> rely on GitHub’s runners.

This is mentioned in the "Limitations of GitHub Actions in the past" section
of the FLIP. Does this also apply to the Apache INFRA setup or can we expect
contributors' runs executed there too? As you mentioned in the FLIP, there are
some timeout-related test discrepancies between different setups. Similar
discrepancies could manifest themselves between the Github runners and the
Apache INFRA runners. It would be great if we could have a uniform setup,
where if tests pass in the individual CI, they also pass in the main runner
and vice versa. Currently we have such memory limit-related issues in
individual vs main Azure CI pipelines.

>2. Disable Flink’s CI bot for PRs if step #1 is considered successful
>3. Join trial program for ephemeral GHA runners

Due to potential new kinds of instabilities manifesting themselves in the new
setup, can we keep both CIs running in parallel and keep relying on the
existing one until we are confident in the tests' stability on the new
ephemeral GHA infra (skip 2.)?

Best,
Alex

On Wed, 29 Nov 2023 at 13:42, Xintong Song  wrote:

> Thanks for the efforts, Matthias.
>
>
> I think it would be helpful if we can at the end migrate the CI to an
> ASF-managed Github Action, as long as it provides us a similar computation
> capacity and stability. Given that the proposal is only to start a trial
> and investigate whether the migration is feasible, I don't see much concern
> in this.
>
>
> I have only one suggestion and one question.
>
> - Regarding the migration plan, I wonder if we should not disable the CI
> bot until we fully decide to migrate to Github Actions? In case the nightly
> runs don't really work well, it might be debatable whether we should
> maintain the CI in two places (i.e. PRs on Github Actions and cron builds
> on Azure).
>
> - What exactly are the changes that would affect contributors during the
> trial period? Is it only an additional CI report that you can potentially
> just ignore? Or would there be some larger impacts, e.g. you cannot merge a
> PR if the Github Action CI is not passed (I don't know, I just made this
> up)?
>
>
> Best,
>
> Xintong
>
>
>
> On Wed, Nov 29, 2023 at 8:07 PM Yuxin Tan  wrote:
>
> > Ok, Thanks for the update and the explanations.
> >
> > Best,
> > Yuxin
> >
> >
> > On Wed, Nov 29, 2023 at 15:43, Matthias Pohl wrote:
> >
> > > >
> > > > According to the Flip, the new tests will support arm env.
> > > > I believe that's good news for arm users. I have a minor
> > > > question here. Will it be a blocker before migrating the new
> > > > tests? If not, when can we expect arm environment
> > > > support to be implemented? Thanks.
> > >
> > >
> > > Thanks for your feedback, everyone.
> > >
> > > About the ARM support. I want to underline that this FLIP is not about
> > > migrating to GitHub Actions but to start a trial run in the Apache
> Flink
> > > repository. That would allow us to come up with a proper decision
> whether
> > > GitHub Actions is what we want. I admit that the title is a bit
> > > "clickbaity". I updated the FLIP's title and its Motivation to make
> > things
> > > clear.
> > >
> > > The FLIP suggests starting a trial period until 1.19 is released to try
> > > things out. A proper decision on whether we want to migrate would be
> made
> > > at the end of the 1.19 release cycle.
> > >
> > > About the ARM support: This related content of the FLIP is entirely
> based
> > > on documentation from Apache INFRA's side. INFRA seems to offer this ARM
> > > support for their ephemeral runners. The ephemeral runners are in the
> > > testing stage, i.e. these runners are still experimental. Apache INFRA
> > asks
> > > Apache projects to join this test.
> > >
> > > Whether the ARM support is actually possible to achieve within Flink is
> > > something we have to figure out as part of the trial run. One
> conclusion
> > of
> > > the trial run could be that we still move ahead with GHA but don't use
> > arm
> > > machines due to some blocking issues.
> > >
> > > Matthias
> > >
> > >
> > >
> > > On Wed, Nov 29, 2023 at 4:46 AM Yuxin Tan 
> > wrote:
> > >
> > > > Hi, Matthias,
> > > >
> > > > Thanks for driving this.
> > > > +1 from my side.
> > > >
> > > > According to the Flip, the new tests will support arm env.
> > > > I believe that's good news for arm users. I have a minor
> > > > question here. Will it be a blocker before migrating the new
> > > > tests? If not, when can we expect arm environment
> > > > support to be implemented? Thanks.
> > > >
> > > > Best,
> > > > Yuxin
> > > >
> > > >
> > > > Márton Balassi  

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-29 Thread Xintong Song
Thanks for the efforts, Matthias.


I think it would be helpful if we could eventually migrate the CI to
ASF-managed GitHub Actions, as long as that provides us with similar compute
capacity and stability. Given that the proposal is only to start a trial
and investigate whether the migration is feasible, I don't see much concern
in this.


I have only one suggestion and one question.

- Regarding the migration plan, I wonder whether we should hold off on
disabling the CI bot until we have fully decided to migrate to GitHub Actions.
If the nightly runs don't work well, it is debatable whether we should
maintain the CI in two places (i.e. PRs on GitHub Actions and cron builds
on Azure).

- What exactly are the changes that would affect contributors during the
trial period? Is it only an additional CI report that you can potentially
just ignore? Or would there be some larger impact, e.g. you cannot merge a
PR if the GitHub Actions CI has not passed (I don't know, I just made this
up)?


Best,

Xintong



On Wed, Nov 29, 2023 at 8:07 PM Yuxin Tan  wrote:

> Ok, Thanks for the update and the explanations.
>
> Best,
> Yuxin
>
>
> On Wed, Nov 29, 2023 at 15:43, Matthias Pohl wrote:
>
> > >
> > > According to the Flip, the new tests will support arm env.
> > > I believe that's good news for arm users. I have a minor
> > > question here. Will it be a blocker before migrating the new
> > > tests? If not, when can we expect arm environment
> > > support to be implemented? Thanks.
> >
> >
> > Thanks for your feedback, everyone.
> >
> > About the ARM support. I want to underline that this FLIP is not about
> > migrating to GitHub Actions but to start a trial run in the Apache Flink
> > repository. That would allow us to come up with a proper decision whether
> > GitHub Actions is what we want. I admit that the title is a bit
> > "clickbaity". I updated the FLIP's title and its Motivation to make
> things
> > clear.
> >
> > The FLIP suggests starting a trial period until 1.19 is released to try
> > things out. A proper decision on whether we want to migrate would be made
> > at the end of the 1.19 release cycle.
> >
> > About the ARM support: This related content of the FLIP is entirely based
> > on documentation from Apache INFRA's side. INFRA seems to offer this ARM
> > support for their ephemeral runners. The ephemeral runners are in the
> > testing stage, i.e. these runners are still experimental. Apache INFRA
> asks
> > Apache projects to join this test.
> >
> > Whether the ARM support is actually possible to achieve within Flink is
> > something we have to figure out as part of the trial run. One conclusion
> of
> > the trial run could be that we still move ahead with GHA but don't use
> arm
> > machines due to some blocking issues.
> >
> > Matthias
> >
> >
> >
> > On Wed, Nov 29, 2023 at 4:46 AM Yuxin Tan 
> wrote:
> >
> > > Hi, Matthias,
> > >
> > > Thanks for driving this.
> > > +1 from my side.
> > >
> > > According to the Flip, the new tests will support arm env.
> > > I believe that's good news for arm users. I have a minor
> > > question here. Will it be a blocker before migrating the new
> > > tests? If not, when can we expect arm environment
> > > support to be implemented? Thanks.
> > >
> > > Best,
> > > Yuxin
> > >
> > >
> > > On Wed, Nov 29, 2023 at 03:09, Márton Balassi wrote:
> > >
> > > > Thanks, Matthias. Big +1 from me.
> > > >
> > > > On Tue, Nov 28, 2023 at 5:30 PM Matthias Pohl
> > > >  wrote:
> > > >
> > > > > Thanks for the pointer. I'm planning to join that meeting.
> > > > >
> > > > > On Tue, Nov 28, 2023 at 4:16 PM Etienne Chauchot <
> > echauc...@apache.org
> > > >
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > FYI there is the ASF infra roundtable soon. One of the subjects
> for
> > > > this
> > > > > > session is GitHub Actions. It could be worth passing by:
> > > > > >
> > > > > > December 6th, 2023 at 1700 UTC on the #Roundtable channel on Slack.
> > > > > >
> > > > > > For information about the roundtables, and about how to join,
> > > > > > see: https://infra.apache.org/roundtable.html
> > > > > > 
> > > > > >
> > > > > > Best
> > > > > >
> > > > > > Etienne
> > > > > >
> > > > > > On 24/11/2023 at 14:16, Maximilian Michels wrote:
> > > > > > > Thanks for reviving the efforts here Matthias! +1 for the
> > > transition
> > > > > > > to GitHub Actions.
> > > > > > >
> > > > > > > As for ASF Infra Jenkins, it works fine. Jenkins is extremely
> > > > > > > feature-rich. Not sure about the spare capacity though. I know
> > that
> > > > > > > for Apache Beam, Google donated a bunch of servers to get
> > > additional
> > > > > > > build capacity.
> > > > > > >
> > > > > > > -Max
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
> > > > > > >   wrote:
> > > > > > >> Btw. even though we've been focusing on GitHub Actions with
> this
> > > > FLIP,
> > > > > > I'm
> > > > > > >> curious whether 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-29 Thread Yuxin Tan
OK, thanks for the update and the explanations.

Best,
Yuxin


On Wed, Nov 29, 2023 at 15:43, Matthias Pohl wrote:

> >
> > According to the Flip, the new tests will support arm env.
> > I believe that's good news for arm users. I have a minor
> > question here. Will it be a blocker before migrating the new
> > tests? If not, when can we expect arm environment
> > support to be implemented? Thanks.
>
>
> Thanks for your feedback, everyone.
>
> About the ARM support. I want to underline that this FLIP is not about
> migrating to GitHub Actions but to start a trial run in the Apache Flink
> repository. That would allow us to come up with a proper decision whether
> GitHub Actions is what we want. I admit that the title is a bit
> "clickbaity". I updated the FLIP's title and its Motivation to make things
> clear.
>
> The FLIP suggests starting a trial period until 1.19 is released to try
> things out. A proper decision on whether we want to migrate would be made
> at the end of the 1.19 release cycle.
>
> About the ARM support: This related content of the FLIP is entirely based
> on documentation from Apache INFRA's side. INFRA seems to offer this ARM
> support for their ephemeral runners. The ephemeral runners are in the
> testing stage, i.e. these runners are still experimental. Apache INFRA asks
> Apache projects to join this test.
>
> Whether the ARM support is actually possible to achieve within Flink is
> something we have to figure out as part of the trial run. One conclusion of
> the trial run could be that we still move ahead with GHA but don't use arm
> machines due to some blocking issues.
>
> Matthias
>
>
>
> On Wed, Nov 29, 2023 at 4:46 AM Yuxin Tan  wrote:
>
> > Hi, Matthias,
> >
> > Thanks for driving this.
> > +1 from my side.
> >
> > According to the Flip, the new tests will support arm env.
> > I believe that's good news for arm users. I have a minor
> > question here. Will it be a blocker before migrating the new
> > tests? If not, when can we expect arm environment
> > support to be implemented? Thanks.
> >
> > Best,
> > Yuxin
> >
> >
> > On Wed, Nov 29, 2023 at 03:09, Márton Balassi wrote:
> >
> > > Thanks, Matthias. Big +1 from me.
> > >
> > > On Tue, Nov 28, 2023 at 5:30 PM Matthias Pohl
> > >  wrote:
> > >
> > > > Thanks for the pointer. I'm planning to join that meeting.
> > > >
> > > > On Tue, Nov 28, 2023 at 4:16 PM Etienne Chauchot <
> echauc...@apache.org
> > >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > FYI there is the ASF infra roundtable soon. One of the subjects for
> > > this
> > > > > session is GitHub Actions. It could be worth passing by:
> > > > >
> > > > > December 6th, 2023 at 1700 UTC on the #Roundtable channel on Slack.
> > > > >
> > > > > For information about the roundtables, and about how to join,
> > > > > see: https://infra.apache.org/roundtable.html
> > > > > 
> > > > >
> > > > > Best
> > > > >
> > > > > Etienne
> > > > >
> > > > > On 24/11/2023 at 14:16, Maximilian Michels wrote:
> > > > > > Thanks for reviving the efforts here Matthias! +1 for the
> > transition
> > > > > > to GitHub Actions.
> > > > > >
> > > > > > As for ASF Infra Jenkins, it works fine. Jenkins is extremely
> > > > > > feature-rich. Not sure about the spare capacity though. I know
> that
> > > > > > for Apache Beam, Google donated a bunch of servers to get
> > additional
> > > > > > build capacity.
> > > > > >
> > > > > > -Max
> > > > > >
> > > > > >
> > > > > > On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
> > > > > >   wrote:
> > > > > >> Btw. even though we've been focusing on GitHub Actions with this
> > > FLIP,
> > > > > I'm
> > > > > >> curious whether somebody has experience with Apache Infra's
> > Jenkins
> > > > > >> deployment. The discussion I found about Jenkins [1] is quite
> > > > out-dated
> > > > > >> (2014). I haven't worked with it myself but could imagine that
> > there
> > > > are
> > > > > >> some features provided through plugins which are missing in
> GitHub
> > > > > Actions.
> > > > > >>
> > > > > >> [1]
> > https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4
> > > > > >>
> > > > > >> On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl<
> > > matthias.p...@aiven.io>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> That's a valid point. I updated the FLIP accordingly:
> > > > > >>>
> > > > >  Currently, the secrets (e.g. for S3 access tokens) are maintained by
> > > > >  certain PMC members with access to the corresponding configuration in the
> > > > >  Azure CI project. This responsibility will be moved to Apache Infra. They
> > > > >  are in charge of handling secrets in the Apache organization. As a
> > > > >  consequence, updating secrets is becoming a bit more complicated. This can
> > > > >  be still considered an improvement from a legal standpoint because the
> > > > >  responsibility is 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-28 Thread Matthias Pohl
>
> According to the Flip, the new tests will support arm env.
> I believe that's good news for arm users. I have a minor
> question here. Will it be a blocker before migrating the new
> tests? If not, when can we expect arm environment
> support to be implemented? Thanks.


Thanks for your feedback, everyone.

About the ARM support: I first want to underline that this FLIP is not about
migrating to GitHub Actions but about starting a trial run in the Apache Flink
repository. That would allow us to come up with a proper decision on whether
GitHub Actions is what we want. I admit that the title is a bit
"clickbaity". I updated the FLIP's title and its Motivation to make things
clear.

The FLIP suggests starting a trial period until 1.19 is released to try
things out. A proper decision on whether we want to migrate would be made
at the end of the 1.19 release cycle.

About the ARM support: The related content of the FLIP is entirely based
on documentation from Apache INFRA's side. INFRA seems to offer ARM
support for their ephemeral runners. The ephemeral runners are in the
testing stage, i.e. these runners are still experimental. Apache INFRA asks
Apache projects to join this test.

Whether the ARM support is actually possible to achieve within Flink is
something we have to figure out as part of the trial run. One conclusion of
the trial run could be that we still move ahead with GHA but don't use ARM
machines due to some blocking issues.

Matthias



On Wed, Nov 29, 2023 at 4:46 AM Yuxin Tan  wrote:

> Hi, Matthias,
>
> Thanks for driving this.
> +1 from my side.
>
> According to the Flip, the new tests will support arm env.
> I believe that's good news for arm users. I have a minor
> question here. Will it be a blocker before migrating the new
> tests? If not, when can we expect arm environment
> support to be implemented? Thanks.
>
> Best,
> Yuxin
>
>
> On Wed, Nov 29, 2023 at 03:09, Márton Balassi wrote:
>
> > Thanks, Matthias. Big +1 from me.
> >
> > On Tue, Nov 28, 2023 at 5:30 PM Matthias Pohl
> >  wrote:
> >
> > > Thanks for the pointer. I'm planning to join that meeting.
> > >
> > > On Tue, Nov 28, 2023 at 4:16 PM Etienne Chauchot  >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > FYI there is the ASF infra roundtable soon. One of the subjects for
> > this
> > > > session is GitHub Actions. It could be worth passing by:
> > > >
> > > > December 6th, 2023 at 1700 UTC on the #Roundtable channel on Slack.
> > > >
> > > > For information about the roundtables, and about how to join,
> > > > see: https://infra.apache.org/roundtable.html
> > > > 
> > > >
> > > > Best
> > > >
> > > > Etienne
> > > >
> > > > On 24/11/2023 at 14:16, Maximilian Michels wrote:
> > > > > Thanks for reviving the efforts here Matthias! +1 for the
> transition
> > > > > to GitHub Actions.
> > > > >
> > > > > As for ASF Infra Jenkins, it works fine. Jenkins is extremely
> > > > > feature-rich. Not sure about the spare capacity though. I know that
> > > > > for Apache Beam, Google donated a bunch of servers to get
> additional
> > > > > build capacity.
> > > > >
> > > > > -Max
> > > > >
> > > > >
> > > > > On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
> > > > >   wrote:
> > > > >> Btw. even though we've been focusing on GitHub Actions with this
> > FLIP,
> > > > I'm
> > > > >> curious whether somebody has experience with Apache Infra's
> Jenkins
> > > > >> deployment. The discussion I found about Jenkins [1] is quite
> > > out-dated
> > > > >> (2014). I haven't worked with it myself but could imagine that
> there
> > > are
> > > > >> some features provided through plugins which are missing in GitHub
> > > > Actions.
> > > > >>
> > > > >> [1]
> https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4
> > > > >>
> > > > >> On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl<
> > matthias.p...@aiven.io>
> > > > >> wrote:
> > > > >>
> > > > >>> That's a valid point. I updated the FLIP accordingly:
> > > > >>>
> > > >  Currently, the secrets (e.g. for S3 access tokens) are maintained by
> > > >  certain PMC members with access to the corresponding configuration in the
> > > >  Azure CI project. This responsibility will be moved to Apache Infra. They
> > > >  are in charge of handling secrets in the Apache organization. As a
> > > >  consequence, updating secrets is becoming a bit more complicated. This can
> > > >  be still considered an improvement from a legal standpoint because the
> > > >  responsibility is transferred from an individual company (i.e. Ververica
> > > >  who's the maintainer of the Azure CI project) to the Apache Foundation.
> > > > >>>
> > > > >>> On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser<
> > > > martijnvis...@apache.org>
> > > > >>> wrote:
> > > > >>>
> > > >  Hi Matthias,
> > > > 
> > > >  Thanks for the write-up and for the efforts on this. I really
> 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-28 Thread Yuxin Tan
Hi, Matthias,

Thanks for driving this.
+1 from my side.

According to the FLIP, the new tests will support ARM environments.
I believe that's good news for ARM users. I have a minor
question here. Will it be a blocker before migrating the new
tests? If not, when can we expect ARM environment
support to be implemented? Thanks.

Best,
Yuxin


On Wed, Nov 29, 2023 at 03:09, Márton Balassi wrote:

> Thanks, Matthias. Big +1 from me.
>
> On Tue, Nov 28, 2023 at 5:30 PM Matthias Pohl
>  wrote:
>
> > Thanks for the pointer. I'm planning to join that meeting.
> >
> > On Tue, Nov 28, 2023 at 4:16 PM Etienne Chauchot 
> > wrote:
> >
> > > Hi all,
> > >
> > > FYI there is the ASF infra roundtable soon. One of the subjects for
> this
> > > session is GitHub Actions. It could be worth passing by:
> > >
> > > December 6th, 2023 at 1700 UTC on the #Roundtable channel on Slack.
> > >
> > > For information about the roundtables, and about how to join,
> > > see: https://infra.apache.org/roundtable.html
> > > 
> > >
> > > Best
> > >
> > > Etienne
> > >
> > > On 24/11/2023 at 14:16, Maximilian Michels wrote:
> > > > Thanks for reviving the efforts here Matthias! +1 for the transition
> > > > to GitHub Actions.
> > > >
> > > > As for ASF Infra Jenkins, it works fine. Jenkins is extremely
> > > > feature-rich. Not sure about the spare capacity though. I know that
> > > > for Apache Beam, Google donated a bunch of servers to get additional
> > > > build capacity.
> > > >
> > > > -Max
> > > >
> > > >
> > > > On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
> > > >   wrote:
> > > >> Btw. even though we've been focusing on GitHub Actions with this
> FLIP,
> > > I'm
> > > >> curious whether somebody has experience with Apache Infra's Jenkins
> > > >> deployment. The discussion I found about Jenkins [1] is quite
> > out-dated
> > > >> (2014). I haven't worked with it myself but could imagine that there
> > are
> > > >> some features provided through plugins which are missing in GitHub
> > > Actions.
> > > >>
> > > >> [1]https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4
> > > >>
> > > >> On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl<
> matthias.p...@aiven.io>
> > > >> wrote:
> > > >>
> > > >>> That's a valid point. I updated the FLIP accordingly:
> > > >>>
> > >  Currently, the secrets (e.g. for S3 access tokens) are maintained by
> > >  certain PMC members with access to the corresponding configuration in the
> > >  Azure CI project. This responsibility will be moved to Apache Infra. They
> > >  are in charge of handling secrets in the Apache organization. As a
> > >  consequence, updating secrets is becoming a bit more complicated. This can
> > >  be still considered an improvement from a legal standpoint because the
> > >  responsibility is transferred from an individual company (i.e. Ververica
> > >  who's the maintainer of the Azure CI project) to the Apache Foundation.
> > > >>>
> > > >>> On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser<
> > > martijnvis...@apache.org>
> > > >>> wrote:
> > > >>>
> > >  Hi Matthias,
> > > 
> > >  Thanks for the write-up and for the efforts on this. I really hope
> > >  that we can move away from Azure towards GHA for a better
> > integration
> > >  as well (directly seeing if a PR can be merged due to CI passing
> for
> > >  example).
> > > 
> > >  The one thing I'm missing in the FLIP is how we would setup the
> > >  secrets for the nightly runs (for the S3 tests, potential tests
> with
> > >  external services etc). My guess is we need to provide the secret
> to
> > >  ASF Infra and then we would be able to refer to them in a
> pipeline?
> > > 
> > >  Best regards,
> > > 
> > >  Martijn
> > > 
> > >  On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
> > >    wrote:
> > > > I realized that I mixed up FLIP IDs. FLIP-395 is already reserved
> > > [1]. I
> > > > switched to FLIP-396 [2] for the sake of consistency. 8)
> > > >
> > > > [1]
> > https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
> > > > [2]
> > > >
> > > 
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
> > > > On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl<
> > matthias.p...@aiven.io
> > > >
> > > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> The Flink community discussed migrating from Azure CI to GitHub
> > >  Actions
> > > >> quite some time ago [1]. The efforts around that stalled due to
> > >  limitations
> > > >> around self-hosted runner support from Apache Infra’s side.
> There
> > >  were some
> > > >> recent developments on that topic. Apache Infra is experimenting
> > > with
> > > >> ephemeral runners now which might enable us to move ahead with
> > > GitHub
> > > >> Actions.
> > > 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-28 Thread Márton Balassi
Thanks, Matthias. Big +1 from me.

On Tue, Nov 28, 2023 at 5:30 PM Matthias Pohl
 wrote:

> Thanks for the pointer. I'm planning to join that meeting.
>
> On Tue, Nov 28, 2023 at 4:16 PM Etienne Chauchot 
> wrote:
>
> > Hi all,
> >
> > FYI there is the ASF infra roundtable soon. One of the subjects for this
> > session is GitHub Actions. It could be worth passing by:
> >
> > December 6th, 2023 at 1700 UTC on the #Roundtable channel on Slack.
> >
> > For information about the roundtables, and about how to join,
> > see: https://infra.apache.org/roundtable.html
> > 
> >
> > Best
> >
> > Etienne
> >
> > On 24/11/2023 at 14:16, Maximilian Michels wrote:
> > > Thanks for reviving the efforts here Matthias! +1 for the transition
> > > to GitHub Actions.
> > >
> > > As for ASF Infra Jenkins, it works fine. Jenkins is extremely
> > > feature-rich. Not sure about the spare capacity though. I know that
> > > for Apache Beam, Google donated a bunch of servers to get additional
> > > build capacity.
> > >
> > > -Max
> > >
> > >
> > > On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
> > >   wrote:
> > >> Btw. even though we've been focusing on GitHub Actions with this FLIP,
> > I'm
> > >> curious whether somebody has experience with Apache Infra's Jenkins
> > >> deployment. The discussion I found about Jenkins [1] is quite
> out-dated
> > >> (2014). I haven't worked with it myself but could imagine that there
> are
> > >> some features provided through plugins which are missing in GitHub
> > Actions.
> > >>
> > >> [1]https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4
> > >>
> > >> On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl
> > >> wrote:
> > >>
> > >>> That's a valid point. I updated the FLIP accordingly:
> > >>>
> >  Currently, the secrets (e.g. for S3 access tokens) are maintained by
> >  certain PMC members with access to the corresponding configuration in the
> >  Azure CI project. This responsibility will be moved to Apache Infra. They
> >  are in charge of handling secrets in the Apache organization. As a
> >  consequence, updating secrets is becoming a bit more complicated. This can
> >  be still considered an improvement from a legal standpoint because the
> >  responsibility is transferred from an individual company (i.e. Ververica
> >  who's the maintainer of the Azure CI project) to the Apache Foundation.
> > >>>
> > >>> On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser<
> > martijnvis...@apache.org>
> > >>> wrote:
> > >>>
> >  Hi Matthias,
> > 
> >  Thanks for the write-up and for the efforts on this. I really hope
> >  that we can move away from Azure towards GHA for a better
> integration
> >  as well (directly seeing if a PR can be merged due to CI passing for
> >  example).
> > 
> >  The one thing I'm missing in the FLIP is how we would setup the
> >  secrets for the nightly runs (for the S3 tests, potential tests with
> >  external services etc). My guess is we need to provide the secret to
> >  ASF Infra and then we would be able to refer to them in a pipeline?
> > 
> >  Best regards,
> > 
> >  Martijn
> > 
> >  On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
> >    wrote:
> > > I realized that I mixed up FLIP IDs. FLIP-395 is already reserved
> > [1]. I
> > > switched to FLIP-396 [2] for the sake of consistency. 8)
> > >
> > > [1]
> https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
> > > [2]
> > >
> > 
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
> > > On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl<
> matthias.p...@aiven.io
> > >
> > > wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> The Flink community discussed migrating from Azure CI to GitHub
> >  Actions
> > >> quite some time ago [1]. The efforts around that stalled due to
> >  limitations
> > >> around self-hosted runner support from Apache Infra’s side. There
> >  were some
> > >> recent developments on that topic. Apache Infra is experimenting
> > with
> > >> ephemeral runners now which might enable us to move ahead with
> > GitHub
> > >> Actions.
> > >>
> > >> The goal is to join the trial phase for ephemeral runners and
> >  experiment
> > >> with our CI workflows in terms of stability and performance. At
> the
> >  end we
> > >> can decide whether we want to abandon Azure CI and move to GitHub
> >  Actions
> > >> or stick to the former one.
> > >>
> > >> Nico Weidner and Chesnay laid the groundwork on this topic in the
> >  past. I
> > >> picked up the work they did and continued experimenting with it in
> > my
> >  own
> > >> fork XComp/flink [2] the past few weeks. The workflows are in a
> > state
> >  where
> > >> I think that we start moving the 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-28 Thread Matthias Pohl
Thanks for the pointer. I'm planning to join that meeting.

On Tue, Nov 28, 2023 at 4:16 PM Etienne Chauchot 
wrote:

> Hi all,
>
> FYI there is the ASF infra roundtable soon. One of the subjects for this
> session is GitHub Actions. It could be worth passing by:
>
> December 6th, 2023 at 1700 UTC on the #Roundtable channel on Slack.
>
> For information about the roundtables, and about how to join,
> see: https://infra.apache.org/roundtable.html
> 
>
> Best
>
> Etienne
>
> On 24/11/2023 at 14:16, Maximilian Michels wrote:
> > Thanks for reviving the efforts here Matthias! +1 for the transition
> > to GitHub Actions.
> >
> > As for ASF Infra Jenkins, it works fine. Jenkins is extremely
> > feature-rich. Not sure about the spare capacity though. I know that
> > for Apache Beam, Google donated a bunch of servers to get additional
> > build capacity.
> >
> > -Max
> >
> >
> > On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
> >   wrote:
> >> Btw. even though we've been focusing on GitHub Actions with this FLIP,
> I'm
> >> curious whether somebody has experience with Apache Infra's Jenkins
> >> deployment. The discussion I found about Jenkins [1] is quite out-dated
> >> (2014). I haven't worked with it myself but could imagine that there are
> >> some features provided through plugins which are missing in GitHub
> Actions.
> >>
> >> [1]https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4
> >>
> >> On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl
> >> wrote:
> >>
> >>> That's a valid point. I updated the FLIP accordingly:
> >>>
>  Currently, the secrets (e.g. for S3 access tokens) are maintained by
>  certain PMC members with access to the corresponding configuration in the
>  Azure CI project. This responsibility will be moved to Apache Infra. They
>  are in charge of handling secrets in the Apache organization. As a
>  consequence, updating secrets is becoming a bit more complicated. This can
>  be still considered an improvement from a legal standpoint because the
>  responsibility is transferred from an individual company (i.e. Ververica
>  who's the maintainer of the Azure CI project) to the Apache Foundation.
> >>>
> >>> On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser<
> martijnvis...@apache.org>
> >>> wrote:
> >>>
>  Hi Matthias,
> 
>  Thanks for the write-up and for the efforts on this. I really hope
>  that we can move away from Azure towards GHA for a better integration
>  as well (directly seeing if a PR can be merged due to CI passing for
>  example).
> 
>  The one thing I'm missing in the FLIP is how we would setup the
>  secrets for the nightly runs (for the S3 tests, potential tests with
>  external services etc). My guess is we need to provide the secret to
>  ASF Infra and then we would be able to refer to them in a pipeline?
> 
>  Best regards,
> 
>  Martijn
> 
>  On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
>    wrote:
> > I realized that I mixed up FLIP IDs. FLIP-395 is already reserved
> [1]. I
> > switched to FLIP-396 [2] for the sake of consistency. 8)
> >
> > [1]https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
> > [2]
> >
> 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
> > On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl >
> > wrote:
> >
> >> Hi everyone,
> >>
> >> The Flink community discussed migrating from Azure CI to GitHub Actions
> >> quite some time ago [1]. The efforts around that stalled due to limitations
> >> around self-hosted runner support from Apache Infra’s side. There were some
> >> recent developments on that topic. Apache Infra is experimenting with
> >> ephemeral runners now which might enable us to move ahead with GitHub
> >> Actions.
> >>
> >> The goal is to join the trial phase for ephemeral runners and experiment
> >> with our CI workflows in terms of stability and performance. At the end we
> >> can decide whether we want to abandon Azure CI and move to GitHub Actions
> >> or stick to the former one.
> >>
> >> Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
> >> picked up the work they did and continued experimenting with it in my own
> >> fork XComp/flink [2] the past few weeks. The workflows are in a state where
> >> I think that we start moving the relevant code into Flink’s repository.
> >> Example runs for the basic workflow [3] and the extended (nightly) workflow
> >> [4] are provided.
> >>
> >> This will bring a few more changes to the Flink contributors. That is why
> >> I wanted to bring this discussion to the mailing list first. I did a write
> >> up on (hopefully) all related 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-28 Thread Etienne Chauchot

Hi all,

FYI, the ASF Infra roundtable is coming up soon. One of the subjects for this
session is GitHub Actions. It could be worth dropping by:

December 6th, 2023 at 17:00 UTC on the #Roundtable channel on Slack.

For information about the roundtables, and about how to join,
see: https://infra.apache.org/roundtable.html



Best

Etienne

On 24/11/2023 at 14:16, Maximilian Michels wrote:

Thanks for reviving the efforts here Matthias! +1 for the transition
to GitHub Actions.

As for ASF Infra Jenkins, it works fine. Jenkins is extremely
feature-rich. Not sure about the spare capacity though. I know that
for Apache Beam, Google donated a bunch of servers to get additional
build capacity.

-Max


On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
  wrote:

Btw. even though we've been focusing on GitHub Actions with this FLIP, I'm
curious whether somebody has experience with Apache Infra's Jenkins
deployment. The discussion I found about Jenkins [1] is quite out-dated
(2014). I haven't worked with it myself but could imagine that there are
some features provided through plugins which are missing in GitHub Actions.

[1] https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4

On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl
wrote:


That's a valid point. I updated the FLIP accordingly:


Currently, the secrets (e.g. for S3 access tokens) are maintained by
certain PMC members with access to the corresponding configuration in the
Azure CI project. This responsibility will be moved to Apache Infra. They
are in charge of handling secrets in the Apache organization. As a
consequence, updating secrets is becoming a bit more complicated. This can
be still considered an improvement from a legal standpoint because the
responsibility is transferred from an individual company (i.e. Ververica
who's the maintainer of the Azure CI project) to the Apache Foundation.


On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser
wrote:


Hi Matthias,

Thanks for the write-up and for the efforts on this. I really hope
that we can move away from Azure towards GHA for a better integration
as well (directly seeing if a PR can be merged due to CI passing for
example).

The one thing I'm missing in the FLIP is how we would setup the
secrets for the nightly runs (for the S3 tests, potential tests with
external services etc). My guess is we need to provide the secret to
ASF Infra and then we would be able to refer to them in a pipeline?

Best regards,

Martijn

On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
  wrote:

I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
switched to FLIP-396 [2] for the sake of consistency. 8)

[1] https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions

On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl
wrote:


Hi everyone,

The Flink community discussed migrating from Azure CI to GitHub Actions
quite some time ago [1]. The efforts around that stalled due to limitations
around self-hosted runner support from Apache Infra’s side. There were some
recent developments on that topic. Apache Infra is experimenting with
ephemeral runners now which might enable us to move ahead with GitHub
Actions.

The goal is to join the trial phase for ephemeral runners and experiment
with our CI workflows in terms of stability and performance. At the end we
can decide whether we want to abandon Azure CI and move to GitHub Actions
or stick to the former one.

Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
picked up the work they did and continued experimenting with it in my own
fork XComp/flink [2] the past few weeks. The workflows are in a state where
I think that we start moving the relevant code into Flink’s repository.
Example runs for the basic workflow [3] and the extended (nightly) workflow
[4] are provided.

This will bring a few more changes to the Flink contributors. That is why
I wanted to bring this discussion to the mailing list first. I did a write
up on (hopefully) all related topics in FLIP-395 [5].

I’m looking forward to your feedback.

Matthias

[1] https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k
[2] https://github.com/XComp/flink/actions
[3] https://github.com/XComp/flink/actions/runs/6926309782
[4] https://github.com/XComp/flink/actions/runs/6927443941
[5] https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions


--

*Matthias Pohl*
Opensource Software Engineer, *Aiven*
matthias.p...@aiven.io  |  +49 170 9869525
aiven.io  |  https://twitter.com/aiven_io

*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-24 Thread Maximilian Michels
Thanks for reviving the efforts here Matthias! +1 for the transition
to GitHub Actions.

As for ASF Infra Jenkins, it works fine. Jenkins is extremely
feature-rich. Not sure about the spare capacity though. I know that
for Apache Beam, Google donated a bunch of servers to get additional
build capacity.

-Max


On Thu, Nov 23, 2023 at 10:30 AM Matthias Pohl
 wrote:
>
> Btw. even though we've been focusing on GitHub Actions with this FLIP, I'm
> curious whether somebody has experience with Apache Infra's Jenkins
> deployment. The discussion I found about Jenkins [1] is quite out-dated
> (2014). I haven't worked with it myself but could imagine that there are
> some features provided through plugins which are missing in GitHub Actions.
>
> [1] https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4
>
> On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl 
> wrote:
>
> > That's a valid point. I updated the FLIP accordingly:
> >
> >> Currently, the secrets (e.g. for S3 access tokens) are maintained by
> >> certain PMC members with access to the corresponding configuration in the
> >> Azure CI project. This responsibility will be moved to Apache Infra. They
> >> are in charge of handling secrets in the Apache organization. As a
> >> consequence, updating secrets is becoming a bit more complicated. This can
> >> be still considered an improvement from a legal standpoint because the
> >> responsibility is transferred from an individual company (i.e. Ververica
> >> who's the maintainer of the Azure CI project) to the Apache Foundation.
> >
> >
> > On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser 
> > wrote:
> >
> >> Hi Matthias,
> >>
> >> Thanks for the write-up and for the efforts on this. I really hope
> >> that we can move away from Azure towards GHA for a better integration
> >> as well (directly seeing if a PR can be merged due to CI passing for
> >> example).
> >>
> >> The one thing I'm missing in the FLIP is how we would setup the
> >> secrets for the nightly runs (for the S3 tests, potential tests with
> >> external services etc). My guess is we need to provide the secret to
> >> ASF Infra and then we would be able to refer to them in a pipeline?
> >>
> >> Best regards,
> >>
> >> Martijn
> >>
> >> On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
> >>  wrote:
> >> >
> >> > I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
> >> > switched to FLIP-396 [2] for the sake of consistency. 8)
> >> >
> >> > [1] https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
> >> > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
> >> >
> >> > On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl 
> >> > wrote:
> >> >
> >> > > Hi everyone,
> >> > >
> >> > > The Flink community discussed migrating from Azure CI to GitHub Actions
> >> > > quite some time ago [1]. The efforts around that stalled due to limitations
> >> > > around self-hosted runner support from Apache Infra’s side. There were some
> >> > > recent developments on that topic. Apache Infra is experimenting with
> >> > > ephemeral runners now which might enable us to move ahead with GitHub
> >> > > Actions.
> >> > >
> >> > > The goal is to join the trial phase for ephemeral runners and experiment
> >> > > with our CI workflows in terms of stability and performance. At the end we
> >> > > can decide whether we want to abandon Azure CI and move to GitHub Actions
> >> > > or stick to the former one.
> >> > >
> >> > > Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
> >> > > picked up the work they did and continued experimenting with it in my own
> >> > > fork XComp/flink [2] the past few weeks. The workflows are in a state where
> >> > > I think that we start moving the relevant code into Flink’s repository.
> >> > > Example runs for the basic workflow [3] and the extended (nightly) workflow
> >> > > [4] are provided.
> >> > >
> >> > > This will bring a few more changes to the Flink contributors. That is why
> >> > > I wanted to bring this discussion to the mailing list first. I did a write
> >> > > up on (hopefully) all related topics in FLIP-395 [5].
> >> > >
> >> > > I’m looking forward to your feedback.
> >> > >
> >> > > Matthias
> >> > >
> >> > > [1] https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k
> >> > >
> >> > > [2] https://github.com/XComp/flink/actions
> >> > >
> >> > > [3] https://github.com/XComp/flink/actions/runs/6926309782
> >> > >
> >> > > [4] https://github.com/XComp/flink/actions/runs/6927443941
> >> > >
> >> > > [5] https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions
> >> > >
> >> > >
> >> > > --
> >> > >
> >> > > [image: Aiven] 
> >> > >
> >> > > *Matthias Pohl*
> >> > > Opensource Software Engineer, *Aiven*
> >> > > matthias.p...@aiven.io|  +49 170 9869525
> >> > > aiven.io 

Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-23 Thread Matthias Pohl
Btw. even though we've been focusing on GitHub Actions with this FLIP, I'm
curious whether somebody has experience with Apache Infra's Jenkins
deployment. The discussion I found about Jenkins [1] is quite out-dated
(2014). I haven't worked with it myself but could imagine that there are
some features provided through plugins which are missing in GitHub Actions.

[1] https://lists.apache.org/thread/vs81xdhn3q777r7x9k7wd4dyl9kvoqn4

On Tue, Nov 21, 2023 at 4:19 PM Matthias Pohl 
wrote:

> That's a valid point. I updated the FLIP accordingly:
>
>> Currently, the secrets (e.g. for S3 access tokens) are maintained by
>> certain PMC members with access to the corresponding configuration in the
>> Azure CI project. This responsibility will be moved to Apache Infra. They
>> are in charge of handling secrets in the Apache organization. As a
>> consequence, updating secrets is becoming a bit more complicated. This can
>> be still considered an improvement from a legal standpoint because the
>> responsibility is transferred from an individual company (i.e. Ververica
>> who's the maintainer of the Azure CI project) to the Apache Foundation.
>
>
> On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser 
> wrote:
>
>> Hi Matthias,
>>
>> Thanks for the write-up and for the efforts on this. I really hope
>> that we can move away from Azure towards GHA for a better integration
>> as well (directly seeing if a PR can be merged due to CI passing for
>> example).
>>
>> The one thing I'm missing in the FLIP is how we would setup the
>> secrets for the nightly runs (for the S3 tests, potential tests with
>> external services etc). My guess is we need to provide the secret to
>> ASF Infra and then we would be able to refer to them in a pipeline?
>>
>> Best regards,
>>
>> Martijn
>>
>> On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
>>  wrote:
>> >
>> > I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
>> > switched to FLIP-396 [2] for the sake of consistency. 8)
>> >
>> > [1] https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
>> > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
>> >
>> > On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl 
>> > wrote:
>> >
>> > > Hi everyone,
>> > >
>> > > The Flink community discussed migrating from Azure CI to GitHub Actions
>> > > quite some time ago [1]. The efforts around that stalled due to limitations
>> > > around self-hosted runner support from Apache Infra’s side. There were some
>> > > recent developments on that topic. Apache Infra is experimenting with
>> > > ephemeral runners now which might enable us to move ahead with GitHub
>> > > Actions.
>> > >
>> > > The goal is to join the trial phase for ephemeral runners and experiment
>> > > with our CI workflows in terms of stability and performance. At the end we
>> > > can decide whether we want to abandon Azure CI and move to GitHub Actions
>> > > or stick to the former one.
>> > >
>> > > Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
>> > > picked up the work they did and continued experimenting with it in my own
>> > > fork XComp/flink [2] the past few weeks. The workflows are in a state where
>> > > I think that we start moving the relevant code into Flink’s repository.
>> > > Example runs for the basic workflow [3] and the extended (nightly) workflow
>> > > [4] are provided.
>> > >
>> > > This will bring a few more changes to the Flink contributors. That is why
>> > > I wanted to bring this discussion to the mailing list first. I did a write
>> > > up on (hopefully) all related topics in FLIP-395 [5].
>> > >
>> > > I’m looking forward to your feedback.
>> > >
>> > > Matthias
>> > >
>> > > [1] https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k
>> > >
>> > > [2] https://github.com/XComp/flink/actions
>> > >
>> > > [3] https://github.com/XComp/flink/actions/runs/6926309782
>> > >
>> > > [4] https://github.com/XComp/flink/actions/runs/6927443941
>> > >
>> > > [5] https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions
>> > >
>> > >
>> > > --
>> > >
>> > > [image: Aiven] 
>> > >
>> > > *Matthias Pohl*
>> > > Opensource Software Engineer, *Aiven*
>> > > matthias.p...@aiven.io|  +49 170 9869525
>> > > aiven.io  |  https://twitter.com/aiven_io
>> > > *Aiven Deutschland GmbH*
>> > > Alexanderufer 3-7, 10117 Berlin
>> > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
>> > > Amtsgericht Charlottenburg, HRB 209739 B
>> > >
>>
>


Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-21 Thread Matthias Pohl
That's a valid point. I updated the FLIP accordingly:

> Currently, the secrets (e.g. for S3 access tokens) are maintained by
> certain PMC members with access to the corresponding configuration in the
> Azure CI project. This responsibility will be moved to Apache Infra. They
> are in charge of handling secrets in the Apache organization. As a
> consequence, updating secrets is becoming a bit more complicated. This can
> be still considered an improvement from a legal standpoint because the
> responsibility is transferred from an individual company (i.e. Ververica
> who's the maintainer of the Azure CI project) to the Apache Foundation.


On Tue, Nov 21, 2023 at 3:37 PM Martijn Visser 
wrote:

> Hi Matthias,
>
> Thanks for the write-up and for the efforts on this. I really hope
> that we can move away from Azure towards GHA for a better integration
> as well (directly seeing if a PR can be merged due to CI passing for
> example).
>
> The one thing I'm missing in the FLIP is how we would setup the
> secrets for the nightly runs (for the S3 tests, potential tests with
> external services etc). My guess is we need to provide the secret to
> ASF Infra and then we would be able to refer to them in a pipeline?
>
> Best regards,
>
> Martijn
>
> On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
>  wrote:
> >
> > I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
> > switched to FLIP-396 [2] for the sake of consistency. 8)
> >
> > [1] https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
> > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
> >
> > On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl 
> > wrote:
> >
> > > Hi everyone,
> > >
> > > > The Flink community discussed migrating from Azure CI to GitHub Actions
> > > > quite some time ago [1]. The efforts around that stalled due to limitations
> > > > around self-hosted runner support from Apache Infra’s side. There were some
> > > > recent developments on that topic. Apache Infra is experimenting with
> > > > ephemeral runners now which might enable us to move ahead with GitHub
> > > > Actions.
> > > >
> > > > The goal is to join the trial phase for ephemeral runners and experiment
> > > > with our CI workflows in terms of stability and performance. At the end we
> > > > can decide whether we want to abandon Azure CI and move to GitHub Actions
> > > > or stick to the former one.
> > > >
> > > > Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
> > > > picked up the work they did and continued experimenting with it in my own
> > > > fork XComp/flink [2] the past few weeks. The workflows are in a state where
> > > > I think that we start moving the relevant code into Flink’s repository.
> > > > Example runs for the basic workflow [3] and the extended (nightly) workflow
> > > > [4] are provided.
> > > >
> > > > This will bring a few more changes to the Flink contributors. That is why
> > > > I wanted to bring this discussion to the mailing list first. I did a write
> > > > up on (hopefully) all related topics in FLIP-395 [5].
> > >
> > > I’m looking forward to your feedback.
> > >
> > > Matthias
> > >
> > > [1] https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k
> > >
> > > [2] https://github.com/XComp/flink/actions
> > >
> > > [3] https://github.com/XComp/flink/actions/runs/6926309782
> > >
> > > [4] https://github.com/XComp/flink/actions/runs/6927443941
> > >
> > > [5]
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions
> > >
> > >
> > > --
> > >
> > > [image: Aiven] 
> > >
> > > *Matthias Pohl*
> > > Opensource Software Engineer, *Aiven*
> > > matthias.p...@aiven.io|  +49 170 9869525
> > > > aiven.io  |  https://twitter.com/aiven_io
> > > *Aiven Deutschland GmbH*
> > > Alexanderufer 3-7, 10117 Berlin
> > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > Amtsgericht Charlottenburg, HRB 209739 B
> > >
>


Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-21 Thread Martijn Visser
Hi Matthias,

Thanks for the write-up and for the efforts on this. I really hope
that we can move away from Azure towards GHA for a better integration
as well (directly seeing if a PR can be merged due to CI passing for
example).

The one thing I'm missing in the FLIP is how we would setup the
secrets for the nightly runs (for the S3 tests, potential tests with
external services etc). My guess is we need to provide the secret to
ASF Infra and then we would be able to refer to them in a pipeline?

Best regards,

Martijn

On Tue, Nov 21, 2023 at 3:05 PM Matthias Pohl
 wrote:
>
> I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
> switched to FLIP-396 [2] for the sake of consistency. 8)
>
> [1] https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions
>
> On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl 
> wrote:
>
> > Hi everyone,
> >
> > The Flink community discussed migrating from Azure CI to GitHub Actions
> > quite some time ago [1]. The efforts around that stalled due to limitations
> > around self-hosted runner support from Apache Infra’s side. There were some
> > recent developments on that topic. Apache Infra is experimenting with
> > ephemeral runners now which might enable us to move ahead with GitHub
> > Actions.
> >
> > The goal is to join the trial phase for ephemeral runners and experiment
> > with our CI workflows in terms of stability and performance. At the end we
> > can decide whether we want to abandon Azure CI and move to GitHub Actions
> > or stick to the former one.
> >
> > Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
> > picked up the work they did and continued experimenting with it in my own
> > fork XComp/flink [2] the past few weeks. The workflows are in a state where
> > I think that we start moving the relevant code into Flink’s repository.
> > Example runs for the basic workflow [3] and the extended (nightly) workflow
> > [4] are provided.
> >
> > This will bring a few more changes to the Flink contributors. That is why
> > I wanted to bring this discussion to the mailing list first. I did a write
> > up on (hopefully) all related topics in FLIP-395 [5].
> >
> > I’m looking forward to your feedback.
> >
> > Matthias
> >
> > [1] https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k
> >
> > [2] https://github.com/XComp/flink/actions
> >
> > [3] https://github.com/XComp/flink/actions/runs/6926309782
> >
> > [4] https://github.com/XComp/flink/actions/runs/6927443941
> >
> > [5]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions
> >
> >
> > --
> >
> > [image: Aiven] 
> >
> > *Matthias Pohl*
> > Opensource Software Engineer, *Aiven*
> > matthias.p...@aiven.io|  +49 170 9869525
> > aiven.io
> > *Aiven Deutschland GmbH*
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >


Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-21 Thread Matthias Pohl
I realized that I mixed up FLIP IDs. FLIP-395 is already reserved [1]. I
switched to FLIP-396 [2] for the sake of consistency. 8)

[1] https://lists.apache.org/thread/wjd3nbvg6nt93lb0sd52f0lzls6559tv
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Migration+to+GitHub+Actions

On Tue, Nov 21, 2023 at 2:58 PM Matthias Pohl 
wrote:

> Hi everyone,
>
> The Flink community discussed migrating from Azure CI to GitHub Actions
> quite some time ago [1]. The efforts around that stalled due to limitations
> around self-hosted runner support from Apache Infra’s side. There were some
> recent developments on that topic. Apache Infra is experimenting with
> ephemeral runners now which might enable us to move ahead with GitHub
> Actions.
>
> The goal is to join the trial phase for ephemeral runners and experiment
> with our CI workflows in terms of stability and performance. At the end we
> can decide whether we want to abandon Azure CI and move to GitHub Actions
> or stick to the former one.
>
> Nico Weidner and Chesnay laid the groundwork on this topic in the past. I
> picked up the work they did and continued experimenting with it in my own
> fork XComp/flink [2] the past few weeks. The workflows are in a state where
> I think that we start moving the relevant code into Flink’s repository.
> Example runs for the basic workflow [3] and the extended (nightly) workflow
> [4] are provided.
>
> This will bring a few more changes to the Flink contributors. That is why
> I wanted to bring this discussion to the mailing list first. I did a write
> up on (hopefully) all related topics in FLIP-395 [5].
>
> I’m looking forward to your feedback.
>
> Matthias
>
> [1] https://lists.apache.org/thread/vcyx2nx0mhklqwm827vgykv8pc54gg3k
>
> [2] https://github.com/XComp/flink/actions
>
> [3] https://github.com/XComp/flink/actions/runs/6926309782
>
> [4] https://github.com/XComp/flink/actions/runs/6927443941
>
> [5]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-395%3A+Migration+to+GitHub+Actions
>
>
> --
>
> [image: Aiven] 
>
> *Matthias Pohl*
> Opensource Software Engineer, *Aiven*
> matthias.p...@aiven.io|  +49 170 9869525
> aiven.io
> *Aiven Deutschland GmbH*
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> Amtsgericht Charlottenburg, HRB 209739 B
>