Thanks for kicking this conversation off. I'm +1 on migrating, but only
once we've found specific replacements for easy observability (which
workflows have been failing lately, and how often) and trigger phrases (for
retries and workflows that aren't automatically kicked off but should be
run for extra validation, e.g. postcommits). Until we have viable
replacements, I don't think we should make the move. Publishing nightly
snapshots will eventually also be required to fully migrate, but that
probably doesn't need to block us from making progress here.

With those caveats, the reason that I'm +1 on moving is that our Jenkins
reliability has been rough. Since I joined the project in January, I can
think of 3 different incidents that significantly harmed our ability to do
work.

1. Jenkins triggers cause multi-day outage
<https://lists.apache.org/thread/ro4yjbd2k9vjzsd0jhl46o20sm729jl8> - this
led to a multi-day code freeze, and we lost our trigger functionality for
days afterwards. Investigating/restoring our state ate up a pretty full
week for me.
2. Jenkins plugin causes multi-day outage
<https://issues.apache.org/jira/browse/INFRA-22878> - this led to multiple
days of Jenkins downtime before eventually being resolved by Infra.
3. Cert issues cause many workers to go down - I don't have a thread for
this because I handled most of the investigation the day of, but many of
our workers went down for around a day and nobody noticed until queue time
reached 6+ hours for each workflow.

There may be others that I'm overlooking.

GitHub Actions isn't a magic bullet to fix these problems, but it minimizes
the amount of infra that we're maintaining ourselves, increases the
isolation between workflows (catastrophic failure is less likely), has
uptime guarantees, and is more likely to receive investment going forward
(we're likely to get increasing benefits over time for free). We've also
done a lot of exploration in this area already, so we're not starting from
scratch.

Thanks,
Danny

On Wed, Oct 19, 2022 at 11:32 AM Kenneth Knowles <k...@apache.org> wrote:

> Hi all,
>
> As you probably noticed, there's a lot of work going on around adding more
> GitHub Actions workflows.
>
> Can we fully migrate to GitHub Actions? Similar to our GitHub Issues
> migration (but less user-facing) it would bring us on to "default"
> infrastructure that more people understand and is maintained by GitHub.
>
> So far we have hit some serious roadblocks. It isn't just a simple
> migration. We have to weigh doing the work to get there.
>
> I started a document with a table of the things we get from Jenkins that
> we need to be sure to have for GitHub Actions before we could think about
> migrating:
>
> https://s.apache.org/beam-jenkins-to-gha
>
> Can you please help me by adding things that we get from Jenkins, and if
> you know how to get them from GitHub Actions add that too.
>
> Thanks!
>
> Kenn
>
