Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-04-24 Thread Kevin Fenzi
On Sun, Apr 23, 2023 at 07:28:08AM +, Mattia Verga via devel wrote:
> Isn't simpler to schedule:
> 
> 1. lock down Koji in  (stop accepting new builds,
> possibly only for Rawhide)
> 2. let Koji finish running builds (assuming there are none which
> requires more than 24h)
> 3. at  check any stuck Rawhide update in Bodhi
> 4. branch
> 5. unlock Koji
> 
> I think a 24h Koji outage is much clearer to users other than cancelling
> their builds and unpushing their updates. Unless releng wants to take
> note of those builds and updates and resubmit them after the mass
> branching...

Well, 24 hours is both too long and too short.

On one hand we have some things that take longer than 24h sometimes.
On the other we have a bunch of builds that only take a few minutes.

I'd hate for an important update that only takes a few minutes to build
to be delayed for 24 hours. ;(

I think we could perhaps look at resubmitting things.
That could be a bit complex with targets and such, but it might be
doable.
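
A rough sketch of what "resubmitting things" could look like, under stated assumptions: the BuildRequest record, the target names and the rule that a cancelled rawhide build is wanted in both new rawhide and the new branch are all hypothetical, and none of this talks to koji.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BuildRequest:
    # Hypothetical record of a cancelled build: enough to submit it again.
    package: str
    git_ref: str
    target: str


def plan_resubmissions(cancelled, branched):
    """Return the requests to submit once the hub is open again.

    Assumption: a build that targeted "rawhide" before branching is wanted
    in both new rawhide and the new branch; other targets are kept as-is.
    """
    plan = []
    for req in cancelled:
        if req.target == "rawhide":
            plan.append(replace(req, target="rawhide"))
            plan.append(replace(req, target=branched))
        else:
            plan.append(req)
    return plan
```

The git ref is carried over unchanged here, which is probably the part that gets "a bit complex": after branching, the branched build may need a ref on the new dist-git branch instead.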

kevin


___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue


Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-04-23 Thread Mattia Verga via devel
On 22/04/23 23:46, Kevin Fenzi wrote:
> On Fri, Apr 21, 2023 at 09:03:11PM +0200, Fabio Valentini wrote:
>> On Thu, Mar 9, 2023 at 8:56 PM Kevin Fenzi  wrote:
>>> * Cancel all builds that are in progress. Maintainers can resubmit after
>>> the outage with the appropriate branches.
>>> * unpush all updates stuck in gating/pending? Is this too much?
>>> * do the branching steps, get everything in place, then open things on
>>> the hub.
>>>
>>> This is a lot more disruptive, but it's only for part of a day and I
>>> agree it's nicer to not have things to clean up.
>> Sorry for the long RTT. My email inbox is only now no longer looking
>> like a dumpster fire. :)
>>
>> It sounds like koji actually supports giving an outage message, so
>> that would be great.
>> Concerning the three steps listed above: I think they would make sense.
>> Maybe it could look like this:
>>
>> 1. lock down the koji hub
>> 2. cancel all builds that are still running (I think this could
>> exclude builds that are targeting stable branches?)
>> 3. unpush all Rawhide updates that are stuck (maybe adding a comment
>> to the bodhi update why it happened)
>> 4. do the mass branching steps (i.e. Rawhide == Fedora N+2, Branched
>> == Fedora N+1)
>> 5. unlock koji hub
>>
>> Parts of steps 2,3,4 could even happen with more granularity (I think
>> mass branching steps happen alphabetically for all packages? that
>> would give running builds more time to finish.).
> This seems doable. We should make sure it is as best we can, and then
> probibly announce it before the next mass branching so everyone knows to
> expect it. :)
>
> Hopefully this will prevent problems the next time...
>
> Adding Tomas here on cc
>
> kevin

Isn't it simpler to schedule:

1. lock down Koji in  (stop accepting new builds,
possibly only for Rawhide)
2. let Koji finish running builds (assuming there are none which
require more than 24h)
3. at  check any stuck Rawhide update in Bodhi
4. branch
5. unlock Koji

I think a 24h Koji outage is much clearer to users than cancelling
their builds and unpushing their updates. Unless releng wants to take
note of those builds and updates and resubmit them after the mass
branching...

Mattia



Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-04-22 Thread Kevin Fenzi
On Fri, Apr 21, 2023 at 09:03:11PM +0200, Fabio Valentini wrote:
> On Thu, Mar 9, 2023 at 8:56 PM Kevin Fenzi  wrote:
> >
> > * Cancel all builds that are in progress. Maintainers can resubmit after
> > the outage with the appropriate branches.
> > * unpush all updates stuck in gating/pending? Is this too much?
> > * do the branching steps, get everything in place, then open things on
> > the hub.
> >
> > This is a lot more disruptive, but it's only for part of a day and I
> > agree it's nicer to not have things to clean up.
> 
> Sorry for the long RTT. My email inbox is only now no longer looking
> like a dumpster fire. :)
> 
> It sounds like koji actually supports giving an outage message, so
> that would be great.
> Concerning the three steps listed above: I think they would make sense.
> Maybe it could look like this:
> 
> 1. lock down the koji hub
> 2. cancel all builds that are still running (I think this could
> exclude builds that are targeting stable branches?)
> 3. unpush all Rawhide updates that are stuck (maybe adding a comment
> to the bodhi update why it happened)
> 4. do the mass branching steps (i.e. Rawhide == Fedora N+2, Branched
> == Fedora N+1)
> 5. unlock koji hub
> 
> Parts of steps 2,3,4 could even happen with more granularity (I think
> mass branching steps happen alphabetically for all packages? that
> would give running builds more time to finish.).

This seems doable. We should confirm that as best we can, and then
probably announce it before the next mass branching so everyone knows to
expect it. :)

Hopefully this will prevent problems the next time...

Adding Tomas here on cc

kevin




Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-04-21 Thread Fabio Valentini
On Thu, Mar 9, 2023 at 8:56 PM Kevin Fenzi  wrote:
>
> * Cancel all builds that are in progress. Maintainers can resubmit after
> the outage with the appropriate branches.
> * unpush all updates stuck in gating/pending? Is this too much?
> * do the branching steps, get everything in place, then open things on
> the hub.
>
> This is a lot more disruptive, but it's only for part of a day and I
> agree it's nicer to not have things to clean up.

Sorry for the long RTT. My email inbox is only now no longer looking
like a dumpster fire. :)

It sounds like koji actually supports giving an outage message, so
that would be great.
Concerning the three steps listed above: I think they would make sense.
Maybe it could look like this:

1. lock down the koji hub
2. cancel all builds that are still running (I think this could
exclude builds that are targeting stable branches?)
3. unpush all Rawhide updates that are stuck (maybe adding a comment
to the bodhi update why it happened)
4. do the mass branching steps (i.e. Rawhide == Fedora N+2, Branched
== Fedora N+1)
5. unlock koji hub

Parts of steps 2,3,4 could even happen with more granularity (I think
mass branching steps happen alphabetically for all packages? that
would give running builds more time to finish.).
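
To illustrate step 2, here is a minimal sketch of picking which running builds to cancel while leaving stable-branch builds alone. The RunningBuild record and the target names are hypothetical; a real list would have to come from koji. Sorting by package name only mirrors my guess that branching proceeds alphabetically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningBuild:
    # Hypothetical record; a real list would come from koji.
    task_id: int
    package: str
    target: str


def builds_to_cancel(builds, stable_targets):
    """Keep stable-branch builds running; return the rest, ordered by package."""
    doomed = [b for b in builds if b.target not in stable_targets]
    return sorted(doomed, key=lambda b: b.package)


if __name__ == "__main__":
    running = [
        RunningBuild(101, "zlib", "rawhide"),
        RunningBuild(102, "bash", "f37"),
        RunningBuild(103, "gcc", "rawhide"),
    ]
    for build in builds_to_cancel(running, stable_targets={"f36", "f37"}):
        print(build.task_id, build.package)
```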

Fabio


Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-03-09 Thread Kevin Fenzi
On Thu, Mar 09, 2023 at 02:37:14PM +0100, Fabio Valentini wrote:
> Hi all,
> 
> As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
> the following change to the development cycle / release schedule:
> 
> "Koji builds are blocked while mass branching and updates-testing
> enablement are in progress."
> 
> That's it, that's the entire RFC.

Yeah, we have wanted to do this for a long time actually, but...
The implementation is tricky.

As part of the branching process, after git repos and koji are switched,
we have to commit and build fedora-repos/fedora-release in order for
other builds to get the right disttag and to use the right repos. 
So, we need koji to let releng do that while blocking everyone else. ;( 
But that might be possible/doable now. (see below)

Also, I fear there are still going to be some corner cases, like builds
that were in progress before branching, or things that were stuck in
ci/gating tests.

Waiting for builds to finish is pretty time consuming. If someone
started a gcc or libreoffice or chromium build it could be... a long
time. 

I guess the way to be really sure would be: 

* set koji hub:
ServerOffline = True
OfflineMessage = Branching in progress, come back later
LockOut = True

(the comments for those are:
## If ServerOffline is True, the server will always report a ServerOffline fault (with
## OfflineMessage as the fault string).
## If LockOut is True, the server will report a ServerOffline fault for all non-admin
## requests.
)

* Cancel all builds that are in progress. Maintainers can resubmit after
the outage with the appropriate branches. 
* unpush all updates stuck in gating/pending? Is this too much?
* do the branching steps, get everything in place, then open things on
the hub.

This is a lot more disruptive, but it's only for part of a day and I
agree it's nicer to not have things to clean up. 
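
For the first step, a minimal sketch of flipping those options in the hub config text with Python's configparser. The option names and the message are the ones listed above; the [hub] section name and the sample starting contents are assumptions, and nothing here restarts or reloads the hub.

```python
import configparser
import io

# Option names and message as listed in this mail; the [hub] section
# name and the sample input are assumptions.
LOCK_SETTINGS = {
    "ServerOffline": "True",
    "OfflineMessage": "Branching in progress, come back later",
    "LockOut": "True",
}


def apply_lock(conf_text, locked):
    """Return config text with the outage options set or removed."""
    # interpolation=None so a '%' in a message is not treated specially
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    if not parser.has_section("hub"):
        parser.add_section("hub")
    for key, value in LOCK_SETTINGS.items():
        if locked:
            parser.set("hub", key, value)
        else:
            parser.remove_option("hub", key)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(apply_lock("[hub]\nDBName = koji\n", locked=True))
```

Note that configparser drops comments when it rewrites a file, so a real deployment would more likely template the file. Going by the comments quoted above, LockOut looks like the option that still lets admin requests through; whether ServerOffline should be set alongside it is something to check before relying on this.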

kevin




Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-03-09 Thread Dan Čermák
Hi Fabio,

On March 9, 2023 1:37:14 PM UTC, Fabio Valentini  wrote:
>Hi all,
>
>As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
>the following change to the development cycle / release schedule:
>
>"Koji builds are blocked while mass branching and updates-testing
>enablement are in progress."
>
>That's it, that's the entire RFC.
>
>Roughly every six months, I run a check for updates that are present
>in the current "stable" release, but missing from "branched", and
>every six months, there's a non-negligible number of builds and / or
>bodhi updates that get stuck in a void because they just happened to
>have been run at the exact worst moment.
>
>In my opinion, the benefits of implementing this change (less releng
>time spent on fixing builds that are stuck in an inconsistent state)
>would outweigh the downsides (two windows of a few hours each during
>the early development cycle where no builds can be launched).
>
>Issues that I see with builds that just "happened to be in the wrong
>place at the wrong time" fall broadly into two categories (though I
>have seen other types of problems that are more rare):
>
>1. Builds launched while the mass branching is in progress have the
>fcXX (where XX = old-rawhide / branched) dist-tag, but only gets
>tagged with fXY (XY = new-rawhide) by koji. This results in them only
>being available in the rawhide repos, and not from "branched" at all.
>Just resubmitting the build for "branched" doesn't work, because the
>wrong dist-tag causes NVR conflicts. Fixing this requires either
>releng intervention (useless busywork) or bumping the release and
>submitting new builds for *both rawhide and branched* (waste of
>resources).
>
>2. Builds launched just before updates-testing enablement can get
>stuck in "testing" state before there is an actual updates-testing
>repo, and are hence not available from *any* repository (for testing?)
>during the beta freeze, but will get pushed to stable afterwards. This
>results in users who want to test the beta release (or "pre-beta" with
>updates-testing enabled) to not see these updates at all, but they
>will be pushed to "stable" immediately after the beta freeze is lifted
>(i.e. without *any* amount of testing).

As someone who accidentally built a package at the wrong time, I'm very
much in favor of preventing this from happening in the future.

Thanks for this proposal!


Dan


Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-03-09 Thread Fabio Valentini
On Thu, Mar 9, 2023 at 3:46 PM Dennis Gilmore via devel
 wrote:
>
>> 2. Builds launched just before updates-testing enablement can get
>> stuck in "testing" state before there is an actual updates-testing
>> repo, and are hence not available from *any* repository (for testing?)
>> during the beta freeze, but will get pushed to stable afterwards. This
>> results in users who want to test the beta release (or "pre-beta" with
>> updates-testing enabled) to not see these updates at all, but they
>> will be pushed to "stable" immediately after the beta freeze is lifted
>> (i.e. without *any* amount of testing).
>
> I do not understand how this is at all possible. If a build has the tag to be 
> stable it will show up freeze or not. it may not be in the beta compose, but 
> will be in the nightly composes and being tested and available there.

As far as I can tell, this is the way this has happened:

1. build is launched for "branched" *before* the beta freeze went into
effect (i.e. before the "updates-testing" repo exists for that
release)
2. build is automatically submitted to bodhi and goes through the
stages of "pending" -> "testing" -> "stable" (and the build can spend
a non-negligible amount of time in the "testing" state if there are
any gating tests)
3. the beta freeze goes into effect while the build is still in
"testing" state, but "testing" pre-beta-freeze means something
different than "testing" post-updates-testing-enablement
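
A small sketch of how one might spot updates caught in step 3, assuming a list of update records is already at hand. The Update record, its fields and any dates are hypothetical; real data would have to come from Bodhi.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Update:
    # Hypothetical record; real data would come from Bodhi.
    alias: str
    status: str  # "pending", "testing" or "stable"
    submitted: datetime


def stranded_updates(updates, enablement):
    """Updates still in 'testing' that were submitted before enablement."""
    return [
        u for u in updates
        if u.status == "testing" and u.submitted < enablement
    ]
```

Anything this returns was put into "testing" under the pre-enablement meaning of that state, so it is a candidate for a manual look before it goes to stable.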

Fabio


Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-03-09 Thread Fabio Valentini
On Thu, Mar 9, 2023 at 3:28 PM Björn Persson  wrote:
>
> What will packagers see?

This is still up for discussion, I guess. And it will also depend on
how this can be implemented.

Ideally, "fedpkg build" would print a warning like "Mass branching is
in progress, no builds can be submitted at this time.", but I'm not
sure how hard that would be to implement.
I think koji can already respond with something like a "there's an
outage, try again later" error message, so it at least seems like it
should be doable.

> Will builds be queued, and get processed when the lock is released?

No. Builds are doomed to be in the wrong state as soon as they are
submitted during that time window (for example, "build was submitted
when rawhide was still f38, but rawhide will be f39 once this build is
finished"). The packager would need to retry the submission after releng
has finished their work (so, similar to how other infrastructure
outages are handled).

> Will build attempts be rejected with a clear explanation? "You can't
> build while we're branching. Please try again later."

That would be nice, but I don't know how hard that would be to implement.

> Or will packagers start asking why they get an incomprehensible stack
> trace from fedpkg?

Incomprehensible stack trace would be quite bad.
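
To make the "clear message instead of a stack trace" idea concrete, here is a minimal sketch of client-side handling. ServerOfflineError and submit_build are hypothetical stand-ins, not the actual fedpkg or koji code; the point is only that the outage fault is caught and reduced to one line.

```python
class ServerOfflineError(Exception):
    """Hypothetical stand-in for the fault a locked hub would return."""


def submit_build(submit, package):
    """Call `submit` and turn an outage fault into a one-line message."""
    try:
        task_id = submit(package)
    except ServerOfflineError as fault:
        return f"Cannot build {package} right now: {fault}"
    return f"Submitted {package} as task {task_id}"
```

Here `submit` is whatever callable actually talks to the hub, and the text after the colon would be the outage message configured on the server.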

Fabio


Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-03-09 Thread Dennis Gilmore via devel
On Thu, Mar 9, 2023 at 7:37 AM Fabio Valentini  wrote:

> Hi all,
>
> As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
> the following change to the development cycle / release schedule:
>
> "Koji builds are blocked while mass branching and updates-testing
> enablement are in progress."
>
> That's it, that's the entire RFC.
>
> Roughly every six months, I run a check for updates that are present
> in the current "stable" release, but missing from "branched", and
> every six months, there's a non-negligible number of builds and / or
> bodhi updates that get stuck in a void because they just happened to
> have been run at the exact worst moment.
>
> In my opinion, the benefits of implementing this change (less releng
> time spent on fixing builds that are stuck in an inconsistent state)
> would outweigh the downsides (two windows of a few hours each during
> the early development cycle where no builds can be launched).
>
> Issues that I see with builds that just "happened to be in the wrong
> place at the wrong time" fall broadly into two categories (though I
> have seen other types of problems that are more rare):
>
> 1. Builds launched while the mass branching is in progress have the
> fcXX (where XX = old-rawhide / branched) dist-tag, but only gets
> tagged with fXY (XY = new-rawhide) by koji. This results in them only
> being available in the rawhide repos, and not from "branched" at all.
> Just resubmitting the build for "branched" doesn't work, because the
> wrong dist-tag causes NVR conflicts. Fixing this requires either
> releng intervention (useless busywork) or bumping the release and
> submitting new builds for *both rawhide and branched* (waste of
> resources).
>

I think this can be resolved by changing the branching process, though
there would still be a small race condition window. We used to lock
everything when branching when we used CVS and went to a lot of effort not
to lock everything. Doing the koji side completely before doing anything in
git should help significantly here. You can also disable all the builders
and wait for the active builds to complete, then do everything; that way new
builds will queue up and wait for the builders to be re-enabled.



> 2. Builds launched just before updates-testing enablement can get
> stuck in "testing" state before there is an actual updates-testing
> repo, and are hence not available from *any* repository (for testing?)
> during the beta freeze, but will get pushed to stable afterwards. This
> results in users who want to test the beta release (or "pre-beta" with
> updates-testing enabled) to not see these updates at all, but they
> will be pushed to "stable" immediately after the beta freeze is lifted
> (i.e. without *any* amount of testing).
>

I do not understand how this is at all possible. If a build has the tag to
be stable, it will show up, freeze or not. It may not be in the beta compose,
but it will be in the nightly composes, where it is tested and available.

Dennis



Re: RFC: No koji builds during mass branching and updates-testing enablement

2023-03-09 Thread Björn Persson
Fabio Valentini wrote:
> As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
> the following change to the development cycle / release schedule:
> 
> "Koji builds are blocked while mass branching and updates-testing
> enablement are in progress."

What will packagers see?

Will builds be queued, and get processed when the lock is released?

Will build attempts be rejected with a clear explanation? "You can't
build while we're branching. Please try again later."

Or will packagers start asking why they get an incomprehensible stack
trace from fedpkg?

Björn Persson




RFC: No koji builds during mass branching and updates-testing enablement

2023-03-09 Thread Fabio Valentini
Hi all,

As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
the following change to the development cycle / release schedule:

"Koji builds are blocked while mass branching and updates-testing
enablement are in progress."

That's it, that's the entire RFC.

Roughly every six months, I run a check for updates that are present
in the current "stable" release, but missing from "branched", and
every six months, there's a non-negligible number of builds and / or
bodhi updates that get stuck in a void because they just happened to
have been run at the exact worst moment.

In my opinion, the benefits of implementing this change (less releng
time spent on fixing builds that are stuck in an inconsistent state)
would outweigh the downsides (two windows of a few hours each during
the early development cycle where no builds can be launched).

Issues that I see with builds that just "happened to be in the wrong
place at the wrong time" fall broadly into two categories (though I
have seen other types of problems that are more rare):

1. Builds launched while the mass branching is in progress have the
fcXX (where XX = old-rawhide / branched) dist-tag, but only get
tagged with fXY (XY = new-rawhide) by koji. This results in them only
being available in the rawhide repos, and not from "branched" at all.
Just resubmitting the build for "branched" doesn't work, because the
wrong dist-tag causes NVR conflicts. Fixing this requires either
releng intervention (useless busywork) or bumping the release and
submitting new builds for *both rawhide and branched* (waste of
resources).

2. Builds launched just before updates-testing enablement can get
stuck in "testing" state before there is an actual updates-testing
repo, and are hence not available from *any* repository (for testing?)
during the beta freeze, but will get pushed to stable afterwards. This
results in users who want to test the beta release (or "pre-beta" with
updates-testing enabled) to not see these updates at all, but they
will be pushed to "stable" immediately after the beta freeze is lifted
(i.e. without *any* amount of testing).
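
For the first category, a minimal sketch of how such builds could be detected from names alone: compare the number in the dist-tag of the NVR with the number in the koji tag. The function names and the example package names are made up for illustration; this is not the check I actually run.

```python
import re

# Matches a Fedora dist-tag such as ".fc38" inside an RPM release string.
DIST_TAG = re.compile(r"\.fc(\d+)")


def disttag_release(nvr):
    """Return the Fedora release number from the dist-tag in an NVR, or None."""
    release = nvr.rsplit("-", 1)[-1]  # release is the last dash-separated field
    match = DIST_TAG.search(release)
    return int(match.group(1)) if match else None


def is_mismatched(nvr, koji_tag):
    """True when the dist-tag number differs from the number in a tag like 'f39'."""
    tag_match = re.fullmatch(r"f(\d+)(?:-.*)?", koji_tag)
    built_for = disttag_release(nvr)
    if tag_match is None or built_for is None:
        return False  # cannot tell, so do not flag
    return built_for != int(tag_match.group(1))
```

With this, a build named like foo-1.0-1.fc38 that ended up tagged only into f39 would be flagged, while foo-1.0-1.fc39 in f39 would not.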

Fabio