Re: RFC: No koji builds during mass branching and updates-testing enablement
On Sun, Apr 23, 2023 at 07:28:08AM +, Mattia Verga via devel wrote:
> Isn't it simpler to schedule:
>
> 1. lock down Koji in (stop accepting new builds,
>    possibly only for Rawhide)
> 2. let Koji finish running builds (assuming there are none which
>    require more than 24h)
> 3. at check any stuck Rawhide update in Bodhi
> 4. branch
> 5. unlock Koji
>
> I think a 24h Koji outage is much clearer to users than cancelling
> their builds and unpushing their updates. Unless releng wants to take
> note of those builds and updates and resubmit them after the mass
> branching...

Well, 24 hours is both too long and too short.

On one hand, we have some things that sometimes take longer than 24h.
On the other, we have a bunch of builds that only take a few minutes.
I'd hate for an important update that only takes a few minutes to build
to be delayed for 24 hours. ;(

I think we could perhaps look at resubmitting things. That could be a
bit complex with targets and such, but it might be doable.

kevin
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
Re: RFC: No koji builds during mass branching and updates-testing enablement
On 22/04/23 23:46, Kevin Fenzi wrote:
> On Fri, Apr 21, 2023 at 09:03:11PM +0200, Fabio Valentini wrote:
>> On Thu, Mar 9, 2023 at 8:56 PM Kevin Fenzi wrote:
>>> * Cancel all builds that are in progress. Maintainers can resubmit after
>>>   the outage with the appropriate branches.
>>> * unpush all updates stuck in gating/pending? Is this too much?
>>> * do the branching steps, get everything in place, then open things on
>>>   the hub.
>>>
>>> This is a lot more disruptive, but it's only for part of a day and I
>>> agree it's nicer to not have things to clean up.
>> Sorry for the long RTT. My email inbox is only now no longer looking
>> like a dumpster fire. :)
>>
>> It sounds like koji actually supports giving an outage message, so
>> that would be great.
>> Concerning the three steps listed above: I think they would make sense.
>> Maybe it could look like this:
>>
>> 1. lock down the koji hub
>> 2. cancel all builds that are still running (I think this could
>>    exclude builds that are targeting stable branches?)
>> 3. unpush all Rawhide updates that are stuck (maybe adding a comment
>>    to the bodhi update why it happened)
>> 4. do the mass branching steps (i.e. Rawhide == Fedora N+2, Branched
>>    == Fedora N+1)
>> 5. unlock koji hub
>>
>> Parts of steps 2,3,4 could even happen with more granularity (I think
>> mass branching steps happen alphabetically for all packages? that
>> would give running builds more time to finish.).
> This seems doable. We should make sure it works as best we can, and then
> probably announce it before the next mass branching so everyone knows to
> expect it. :)
>
> Hopefully this will prevent problems the next time...
>
> Adding Tomas here on cc
>
> kevin

Isn't it simpler to schedule:

1. lock down Koji in (stop accepting new builds, possibly only for Rawhide)
2. let Koji finish running builds (assuming there are none which require more than 24h)
3. at check any stuck Rawhide update in Bodhi
4. branch
5. unlock Koji

I think a 24h Koji outage is much clearer to users than cancelling
their builds and unpushing their updates. Unless releng wants to take
note of those builds and updates and resubmit them after the mass
branching...

Mattia
Re: RFC: No koji builds during mass branching and updates-testing enablement
On Fri, Apr 21, 2023 at 09:03:11PM +0200, Fabio Valentini wrote:
> On Thu, Mar 9, 2023 at 8:56 PM Kevin Fenzi wrote:
> >
> > * Cancel all builds that are in progress. Maintainers can resubmit after
> >   the outage with the appropriate branches.
> > * unpush all updates stuck in gating/pending? Is this too much?
> > * do the branching steps, get everything in place, then open things on
> >   the hub.
> >
> > This is a lot more disruptive, but it's only for part of a day and I
> > agree it's nicer to not have things to clean up.
>
> Sorry for the long RTT. My email inbox is only now no longer looking
> like a dumpster fire. :)
>
> It sounds like koji actually supports giving an outage message, so
> that would be great.
> Concerning the three steps listed above: I think they would make sense.
> Maybe it could look like this:
>
> 1. lock down the koji hub
> 2. cancel all builds that are still running (I think this could
>    exclude builds that are targeting stable branches?)
> 3. unpush all Rawhide updates that are stuck (maybe adding a comment
>    to the bodhi update why it happened)
> 4. do the mass branching steps (i.e. Rawhide == Fedora N+2, Branched
>    == Fedora N+1)
> 5. unlock koji hub
>
> Parts of steps 2,3,4 could even happen with more granularity (I think
> mass branching steps happen alphabetically for all packages? that
> would give running builds more time to finish.).

This seems doable. We should make sure it works as best we can, and then
probably announce it before the next mass branching so everyone knows to
expect it. :)

Hopefully this will prevent problems the next time...

Adding Tomas here on cc

kevin
Re: RFC: No koji builds during mass branching and updates-testing enablement
On Thu, Mar 9, 2023 at 8:56 PM Kevin Fenzi wrote:
>
> * Cancel all builds that are in progress. Maintainers can resubmit after
>   the outage with the appropriate branches.
> * unpush all updates stuck in gating/pending? Is this too much?
> * do the branching steps, get everything in place, then open things on
>   the hub.
>
> This is a lot more disruptive, but it's only for part of a day and I
> agree it's nicer to not have things to clean up.

Sorry for the long RTT. My email inbox is only now no longer looking
like a dumpster fire. :)

It sounds like koji actually supports giving an outage message, so
that would be great.
Concerning the three steps listed above: I think they would make sense.
Maybe it could look like this:

1. lock down the koji hub
2. cancel all builds that are still running (I think this could
   exclude builds that are targeting stable branches?)
3. unpush all Rawhide updates that are stuck (maybe adding a comment
   to the bodhi update why it happened)
4. do the mass branching steps (i.e. Rawhide == Fedora N+2, Branched
   == Fedora N+1)
5. unlock koji hub

Parts of steps 2,3,4 could even happen with more granularity (I think
mass branching steps happen alphabetically for all packages? that
would give running builds more time to finish.).

Fabio
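For illustration only, the five steps in the message above could be sketched as a small driver. Every callable here is a hypothetical hook; none of them is an existing koji, bodhi or releng function. The one design choice in the sketch is an assumption, not something the thread settles: the hub is reopened only if every earlier step succeeded, so a failed branching leaves it locked for releng to inspect.

```python
from typing import Callable, Dict, List


def run_mass_branching(
    lock_hub: Callable[[], None],
    cancel_running_builds: Callable[[], List[str]],
    unpush_stuck_updates: Callable[[], List[str]],
    do_branching: Callable[[], None],
    unlock_hub: Callable[[], None],
) -> Dict[str, List[str]]:
    """Run the five proposed steps in order.

    All five arguments are hypothetical hooks supplied by the caller.
    If any step raises, the exception propagates and the hub stays locked.
    """
    lock_hub()                           # step 1: lock down the hub
    cancelled = cancel_running_builds()  # step 2: cancel builds still running
    unpushed = unpush_stuck_updates()    # step 3: unpush stuck Rawhide updates
    do_branching()                       # step 4: the mass branching itself
    unlock_hub()                         # step 5: reached only on success
    # Return what was cancelled and unpushed so it can be reported
    # (or resubmitted) after the outage.
    return {"cancelled": cancelled, "unpushed": unpushed}
```

Returning the cancelled builds and unpushed updates keeps open the option of notifying maintainers or resubmitting things afterwards.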
Re: RFC: No koji builds during mass branching and updates-testing enablement
On Thu, Mar 09, 2023 at 02:37:14PM +0100, Fabio Valentini wrote:
> Hi all,
>
> As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
> the following change to the development cycle / release schedule:
>
> "Koji builds are blocked while mass branching and updates-testing
> enablement are in progress."
>
> That's it, that's the entire RFC.

Yeah, we have wanted to do this for a long time actually, but...

The implementation is tricky. As part of the branching process, after
git repos and koji are switched, we have to commit and build
fedora-repos/fedora-release in order for other builds to get the right
disttag and to use the right repos. So, we need koji to let releng do
that while blocking everyone else. ;(

But that might be possible/doable now. (see below)

Also, there are still going to be some corner cases, I fear, like builds
that were in progress before branching, or things that were stuck in
ci/gating tests?

Waiting for builds to finish is pretty time consuming. If someone
started a gcc or libreoffice or chromium build it could be... a long
time.

I guess the way to be really sure would be:

* set koji hub:

  ServerOffline = True
  OfflineMessage = Branching in progress, come back later
  LockOut = True

  (the comments for those are:

  ## If ServerOffline is True, the server will always report a ServerOffline fault (with
  ## OfflineMessage as the fault string).
  ## If LockOut is True, the server will report a ServerOffline fault for all non-admin
  ## requests.
  )

* Cancel all builds that are in progress. Maintainers can resubmit after
  the outage with the appropriate branches.
* unpush all updates stuck in gating/pending? Is this too much?
* do the branching steps, get everything in place, then open things on
  the hub.

This is a lot more disruptive, but it's only for part of a day and I
agree it's nicer to not have things to clean up.

kevin
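As a reading aid, the behaviour described by the two quoted config comments can be modelled in a few lines. This is a toy model of what those comments say, not koji's code; `HubSettings` and `offline_fault` are names made up for the sketch. Read literally, the comments imply that `LockOut` on its own is what leaves admins (releng) able to work, while `ServerOffline` turns everyone away.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HubSettings:
    """Hypothetical stand-in for the three hub options quoted above."""
    server_offline: bool = False
    offline_message: str = "Branching in progress, come back later"
    lock_out: bool = False


def offline_fault(settings: HubSettings, is_admin: bool) -> Optional[str]:
    """Return the fault string a request would receive, or None if it is served."""
    if settings.server_offline:
        # Per the first comment: always a ServerOffline fault, admin or not.
        return settings.offline_message
    if settings.lock_out and not is_admin:
        # Per the second comment: only non-admin requests are rejected.
        return settings.offline_message
    return None
```

Under this reading, `offline_fault(HubSettings(lock_out=True), is_admin=True)` returns `None`, which is the "let releng in, block everyone else" case described earlier in the message.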
Re: RFC: No koji builds during mass branching and updates-testing enablement
Hi Fabio,

On March 9, 2023 1:37:14 PM UTC, Fabio Valentini wrote:
>Hi all,
>
>As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
>the following change to the development cycle / release schedule:
>
>"Koji builds are blocked while mass branching and updates-testing
>enablement are in progress."
>
>That's it, that's the entire RFC.
>
>Roughly every six months, I run a check for updates that are present
>in the current "stable" release, but missing from "branched", and
>every six months, there's a non-negligible number of builds and / or
>bodhi updates that get stuck in a void because they just happened to
>have been run at the exact worst moment.
>
>In my opinion, the benefits of implementing this change (less releng
>time spent on fixing builds that are stuck in an inconsistent state)
>would outweigh the downsides (two windows of a few hours each during
>the early development cycle where no builds can be launched).
>
>Issues that I see with builds that just "happened to be in the wrong
>place at the wrong time" fall broadly into two categories (though I
>have seen other types of problems that are more rare):
>
>1. Builds launched while the mass branching is in progress have the
>fcXX (where XX = old-rawhide / branched) dist-tag, but only get
>tagged with fXY (XY = new-rawhide) by koji. This results in them only
>being available in the rawhide repos, and not from "branched" at all.
>Just resubmitting the build for "branched" doesn't work, because the
>wrong dist-tag causes NVR conflicts. Fixing this requires either
>releng intervention (useless busywork) or bumping the release and
>submitting new builds for *both rawhide and branched* (waste of
>resources).
>
>2. Builds launched just before updates-testing enablement can get
>stuck in "testing" state before there is an actual updates-testing
>repo, and are hence not available from *any* repository (for testing?)
>during the beta freeze, but will get pushed to stable afterwards. This
>results in users who want to test the beta release (or "pre-beta" with
>updates-testing enabled) not seeing these updates at all, but they
>will be pushed to "stable" immediately after the beta freeze is lifted
>(i.e. without *any* amount of testing).

As someone who accidentally built a package in the wrong time period,
I'm very much in favor of preventing this from happening in the future.
Thanks for this proposal!

Dan
Re: RFC: No koji builds during mass branching and updates-testing enablement
On Thu, Mar 9, 2023 at 3:46 PM Dennis Gilmore via devel wrote:
>
>> 2. Builds launched just before updates-testing enablement can get
>> stuck in "testing" state before there is an actual updates-testing
>> repo, and are hence not available from *any* repository (for testing?)
>> during the beta freeze, but will get pushed to stable afterwards. This
>> results in users who want to test the beta release (or "pre-beta" with
>> updates-testing enabled) not seeing these updates at all, but they
>> will be pushed to "stable" immediately after the beta freeze is lifted
>> (i.e. without *any* amount of testing).
>
> I do not understand how this is at all possible. If a build has the tag to be
> stable it will show up freeze or not. It may not be in the beta compose, but
> will be in the nightly composes and being tested and available there.

As far as I can tell, this is the way this has happened:

1. build is launched for "branched" *before* the beta freeze went into
   effect (i.e. before the "updates-testing" repo exists for that release)
2. build is automatically submitted to bodhi and goes through the
   stages of "pending" -> "testing" -> "stable" (and the build can spend a
   non-negligible amount of time in the "testing" state if there are any
   gating tests)
3. the beta freeze goes into effect while the build is still in
   "testing" state, but "testing" pre-beta-freeze means something
   different than "testing" post-updates-testing-enablement

Fabio
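The sequence above suggests a simple way to spot affected updates: look for anything still in "testing" that entered that state before updates-testing was enabled. The sketch below assumes a made-up `Update` record with a status and a timestamp; it does not use the real bodhi data model or API, and the field names are illustrative only.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple


class Update(NamedTuple):
    """Hypothetical, simplified view of an update (not the bodhi schema)."""
    alias: str
    status: str                # "pending", "testing" or "stable"
    entered_testing: datetime  # when the update reached "testing"


def stranded_updates(updates: Iterable[Update], enabled_at: datetime) -> List[str]:
    """Return aliases of updates that are in "testing" but got there
    before updates-testing enablement (the case described in step 3)."""
    return [
        u.alias
        for u in updates
        if u.status == "testing" and u.entered_testing < enabled_at
    ]
```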
Re: RFC: No koji builds during mass branching and updates-testing enablement
On Thu, Mar 9, 2023 at 3:28 PM Björn Persson wrote:
>
> What will packagers see?

This is still up for discussion, I guess. And it will also depend on
how this can be implemented.

Ideally, "fedpkg build" would print a warning like "Mass branching is
in progress, no builds can be submitted at this time.", but I'm not
sure how hard that would be to implement. I think koji can already
respond with something like a "there's an outage, try again later"
error message, so it at least seems like it should be doable.

> Will builds be queued, and get processed when the lock is released?

No. Builds are doomed to be in the wrong state as soon as they are
submitted during that time window (for example, "build was submitted
when rawhide was still f38, but rawhide will be f39 once this build is
finished"). The packager would need to retry the submission after
releng has finished their work (so, similar to how other
infrastructure outages are handled).

> Will build attempts be rejected with a clear explanation? "You can't
> build while we're branching. Please try again later."

That would be nice, but I don't know how hard that would be to implement.

> Or will packagers start asking why they get an incomprehensible stack
> trace from fedpkg?

An incomprehensible stack trace would be quite bad.

Fabio
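On the "clear explanation instead of a stack trace" point, the client-side part could look roughly like the sketch below. It is not fedpkg or koji code: `ServerOfflineError` is a stand-in exception defined here for illustration, and `submit` is whatever callable actually sends the build.

```python
from typing import Any, Callable, Tuple


class ServerOfflineError(Exception):
    """Hypothetical stand-in for the fault the hub reports while locked."""


def submit_or_explain(submit: Callable[..., Any], *args: Any) -> Tuple[bool, Any]:
    """Call submit(*args); turn an outage fault into a readable message.

    Returns (True, result) on success, or (False, message) if the hub
    reported that it is offline. Other exceptions are not swallowed.
    """
    try:
        return True, submit(*args)
    except ServerOfflineError as exc:
        # Show the hub's own outage message instead of a traceback.
        return False, f"Build not submitted: {exc}. Please try again later."
```

Only the outage case is caught, so real bugs still surface as errors.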
Re: RFC: No koji builds during mass branching and updates-testing enablement
On Thu, Mar 9, 2023 at 7:37 AM Fabio Valentini wrote:
> Hi all,
>
> As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
> the following change to the development cycle / release schedule:
>
> "Koji builds are blocked while mass branching and updates-testing
> enablement are in progress."
>
> That's it, that's the entire RFC.
>
> Roughly every six months, I run a check for updates that are present
> in the current "stable" release, but missing from "branched", and
> every six months, there's a non-negligible number of builds and / or
> bodhi updates that get stuck in a void because they just happened to
> have been run at the exact worst moment.
>
> In my opinion, the benefits of implementing this change (less releng
> time spent on fixing builds that are stuck in an inconsistent state)
> would outweigh the downsides (two windows of a few hours each during
> the early development cycle where no builds can be launched).
>
> Issues that I see with builds that just "happened to be in the wrong
> place at the wrong time" fall broadly into two categories (though I
> have seen other types of problems that are more rare):
>
> 1. Builds launched while the mass branching is in progress have the
> fcXX (where XX = old-rawhide / branched) dist-tag, but only get
> tagged with fXY (XY = new-rawhide) by koji. This results in them only
> being available in the rawhide repos, and not from "branched" at all.
> Just resubmitting the build for "branched" doesn't work, because the
> wrong dist-tag causes NVR conflicts. Fixing this requires either
> releng intervention (useless busywork) or bumping the release and
> submitting new builds for *both rawhide and branched* (waste of
> resources).
>

I think this can be resolved by changing the branching process, though
there would still be a small race condition window. We used to lock
everything when branching when we used CVS and went to a lot of effort
not to lock everything. Doing the koji side completely before doing
anything in git should help significantly here. You can also disable all
the builders and wait for the active builds to complete, then do
everything; that way new builds will queue up and wait for the builders
to be re-enabled.

> 2. Builds launched just before updates-testing enablement can get
> stuck in "testing" state before there is an actual updates-testing
> repo, and are hence not available from *any* repository (for testing?)
> during the beta freeze, but will get pushed to stable afterwards. This
> results in users who want to test the beta release (or "pre-beta" with
> updates-testing enabled) not seeing these updates at all, but they
> will be pushed to "stable" immediately after the beta freeze is lifted
> (i.e. without *any* amount of testing).
>

I do not understand how this is at all possible. If a build has the tag
to be stable it will show up freeze or not. It may not be in the beta
compose, but will be in the nightly composes and being tested and
available there.

Dennis

> Fabio
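The "disable the builders and wait for active builds to complete" variant boils down to a polling loop with a cut-off, since some builds can run for a very long time. A rough sketch follows; `count_active` is a hypothetical callable that reports how many builds are still running, and nothing here talks to koji.

```python
import time
from typing import Callable


def wait_for_builds(
    count_active: Callable[[], int],
    timeout_seconds: float,
    poll_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until no builds are active.

    Returns True once count_active() reports zero, or False if the
    timeout expires first (the caller then decides whether to keep
    waiting or to cancel what is left).
    """
    deadline = clock() + timeout_seconds
    while count_active() > 0:
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
    return True
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.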
Re: RFC: No koji builds during mass branching and updates-testing enablement
Fabio Valentini wrote:
> As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
> the following change to the development cycle / release schedule:
>
> "Koji builds are blocked while mass branching and updates-testing
> enablement are in progress."

What will packagers see?

Will builds be queued, and get processed when the lock is released?

Will build attempts be rejected with a clear explanation? "You can't
build while we're branching. Please try again later."

Or will packagers start asking why they get an incomprehensible stack
trace from fedpkg?

Björn Persson
RFC: No koji builds during mass branching and updates-testing enablement
Hi all,

As a follow-up from a recent discussion on Matrix/IRC, I'm proposing
the following change to the development cycle / release schedule:

"Koji builds are blocked while mass branching and updates-testing
enablement are in progress."

That's it, that's the entire RFC.

Roughly every six months, I run a check for updates that are present
in the current "stable" release, but missing from "branched", and
every six months, there's a non-negligible number of builds and / or
bodhi updates that get stuck in a void because they just happened to
have been run at the exact worst moment.

In my opinion, the benefits of implementing this change (less releng
time spent on fixing builds that are stuck in an inconsistent state)
would outweigh the downsides (two windows of a few hours each during
the early development cycle where no builds can be launched).

Issues that I see with builds that just "happened to be in the wrong
place at the wrong time" fall broadly into two categories (though I
have seen other types of problems that are more rare):

1. Builds launched while the mass branching is in progress have the
fcXX (where XX = old-rawhide / branched) dist-tag, but only get
tagged with fXY (XY = new-rawhide) by koji. This results in them only
being available in the rawhide repos, and not from "branched" at all.
Just resubmitting the build for "branched" doesn't work, because the
wrong dist-tag causes NVR conflicts. Fixing this requires either
releng intervention (useless busywork) or bumping the release and
submitting new builds for *both rawhide and branched* (waste of
resources).

2. Builds launched just before updates-testing enablement can get
stuck in "testing" state before there is an actual updates-testing
repo, and are hence not available from *any* repository (for testing?)
during the beta freeze, but will get pushed to stable afterwards. This
results in users who want to test the beta release (or "pre-beta" with
updates-testing enabled) not seeing these updates at all, but they
will be pushed to "stable" immediately after the beta freeze is lifted
(i.e. without *any* amount of testing).

Fabio
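To make the first category concrete: the symptom is a build whose dist-tag names one release while the koji tag it landed in names another. A minimal check for that mismatch might look like the sketch below. The helper names are made up for illustration, and the parsing assumes dist-tags of the form `.fcNN` and tag names that start with `fNN`, as in the description above; it is not how releng actually detects these builds.

```python
import re
from typing import Optional

_DIST_RE = re.compile(r"\.fc(\d+)")
_TAG_RE = re.compile(r"f(\d+)")


def dist_release(nvr: str) -> Optional[int]:
    """Release number from a dist-tag such as '.fc38' in an NVR, if present."""
    match = _DIST_RE.search(nvr)
    return int(match.group(1)) if match else None


def tag_release(koji_tag: str) -> Optional[int]:
    """Release number from a tag name that starts with 'f<NN>', if present."""
    match = _TAG_RE.match(koji_tag)
    return int(match.group(1)) if match else None


def is_mistagged(nvr: str, koji_tag: str) -> bool:
    """True when both numbers can be read and they disagree."""
    build, tag = dist_release(nvr), tag_release(koji_tag)
    return build is not None and tag is not None and build != tag
```

For example, `is_mistagged("foo-1.0-1.fc38", "f39")` is true, which is the "fcXX build tagged only into fXY" situation; `foo` is a placeholder package name.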