On Fri, Dec 15, 2023 at 11:37:33AM -0800, Erich Eickmeyer wrote:
> Additionally, the SRU team, Release Team, and Archive Admin team have not
> done any work on what it means to onboard any team members, which is in
> itself a breach of the Code of Conduct:

Erich, you have a pattern of invoking the Code of Conduct against fellow developers when they disagree with you which is inappropriately escalatory and does not advance your purpose. Please stop.

The teams in question have been asked to document their onboarding requirements and process. This is a fair ask, which has not yet been delivered on publicly for any of the teams in question because it must be balanced against day-to-day responsibilities.

But you are implying with your message that the lack of DOCUMENTATION for onboarding onto these teams is the cause of problems with the response to the recent high-impact SRU regression.

There are many things I think can be improved about how this SRU regression was handled, and I will go into details below. But it is unrealistic to argue that having this documentation in place would have changed the composition of the teams at the time and thereby prevented this incident.

In particular, the perceived problem at the time was lack of availability of an Archive Admin, and a defining principle of membership in the AA team is this: there are many competent and trustworthy Ubuntu developers who could do the job of archive admins; but because of the raw control over the archive that membership in this team confers, the team should be as large as it needs to be to fulfill its responsibility to the community of Ubuntu developers, *and no larger*.

So no, writing this down on a wiki page would not have changed the composition of the Archive Team prior to this event; nor does the fact that this event happened imply that expanding the archive team is the correct remedy.

A timeline of events; all times given in US/Pacific to minimize the possibility of miscalculations on my side.

2023-12-07 13:25: mutter 45.2-0ubuntu1 SRU accepted into mantic-proposed.
2023-12-13 08:03: bug #2046360 opened, reporting a regression in this SRU. uploader of SRU subscribed to bug and bug was tagged regression-proposed.
2023-12-14 04:52: mutter 45.2-0ubuntu1 SRU released into mantic-updates.
2023-12-15 01:22: bug #2046360 re-tagged regression-update.
2023-12-15 05:36: ahasenack asks on a Canonical-internal SRU team chat about stopping phasing for an update.
2023-12-15 07:40: ahasenack pings ubuntu-archive on #ubuntu-release.
2023-12-15 08:54: tsimonq2 responds to the pings on #ubuntu-release.
2023-12-15 09:33: ahasenack asks on a Canonical internal chat for an archive admin but does not highlight AAs by name.
2023-12-15 10:04: aaronprisk (Community Team at Canonical) reaches out to me directly on Canonical internal chat, indicating he had been contacted by tsimonq2. I do not know if he reached out to other AAs.
2023-12-15 11:09: I notice Aaron's message and indicate I will address this with an ETA of an hour (I am out of the office at the time)
2023-12-15 11:37: preceding message is sent to tech board mailing list.
2023-12-15 12:26: I make it to my computer where I'm able to effect the requested change to SRU phasing.
2023-12-17 19:44: I upload a revert of mutter to mantic-proposed.
2023-12-17 21:41: the revert of mutter is accepted to mantic-proposed by another SRU team member.

So there are a number of things that didn't work well here in terms of process.

- The regression in the SRU was reported by Dan and an appropriate tag was set. However, he did not mark the corresponding SRU bug verification-failed, which is part of the process for regression handling documented on <https://wiki.ubuntu.com/StableReleaseUpdates#Verification>. So a longstanding member of the Ubuntu Desktop team (but not an Ubuntu developer?) was unfamiliar with the necessary process for blocking an SRU when a regression is detected. Do we have gaps in how the existing process has been communicated?

- I subscribe to the regression-update and regression-proposed bug tags, but we have not set an expectation that all members of the SRU team subscribe to these tags.
  Comparing the "May be notified" lists on the side bar of sample bugs suggested in fact that I was the only member of the SRU team subscribed to the regression-proposed tag at the time; and only about half of the SRU team members appear to be subscribed to the regression-update tag. Should we require SRU team members to be subscribed to both tags, as an additional guard against accidental mis-release of regressions?

- Even if everyone was subscribed to the regression-proposed tag, there's no guarantee they've received/seen/read the email before processing the list of to-be-released packages on <https://ubuntu-archive-team.ubuntu.com/pending-sru.html>; and even if they have, they may overlook the connection between such a mail and an SRU they are about to release. Should this report flag all regression-proposed bugs open against a package, regardless of series targeting?

- Was there agreement about the urgency of the need to disable phasing, and was this urgency communicated? Dan did not set a severity on the bug when filing it. Exchange between ahasenack and jbicha (uploader) on IRC yielded a "yes, let's pause phasing", with no apparent expression of urgency. There was no effort, visible to me, to escalate to any archive admins individually using available communications channels until aaronprisk pinged me, over 4 hours after the initial IRC pings. (At the time the decision was initially made to request a stop for phasing, it was still well within European business hours, for any EU-based archive admins who were not already out for the end of the year.)

- We have a standing policy of not releasing SRUs on Friday, unless there's an exceptional reason to do so and a member of the SRU team commits to being available on the weekend to handle any regressions. This SRU was not released on a Friday, it was released on a Thursday; but it was the Thursday before a company-wide end-of-year shutdown and many folks were already out on vacation (including myself).
  Should we have been releasing SRUs this day without verifying there was appropriate capacity for dealing with any regressions? Should there have been an explicit conversation about end-of-year plans for SRU releases among the SRU team? I understand there was a specific request to release this SRU before the end of year, but it's not clear that this request should have been honored under the circumstances.

- The normal process for handling a regression in an SRU is to set phasing to zero, to minimize the propagation of the bad update to additional users; AND to immediately begin the process of doing a follow-up SRU to revert the bad changes so that any users who have already received the bad update before changes to the phasing are able to get a fix. The first part of this blocks on availability of an Archive Admin. The second part of this is entirely within the power of the uploader together with a member of the SRU team. But a full day later, there had still not yet been an upload of mutter to mantic-proposed to fix this problem for affected users. Why is that? Comments from the uploader on IRC:

    12:28 <jbicha> vorlon: unfortunately I'm out of time today to do a new mutter upload. Would we want the new targeted mutter fix to wait for 7 days too?
    12:29 <jbicha> mutter 45.2 fixes important enough issues that I'd rather go forward than backwards

  But "rather go forward than backwards" has resulted in neither happening for over a day. And "out of time today" came 5 hours after the decision that phasing should be halted.

Robie commented on IRC that there should be a clearer playbook for handling regressions. Absent that, however, it should still be clear that turning off phasing for an SRU only prevents it from being delivered to MORE users, it does not un-break users who have already received a broken SRU.

In summary: no Ubuntu core-dev involved in this SRU thought the severity of the bug was high enough to warrant doing the work of uploading a revert for over 48 hours after it was known to be a regression in mantic-updates; yet you are accusing the Archive Team of mismanagement because of a 4-hour delay in response to a non-urgent request for dealing with the same bug.

-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                   https://www.debian.org/
slanga...@ubuntu.com                                     vor...@debian.org

-- Ubuntu-release mailing list Ubuntu-release@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-release