Re: FAS issue (was Re: Another mass rebuild blocker: glibc qsort regression)
On Tue, Jan 23, 2024 at 09:04:53AM -0600, Chris Adams wrote:
> Once upon a time, Richard W.M. Jones said:
> > The authentication issue being this one:
> >
> > https://pagure.io/fedora-infrastructure/issue/11733
>
> I'd be interested in an after report on this one... as someone who has
> managed FreeIPA, I'd like to know how this happened (so I can file away
> how to NOT do the same thing in my own setups).

From the comment
(https://pagure.io/fedora-infrastructure/issue/11733#comment-892793)
it seems that the new requirement for users to have a SID caught
Fedora's FreeIPA off guard. There's an `ipa config-mod` invocation to
add SIDs to users, but you must make sure you have ID Ranges defined
covering all your UIDs and GIDs.

--
Tomasz Torcz               Only gods can safely risk perfection,
to...@pipebreaker.pl       it's a dangerous thing for a man.  — Alia
--
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
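
To illustrate the ID range point in the message above: a minimal sketch
of checking that every UID/GID in use falls inside some defined ID
range before trying to generate SIDs. Everything here is hypothetical
illustration, not Fedora's actual data or tooling: the range names, the
numbers, and the helper `uncovered_ids` are made up. In a real setup
the base IDs and sizes would presumably come from `ipa idrange-find`.

```python
from typing import Iterable, List, NamedTuple


class IdRange(NamedTuple):
    """One ID range: covers base_id .. base_id + size - 1 (inclusive)."""
    name: str
    base_id: int  # first POSIX ID in the range
    size: int     # number of IDs in the range

    def contains(self, posix_id: int) -> bool:
        return self.base_id <= posix_id < self.base_id + self.size


def uncovered_ids(ids: Iterable[int], ranges: List[IdRange]) -> List[int]:
    """Return the sorted, de-duplicated IDs that no range covers."""
    return sorted({i for i in ids if not any(r.contains(i) for r in ranges)})


# Hypothetical ranges and IDs, for illustration only.
ranges = [
    IdRange("EXAMPLE_id_range", base_id=10000, size=1000),
    IdRange("EXAMPLE_legacy_range", base_id=50000, size=500),
]
ids_in_use = [10000, 10999, 11000, 50010, 70001]

missing = uncovered_ids(ids_in_use, ranges)
print(missing)  # -> [11000, 70001]
```

Any ID reported here would need a covering range added (or the range
resized) first; an old cluster that has moved between setups is exactly
where such stragglers are likely to turn up.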
Re: FAS issue (was Re: Another mass rebuild blocker: glibc qsort regression)
Once upon a time, kevin said:
> On Tue, Jan 23, 2024 at 09:04:53AM -0600, Chris Adams wrote:
> > Once upon a time, Richard W.M. Jones said:
> > > The authentication issue being this one:
> > >
> > > https://pagure.io/fedora-infrastructure/issue/11733
> >
> > I'd be interested in an after report on this one... as someone who has
> > managed FreeIPA, I'd like to know how this happened (so I can file away
> > how to NOT do the same thing in my own setups).
>
> It seemed to be a number of things at once sadly, as often such things
> are. We took a cluster member down and reinstalled RHEL 9 on it (to start
> upgrading the cluster), but then the replication agreements for all
> nodes were accidentally removed. That might have been easily
> recoverable, but then we also hit that in our case the cluster was
> installed a long time ago and didn't have SIDs, which became mandatory
> to fix a CVE in the most recent version. And then we also hit some old
> cruft left over from when our cluster was in another datacenter long
> ago. ;(

Tech debt always wins, doesn't it... it's not always due to a lack of
effort or anything, but it does seem to jump up at the worst times.

> Many kudos to everyone who worked on this. Especially the IPA folks.
> They have been calm and understanding and really helped us track
> things down and get back working.

Thanks to all who worked on this for getting it back into a serviceable
state! Hope the path to fully finishing is smooth.

--
Chris Adams
Re: FAS issue (was Re: Another mass rebuild blocker: glibc qsort regression)
On Tue, Jan 23, 2024 at 09:04:53AM -0600, Chris Adams wrote:
> Once upon a time, Richard W.M. Jones said:
> > The authentication issue being this one:
> >
> > https://pagure.io/fedora-infrastructure/issue/11733
>
> I'd be interested in an after report on this one... as someone who has
> managed FreeIPA, I'd like to know how this happened (so I can file away
> how to NOT do the same thing in my own setups).

It seemed to be a number of things at once sadly, as often such things
are. We took a cluster member down and reinstalled RHEL 9 on it (to
start upgrading the cluster), but then the replication agreements for
all nodes were accidentally removed. That might have been easily
recoverable, but then we also hit that in our case the cluster was
installed a long time ago and didn't have SIDs, which became mandatory
to fix a CVE in the most recent version. And then we also hit some old
cruft left over from when our cluster was in another datacenter long
ago. ;(

Many kudos to everyone who worked on this. Especially the IPA folks.
They have been calm and understanding and really helped us track
things down and get back working.

> Certainly not bothering anybody while there's still an outage (or while
> they're recovering from dealing with it), but when things like this
> happen, it's good for everybody to document how it happened - NOT to
> cast blame or anything like that (sooner or later, we all do something
> that breaks in wildly unexpected ways), but so we can all learn from the
> mistake.

Absolutely. Things are not fully normal now, but everything should be
up from the user perspective. We will be working to get the cluster
back to a normal state in the next few days, then we can look at a
retrospective.

kevin
FAS issue (was Re: Another mass rebuild blocker: glibc qsort regression)
Once upon a time, Richard W.M. Jones said:
> The authentication issue being this one:
>
> https://pagure.io/fedora-infrastructure/issue/11733

I'd be interested in an after report on this one... as someone who has
managed FreeIPA, I'd like to know how this happened (so I can file away
how to NOT do the same thing in my own setups).

Certainly not bothering anybody while there's still an outage (or while
they're recovering from dealing with it), but when things like this
happen, it's good for everybody to document how it happened - NOT to
cast blame or anything like that (sooner or later, we all do something
that breaks in wildly unexpected ways), but so we can all learn from the
mistake.

--
Chris Adams