Control: severity -1 normal

On Wed, 2018-03-21 at 08:17 +0000, Michael Schaller wrote:
> Please reconsider that this is merely an annoyance and that this is a
> wishlist item.
>
> If an NVIDIA driver security update is pushed and security updates are
> installed unattendedly then all NVIDIA user space components will stop
> working immediately after the respective package updates as the loaded
> kernel module and the user space components have a version mismatch.
>
> The consequences are not immediately visible to the user as NVIDIA
> components in memory are still properly matched and hence still work.
> The real issue is with new processes as, for instance, no OpenGL
> applications or CUDA workloads can be launched anymore. This is
> especially severe for CUDA server farms as they currently can't enable
> unattended security updates unless they specifically exclude NVIDIA
> driver updates.

That's fine, I didn't grok that you had large installations where this
was causing issues already. Personally I'm fine with talking about
possible solutions.

Seeing your email address domain - any chance your company could use
its gargantuan soft-power to get Nvidia to publish the specs for the
missing parts of Nouveau (reclocking, power management, etc)? That would
solve all our problems once and for all :-P

> On Wed, Mar 21, 2018 at 9:00 AM Philipp Kern <pk...@debian.org> wrote:
>
> > On 03/20/2018 10:59 PM, Luca Boccassi wrote:
> > > The problems I see are that it would make an already quite complex
> > > packaging system, over which we have very little control (most of
> > > it is binary blobs), even more complicated. We already have 2
> > > layers of update-alternatives (mesa vs nvidia and then current vs
> > > legacy).
> > >
> > > It would also mean we have to start maintaining multiple versions
> > > at the same time - again being all binary blobs, which will
> > > multiply the source of problems. Basically, it would mean that
> > > instead of having current vs legacy340xx (up until a few months
> > > ago also legacy304xx), every single driver update would have to be
> > > maintained separately.
> >
> > I don't propose this as the solution, though. I think that'd indeed
> > be infeasible. What I'm saying is that the *binary* packages are
> > versioned like this, not the source packages. It's like the kernel
> > in a way, where every ABI version gets its own binary package name.
> > Although in Debian the hesitance to change the ABI is much higher
> > than in Ubuntu, for reasons that I assume have to do with the NEW
> > queue. Cleaning up older versions is something we'd find a solution
> > for, just like people clean up their old kernels.
> >
> > So please separate out maintenance from the proposal. ;-)
> >
> > I get it with the two layers of alternatives. Is the reason for
> > mesa vs. nvidia because we don't put Nvidia into the library search
> > path first and need to deal with the corresponding file conflicts in
> > a sane way? Or because we want to keep co-installability between
> > mesa and nvidia?
> >
> > > In the end the problem is an annoyance but not a deal breaker -
> > > updates can be scheduled and delayed (unlike some other OSes...),
> > > and on top of that, version bumps are not that common - at most
> > > once a month, and only for those running unstable or testing - in
> > > stable we just ship LTS versions.
> >
> > Actually it's a real deal breaker in mass deployments. If your users
> > are hesitant to do reboots because it resets their work environment,
> > you really need to detach nvidia updates from the rest of the
> > package updates, which means having a custom-built solution to do
> > that. That has turned out to be brittle, as it turns out that you
> > end up installing pre-downloaded modules at boot, blocking it for
> > about ten minutes. (It has gotten better with SSDs, but still.)
> >
> > Even if you just ship LTS versions there are sometimes updates
> > needed, be it for Meltdown/Spectre or new hardware. In our case we
> > actually do use testing, but even then we had the need to push
> > updates to drivers. I think a setup that separates out binaries for
> > every version that allows for consistent rollbacks[1] and
> > rollforwards would be beneficial not just for us but also for the
> > whole userbase of Debian.
> >
> > We'd be willing to invest some time into a solution - as our own to
> > work around the flaws in the packaging has turned out to be a
> > maintenance headache. But that only works if we at least agree on a
> > plan. I'm also happy to clarify more that I probably missed in the
> > proposal. :)
> >
> > Kind regards and thanks
> > Philipp Kern
> >
> > [1] We had a bunch of regressions with newer drivers in the past
> > that made them dead on arrival, like missing repaints in terminals
> > for a fraction of the cards.
> >
> > --
> > To unsubscribe, send mail to 889669-unsubscr...@bugs.debian.org.

-- 
Kind regards,
Luca Boccassi