Hi Sasha,

On Mon, Apr 11, 2016 at 04:38:17PM -0400, Sasha Levin wrote:
> > How are you
> > going to judge which driver fixes to take and which not to?  Why not
> > take them all if they fix bugs?
>
> Because some fixes introduce bug on their own? Take a look at how many
> commits in the stable tree have a "Fixes:" tag that points to a commit
> that's also in the stable tree.

I'm using stable trees myself in the load balancing products we ship at
work. I've met a single bug during the whole 3.10 lifetime and it was
caused by one of our out-of-tree patches that applied at the wrong place
after an update. I'd generally say that -stable quality is very good, if
not excellent. Several people review the patches before they get merged,
several others build and even boot them. It's not that random. Look, one
patch was just dropped from 3.14.64 because it failed a build test in one
environment. This one will never hit end users.

> Look at the opposite side of this question: why would anyone take a commit
> that fixes a bug he doesn't care about? Are the benefits really worth it
> considering the risks?

That's exactly what most people do. I don't update to each and every
kernel. When I see xen, lvm, drm and audio changes I don't need them in
my products. But when I see network fixes I study them and often decide
that it's worth upgrading. Sometimes I pick a single fix from the queue
because I can't wait for the next release. Many of Greg's kernels more or
less focus on certain topics, probably due to the way he deals with his
mailbox and patch storms, so it's often easy to quickly decide if you're
going to need to update or not.

> [snip]
>
> >>> Define "important".  Now go and look at the tty bug we fixed that people
> >>> only realized was "important" 1 1/2 years later and explain if you
> >>> would, or would not have, taken that patch in this tree.
> >>
> >> Probably not, but I would have taken it after it received a CVE number.
> >>
> >> Same applies to quite a few commits that end up in stable - no one thinks
> >> they're stable material at first until someone points out it's crashing
> >> his production boxes for the past few months.
> >
> > Yes, but those are rare, what you are doing here is suddenly having to
> > judge if a bug is a "security" issue or not.
> > You are now in the
> > position of trying to determine "can this be exploited or not", for
> > every commit, and that's a very hard call, as is seen by this specific
> > issue.

Especially for networking stuff or things related to local resource usage,
where some people consider that it represents a local DoS risk and others
consider that it's just irrelevant to their servers since they have no
local users.

> The stable stuff isn't rare as you might think, even more: the amount of
> actual CVE fixes that are not in the stable tree might surprise you.

I would personally not be surprised since Ben used to feed me with a lot
of fixes I had never seen previously. What is unclear to me is whether
your tree will contain only a selection of patches that are already in the
respective branches, or a backport of security fixes that we can pick from
to feed our stable branches and limit the risk of missing them. *This*
actually could be useful to everyone, starting with our users.

(...)
> This is actually what happens now; projects get to the point they don't
> want to update their whole kernel tree anymore so that just freezes because
> they don't want to re-validate the whole thing over and over, but they
> still cherry pick upstream and out-of-tree commits that they care about.
>
> If they added a handful of security commits to cherry pick and carefully
> review their security will be much better than what happens now.

Actually I do think that for end users it's a regression. People will
start reusing outdated kernels which only contain the most critical fixes
known, but will still suffer from memory leaks, deadlocks, kernel panics,
data corruption etc. Every single bug that doesn't have a CVE attached to
it in fact, which means 99% of the bugs that bring a system down in
production.
It makes me think of people who only pick security fixes from openssl and
not the regular batch of missing null checks, and who complain all the
time that their systems are unstable while they simply don't apply fixes.

You see, when I started with the "hotfix" tree 11 years ago for kernel
2.4, I intended to only pick the most critical fixes; they would fit in
just a README and could be counted on one hand. One year later there were
150, just because everything becomes critical for *some* workloads.

I *do* think that having a central reference for fixes that come with a
reproducer (hence many security fixes) can be useful, as it would offer an
opportunity to better test backports when they become tricky: it often
takes much more time to set up a test with a reproducer than it takes to
backport and adjust the fix (not always true). But when it comes to
security issues, the reporter often cares about the quality of the
backport and helps there.

Just my two cents,
Willy