On Sun, Jun 10, 2018 at 3:51 AM, Theodore Y. Ts'o <ty...@mit.edu> wrote:
> On Sat, Jun 09, 2018 at 03:17:21PM -0700, Linus Torvalds wrote:
>> I think it would be lovely to get linux-next back eventually, but it
>> sounds like it's just too noisy right now, and yes, we should have a
>> baseline for the standard tree first.
>>
>> But once there's a "this is known for the baseline", I think adding
>> linux-next back in and then maybe even have linux-next simply just
>> kick out trees that cause problems would be a good idea.
>>
>> Right now linux-next only kicks things out based on build issues (or
>> extreme merge issues), afaik. But it *would* be good to also have
>> things like syzbot do quality control on linux-next.
>
> Syzbot is always getting improved to find new classes of problems. So
> the only way to get a baseline would be to use an older version of
> syzbot for linux-next, and to have it suppress sending e-mails about
> failures that are duplicates that were already found via the mainline
> tree.
>
> Then periodically, once version N has run for M weeks, and has spewed
> some large number of new failures to LKML, then you could promote
> version N to be run against linux-next, and so hopefully the only
> thing it would report against linux-next are regressions, and not
> duplicates of new bugs also being found via the latest and greatest
> version of syzbot being run against the mainline kernel.
The set of trees where a crash happened is visible on the dashboard, so
one can see whether it is only linux-next or the whole set of trees.
Potentially syzbot can act differently depending on this predicate, but
I don't see what the difference should be.

However, this does not fully protect against falsely assessing bugs as
linux-next-only just because they have happened only a few times and
only on linux-next so far. But using an older syzkaller revision won't
fully protect against this either, because (1) some bugs take a long
time to find, and (2) a bug can be hidden by another known bug, so when
the second bug is fixed the first one suddenly pops up, but it's not a
new bug (and chances are that the second one will be fixed on
linux-next first, so the first bug will look linux-next-only).

Regarding removing commits from linux-next, I think one of the main
signals can be whether there were recent changes related to the bug.
Looking at new bugs being reported, frequently it's quite obvious
(e.g. "use-after-free in foo" and a recent "make foo faster"). But in
general, if we go with linux-next, maintainers and developers need to
agree to deal with this additional aspect during bug triage.

There is also a problem with rebasing of linux-next: reported commit
hashes stop making sense and we can forget about bisection.

On a related note, Greg recently suggested onboarding more subsystem
-next trees (currently we test only net-next and bpf-next), so I tried
to formulate requirements for these trees:
https://github.com/google/syzkaller/issues/592

- not rebased (commit hashes work, bisection works)
- maintained in reasonably good shape (no tons of assorted crashes)
- reasonably active (makes sense to test)
- merge upstream periodically (bugs are getting fixed)
- with maintainers who are willing to cooperate and fix bugs

Any volunteers?

Thanks
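
P.S. To make the tree-set predicate mentioned above a bit more concrete, here is a rough sketch of how one might label a crash based on where it has been seen. This is not how syzbot works today: the `CrashReport` type, the `classify` function, the label strings and the `min_hits` threshold are all hypothetical names made up for illustration. The "undetermined" label is there for the case described above, where a bug has only shown up a few times on linux-next and so cannot yet be called linux-next-only.

```python
from dataclasses import dataclass


@dataclass
class CrashReport:
    """Hypothetical record of one crash as a dashboard might aggregate it."""
    title: str
    trees: frozenset  # names of the trees where the crash was observed
    count: int        # total number of times the crash was observed


def classify(report, min_hits=10):
    """Label a crash by the set of trees it was seen on.

    min_hits is an arbitrary, assumed threshold: below it we refuse to
    call a crash linux-next-only, since it may simply be rare.
    """
    if "linux-next" not in report.trees:
        return "not-on-linux-next"
    if len(report.trees) > 1:
        return "multiple-trees"
    if report.count < min_hits:
        return "undetermined"
    return "likely-linux-next-only"


if __name__ == "__main__":
    rare = CrashReport("use-after-free in foo", frozenset({"linux-next"}), 2)
    print(rare.title, "->", classify(rare))
```

Even the "likely" label stays a guess, since a bug unmasked by a fix that landed on linux-next first would still end up there. A signal like "were there recent changes related to the bug" would have to be combined with it by whoever does the triage.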