On Sat, Apr 21, 2018 at 04:33:33PM -0400, Jonathan Looney wrote: > On Sat, Apr 21, 2018 at 1:53 PM, Conrad Meyer <c...@freebsd.org> wrote: > > > > I don't think this should be enabled by default. Can we leave it > > disabled by default and let consumers opt-in? > > I'm willing to change the default if there's a good reason or consensus for > that. However, it is not obvious to me that the default is actually wrong. > > I think its important that we remember where we are when we hit this code. > > By the time we hit this code, we've already panic'd (whether due to a > "true" panic or a violated assertion). At this point, the system is already > in an unknown/unexpected state and we want to preserve our ability to > troubleshoot and/or recover. (And, since we're running a system with > INVARIANTS, presumably the ability to troubleshoot is important.) > > We may well violate a second assertion simply because the system is in an > unknown/unexpected state. Once we do that, the question is whether we > should try to preserve the ability to troubleshoot, or should give up and > reset the system. > > All too often, my ability to debug assertion violations is hindered because > the system trips over yet another assertion while dumping the core. If we > skip the assertion, nothing bad happens. (The post-panic debugging code > already needs to deal with systems that are inconsistent, and it does a > pretty good job at it.)
I think we make a decent effort to fix such problems as they arise, but you are declaring defeat on behalf of everyone. Did you make some effort to fix or report these issues before resorting to the more drastic measure taken here? > On the other hand, I really am not sure what you are worried might happen > if we skip checking assertions after we've already panic'd. As far as I can > tell, the likely worst case is that we hit a true panic of some kind. In > that case, we're no worse off than before. > > I think the one obvious exception is when we're purposely trying to > validate the post-panic debugging code. In that case, you can change the > sysctl/tunable to enable troubleshooting. What about a user whose test system panics and fails to dump? With assertions enabled, a developer has a better chance of spotting the problem. Now we need at least one extra round trip to the user to diagnose the problem, which may not be readily reproducible in the first place. > I would honestly appreciate someone explaining the dangers in disabling a > response to assertion violations after we've already panic'd and are simply > trying to troubleshoot, because they are not obvious to me. But, I could > simply be missing them. The assertions help identify code that is being executed during a dump when it shouldn't be. In general we rely on users to opt in to running INVARIANTS kernels because developers don't catch every single bug. With this change it's harder to be confident in the kernel dump code. (Or in any post-panic debugging code for that matter.) I dislike the change and would prefer the default to be inverted. At the very least I think we should print the assertion message rather than returning silently from kassert_panic(). _______________________________________________ svn-src-head@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"