On Fri, Jun 5, 2020 at 3:25 PM 'Andrey Konovalov' via kasan-dev <kasan-...@googlegroups.com> wrote: > > On Fri, Jun 05, 2020 at 12:57:15PM +0200, Dmitry Vyukov wrote: > > > On Fri, Jun 5, 2020 at 10:28 AM Marco Elver <el...@google.com> wrote: > > > > > > > > While we lack a compiler attribute to add to noinstr that would disable > > > > KCOV, make the KCOV runtime functions return if the caller is in a > > > > noinstr section, and mark them noinstr. > > > > > > > > Declare write_comp_data() as __always_inline to ensure it is inlined, > > > > which also reduces stack usage and removes one extra call from the > > > > fast-path. > > > > > > > > In future, our compilers may provide an attribute to implement > > > > __no_sanitize_coverage, which can then be added to noinstr, and the > > > > checks added in this patch can be guarded by an #ifdef checking if the > > > > compiler has such an attribute or not. > > > > > > Adding noinstr attribute to instrumentation callbacks looks fine to me. > > > > > > But I don't understand the within_noinstr_section part. > > > As the cover letter mentions, kcov callbacks don't do much and we > > > already have it inserted and called. What is the benefit of bailing > > > out a bit earlier rather than letting it run to completion? > > > Is the only reason for potential faults on access to the vmalloc-ed > > > region? > > > > Vmalloc faults (on x86, the only arch that had them IIRC) are gone, per > > this merge window. > > > > The reason I mentioned them is because it is important that they are > > gone, and that this hard relies on them being gone, and the patch didn't > > call that out. > > > > There is one additional issue though; you can set hardware breakpoint on > > vmalloc space, and that would trigger #DB and then we'd be dead when we > > were already in #DB (IST recursion FTW). > > > > And that is not something you can trivially fix, because you can set the > > breakpoint before the allocation (or perhaps on a previous allocation). > > > > That said; we already have this problem with task_struct (and > > task_stack). IIRC Andy wants to fix the task_stack issue by making all > > of noinstr run on the entry stack, but we're not there yet. > > > > There are no good proposals for random allocations like task_struct or > > in your case kcov_area. > > > > > Andrey, Mark, do you know if it's possible to pre-fault these areas? > > > > Under the assumption that vmalloc faults are still a thing: > > > > You cannot pre-fault the remote area thing, kernel threads use the mm of > > the previous user task, and there is no guarantee that mm will have had > > the vmalloc fault. > > To clarify this part AFAIU it, even if we try to prefault the whole > remote area each time kcov_remote_start() is called, then (let alone > the performance impact) the kernel thread can be rescheduled between > kcov_remote_start() and kcov_remote_stop(), and then it might be > running with a different mm than the one that was used when > kcov_remote_start() happened.
Ugh, this is nasty. But this has also gone, which I am happy about :) Why I am looking at this is because with coverage instrumentation __sanitizer_cov_trace_pc is the hottest function in the kernel and we are adding additional branches to it. Can we touch at least some per-cpu data within noinstr code? If yes, we could try to affect the existing in_task()/in_serving_softirq() check. If not, it would be useful to have a comment clarifying that within_noinstr_section check must happen before we touch anything else. I assume objtool can now also detect all violations. How bad is it now without within_noinstr_section check? I am assuming we marking noinstr functions to not be instrumented, but we are getting some stray instrumentation from inlined functions or something, right? How many are there? Is it fixable/unfixable? Marco, do you know the list, or could you please collect the list of violations? Is there any config that disables #DB? We could well disable it on syzbot, I think we already disable some production hardening/debugging confings, which are not too useful for testing setup. E.g. we support RANDOMIZE_BASE, no problem, but if one disables it (which we do), that becomes no-op: #ifdef CONFIG_RANDOMIZE_BASE ip -= kaslr_offset(); #endif return ip;