On Tue, Aug 18, 2015 at 04:07:51PM -0700, Andy Lutomirski wrote: > On Tue, Aug 18, 2015 at 4:02 PM, Frederic Weisbecker <[email protected]> > wrote: > > On Tue, Aug 18, 2015 at 03:35:30PM -0700, Andy Lutomirski wrote: > >> On Tue, Aug 18, 2015 at 3:16 PM, Frederic Weisbecker <[email protected]> > >> wrote: > >> > On Tue, Aug 18, 2015 at 12:11:59PM -0700, Andy Lutomirski wrote: > >> >> This fixes a couple minor holes if we took an IRQ very early in syscall > >> >> processing: > >> >> > >> >> - We could enter the IRQ with CONTEXT_USER. Everything worked (RCU > >> >> was fine), but we could warn if all the debugging options were > >> >> set. > >> > > >> > So this is fixing issues after your changes that call user_exit() from > >> > IRQs, right? > >> > >> Yes. Here's an example splat, courtesy of Sasha: > >> > >> https://gist.github.com/sashalevin/a006a44989312f6835e7 > >> > >> > > >> > But the IRQs aren't supposed to call user_exit(), they have their own > >> > hooks. > >> > That's where the real issue is. > >> > >> In -tip, the assumption is that we *always* switch to CONTEXT_KERNEL > >> when entering the kernel for a non-NMI reason. > > > > Why? IRQs don't need that! We already have irq_enter()/irq_exit(). > > > > Those are certainly redundant.
So? What's the point in duplicating a hook in arch code that core code already has? > I want to have a real hook to call > that says "switch to IRQ context from CONTEXT_USER" or "switch to IRQ > context from CONTEXT_KERNEL" (aka noop), but that doesn't currently > exist. You're not answering _why_ you want that. > > > And we don't want to call rcu_user_*() pairs on IRQs, you're > > introducing a serious performance regression here! And I'm talking about > > the code that's currently in -tip. > > Is there an easy way to fix it? For example, could we figure out what > makes it take so long and make it faster? Sure, just remove your arch IRQ hook. > If we need to, we could > back out the IRQ bit and change the assertions for 4.3, but I'd rather > keep the exact context tracking if at all possible. I have no idea what you mean by exact context tracking here. But If we ever want to call irq_enter() using arch hooks, and I have no idea why we would ever want to do that since that involve complexifying the code by $NR_ARCHS and moving C code to ASM, we need serious reasons! And that's certainly not something we are going to plan now for the next week's merge window. > >> That means that we can > >> avoid all of the (expensive!) checks for what context we're in. > > > > If you're referring to context tracking, the context check is a per-cpu > > read. Not something that's usually considered expensive. > > In -tip, there aren't even extra branches, except those imposed by the > user_exit implementation. No there is the "call enter_from_user_mode" in the IRQ fast path. > > > > >> It also means that (other than IRQs, which need further cleanup), we only > >> switch once per user/kernel switch. > > > > ??? > > In 4.2 and before, we can switch multiple times on the way out of the > kernel, via SCHEDULE_USER, do_notify_resume, etc. In -tip, we do it > exactly once no matter what. That's what we want for syscalls but not for IRQs. > > > > >> > >> The cost for doing should be essentially zero, modulo artifacts from > >> poor inlining. > > > > And modulo rcu_user_*() that do multiple costly atomic_add_return() > > operations > > implying full memory barriers. Plus the unnecessary vtime accounting that > > doubles > > the existing one in irq_enter/exit() (those even imply a lock currently, > > which will > > probably be turned to seqcount, but still, full memory barriers...). > > > > I'm sorry but I'm going to NACK any code that does that in IRQs (and again > > that > > concerns current tip:x86/asm). > > Why do we need these heavyweight barriers? Actually it's not full barriers but atomic ones (smp_mb__after_atomic_stuff()) I suspect we can't do much better given RCU requirements. Still we don't need to call it twice. > > If there's actually a measurable performance hit in IRQs in -tip, then > can we come up with a better fix? I'm sure it's very easily measurable. > For example, we could change all > the new CT_WARN_ON calls to check "are we in CONTEXT_KERNEL or in IRQ > context" and make the IRQ entry do a lighter weight context tracking > operation. I don't see what we need to check actually. Context tracking can be in any state while in IRQ. > > But I think I'm still missing something fundamental about the > performance: why is irq_enter() any faster than user_exit()? It's stlightly faster at least because it takes care of nesting IRQs which is likely with softirqs that get interrupted. Now of course we wouldn't call user_exit() in this case, but the hook is there in generic code, no need for anything from the arch. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

