On Sun, Dec 7, 2014 at 10:40 AM, Robert Jarzmik <[email protected]> wrote:
> Hi Andy,
>
> Andy Lutomirski <[email protected]> writes:
>> On Dec 6, 2014 2:31 AM, "Robert Jarzmik" <[email protected]> wrote:
>>> We would have a "LBR resource" variable to track who owns the LBR :
>>>  - nobody : LBR_UNCLAIMED
>>>  - the exception handler : LBR_EXCEPTION_DEBUG_USAGE
>>
>> Which exception handler?  There can be several on the stack.
> All of them, ie. LBR is used by exception handlers, ie. perf cannot use it, just
> as what Emmanuel's patch is doing I think. Or said differently LBR are reserved
> for expeption handlers only, whichever have the implementation to use them.
>
>>>  - case 3d: kernel exception with a reschedule inside
>>>    -> exception entry
>>>    -> test lbr_dump_state == EXCEPTION_OWNED => true => STOP LBR
>>>    -> exception handling
>>>    -> context_switch()
>>>       -> perf cannot touch LBR, nobody can
>>>    -> test lbr_dump_state == EXCEPTION_OWNED => true => START LBR
>>
>> Careful.  This is still the nested exception, and it just did the wrong thing.
> Can you be more explicit about the "wrong" thing ? And would that wrong thing be
> solved by a per-cpu reference counter ?

Suppose you have an int3 with a page fault inside.  If the int3
disabled LBR, then the int3 should re-enable it, and the page fault
should not.  This means that, if the inner page fault is, in fact, an
OOPS, then you don't get the LBR trace.

A per-cpu reference counter would solve it.  So would using rdmsr
instead of wrmsr, because there would be nothing to re-enable.  (The
latter also means that both exceptions get the LBR trace if they turn
out to be OOPSes.)

But a per-cpu reference counter still has the per-cpu issue below.

>
>>> I might be very wrong in the description as I'm not that sharp on x86, but is
>>> there a flaw in the above cases ?
>>>
>>> If not, a couple of tests and Thomas's per-cpu variable can solve the issue,
>>> while keeping the exception handler code simple as Emmanual has proposed (given
>>> the additionnal test inclusion - which will be designed to not pollute the LBR),
>>> and having a small impact on perf to solve the resource acquire issue.
>>
>> On current kernels, percpu memory is vmalloced, so accessing it can fault, so
>> you can't touch percpu memory at all from page_fault until the vmalloc fixup
>> runs.  Sorry :(
> What about INIT_PER_CPU_VAR (as in gdt_page) ? Won't that be mapped all the time
> without need for faulting in pages ?

I'm not sure.  It may not if CPUs are hotplugged.

>
>> This is a problem with rdmsr, too.
> You mean rdmsr can fault in a non-hypervisor environment ? Because that
> definetely opens a new range of corner cases.
>
>> It may be worth fixing that.  In fact, it may be worth getting rid of lazy vmap
>> entirely.
> Your battle ? ;)
>
> Anyway, would a static per-cpu variable (or variables, one about resources
> usage, one reference counter) solve our cases (ie. 3d) ?
>

Possibly, but only if static per-cpu reference counters are safe to
touch in the exception entry code.  Tejun?
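
For reference, the reference-counter variant for the int3-with-nested-
page-fault case described above could be sketched roughly like this.
This is a user-space model only, not kernel code: lbr_freeze_depth,
lbr_running, exception_enter() and exception_exit() are made-up names,
a plain static int stands in for the per-cpu counter, and a bool
stands in for the LBR enable bit that real code would flip with wrmsr.
It says nothing about whether per-cpu data is safe to touch from
exception entry, which is the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: a real implementation would use a static
 * per-cpu counter and the LBR enable bit in an MSR. */
static int  lbr_freeze_depth;    /* exceptions currently nested on this CPU */
static bool lbr_running = true;  /* models "LBR is recording branches" */

/* Exception entry: only the outermost exception stops the LBR. */
static void exception_enter(void)
{
	if (lbr_freeze_depth++ == 0)
		lbr_running = false;
}

/* Exception exit: only the outermost exception restarts the LBR. */
static void exception_exit(void)
{
	assert(lbr_freeze_depth > 0);
	if (--lbr_freeze_depth == 0)
		lbr_running = true;
}

/* Walk through int3 with a page fault inside.  Returns true if the LBR
 * stayed stopped when the inner fault returned and was restarted only
 * when the outer int3 returned. */
static bool nested_case_keeps_lbr_frozen(void)
{
	bool frozen_after_inner_exit;

	exception_enter();                      /* int3 entry */
	exception_enter();                      /* page fault inside int3 */
	exception_exit();                       /* page fault returns */
	frozen_after_inner_exit = !lbr_running; /* must still be stopped */
	exception_exit();                       /* int3 returns */

	return frozen_after_inner_exit && lbr_running;
}
```

With a simple owner flag instead of the counter, the inner
exception_exit() would restart the LBR while the int3 handler is still
running, which is the "wrong thing" in case 3d.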
--Andy

