On 07/11/2012 02:52 PM, Alexander Graf wrote:
>
> On 11.07.2012, at 13:23, Avi Kivity wrote:
>
>> On 07/11/2012 02:16 PM, Alexander Graf wrote:
>>>>
>>>>> yes the data structure itself seems based on the algorithm
>>>>> and not on arch specific things. That should work. If we move that
>>>>> to common code then s390 will use that scheme automatically for the
>>>>> cases where we call kvm_vcpu_on_spin(). All other archs as well.
>>>>
>>>> ARM doesn't have an instruction for cpu_relax(), so it can't intercept
>>>> it. Given ppc's dislike of overcommit,
>>>
>>> What dislike of overcommit?
>>
>> I understood ppc virtualization is more of the partitioning sort.
>> Perhaps I misunderstood it. But the reliance on device assignment, the
>> restrictions on scheduling, etc. all point to it.
>
> Well, you need to distinguish the different PPC targets here. In the embedded
> world, partitioning is obviously the biggest use case, though overcommit is
> possible. For big servers however, we usually do want overcommit and we do
> support it within the constraints hardware gives us.
>
> It's really no different from x86 when it comes to the use case wideness :).

Okay, thanks for the correction.

>
>>
>>>
>>>> and the way it implements cpu_relax() by adjusting hw thread priority,
>>>
>>> Yeah, I don't think we can intercept relaxing.
>>
>> ... and the lack of ability to intercept cpu_relax() ...
>>
>>> It's basically a nop-like instruction that gives hardware hints on its
>>> current priorities.
>>
>> That's what x86 PAUSE does. But we can intercept it (and not just any
>> execution - we can restrict intercept to tight loops executed more than
>> a specific number of times).
>
> Yeah, it's pretty hard to fetch that information from PPC, since unlike x86
> we split the hint from the loop. But I'll let Ben speak to that, he certainly
> knows way better how the hardware works.

On x86 it's split from the loop as well (inside cpu_relax() like ppc).
But the hardware detects the loop and lets us know about it.

>
>>
>>> That said, we can always add PV code.
>>
>> Sure, but that's defeated by advancements like self-tuning PLE exits.
>> It's hard to get this right.
>
> Well, eventually everything we do in PV is going to be moot as soon as
> hardware catches up. In most cases from what I've seen it's only useful as an
> interim solution. But for that time it's good to have :).

Depends on how interim it is. If better hardware is coming, I'd rather
not add more pv-ness.

-- 
error compiling committee.c: too many arguments to function