>>>>> "JM" == Jean-Yves Migeon <jeanyves.mig...@free.fr> writes:
JM> On 21.08.2011 12:26, Jean-Yves Migeon wrote:
>> - second, the lock is not placed at the correct abstraction level
>> IMHO, it is way too high in the caller/callee hierarchy. It
>> should remain hidden from most consumers of the xpq_queue API,
>> and should only be used to protect the xpq_queue array together
>> with its counters (and everything that isn't safe for all memory
>> operations happening in xpq).

I agree - part of the reason I did this was to make sure I didn't mess
with the current queueing scheme before having a working
implementation/PoC.

>> Reason behind this is that your lock protects calls to hypervisor
>> MMU operations, which are hypercalls (hence, a "slow" operation
>> with regard to kernel). You are serializing lots of memory
>> operations, something that should not happen from a performance
>> point of view (some may take a fair amount of cycles to
>> complete, like TLB flushes). I'd expect all Xen MMU hypercalls to
>> be reentrant.

I agree - it's not meant to be an efficient implementation, just a
correct one.

JM> An alternative would be to have per-CPU xpq_queue[] also. This
JM> is not completely stupid; xpq_queue is meant as a way to put
JM> multiple MMU operations in a queue asynchronously before issuing
JM> only one hypercall to handle them all.

This is slightly more complicated than it appears. Some of the "ops"
in a per-cpu queue may have ordering dependencies with other cpu
queues, and I think this would be hard to express trivially. (An
example would be a pte update on one queue and a read of the same pte
on another queue - these cases are quite analogous (although
completely unrelated) to classic RAW and other ordering dependencies
in out-of-order execution scenarios due to pipelining, etc.)

I'm thinking that it might be easier and more justifiable to nuke the
current queue scheme and implement shadow page tables, which would fit
more naturally and efficiently with CAS pte updates, etc.
Otherwise, we'll have to re-invent scoreboarding or other dynamic
scheduling tricks - I think that's a bit overkill :-)

Cheers,
-- Cherry