On 11/16/2009 11:22 PM, Jan Kiszka wrote:
> Avi Kivity wrote:
>> On 11/16/2009 07:00 PM, Jan Kiszka wrote:
>>> This patch aims at addressing the mp_state writeback issue in a cleaner
>>> fashion.
>> What's the issue?  the fact that mp_state is updated whenever state is
>> synchronized, while it could be simultaneously updated from other vcpus
>> (which latter updates are then lost)?
> Right, the issue b8a7857071 addressed. But that approach spreads more
> kvm_* fragments in unrelated qemu code, e.g. the monitor, and fails to
> update other parts (gdbstub). And it doesn't care about what happens if
> kvm is off at build or runtime. Such things are better addressed in
> upstream by encapsulating kvm calls in synchronization points.

Note we have the same issue with NMI and the SIPI vector - any vcpu state that is updated outside the vcpu thread. These are particularly bad since we can't exclude them from updates without excluding other state as well.

The whole issue is tricky. I'm inclined to pretend we never meant any vcpu state (outside lapic) to be asynchronous and declare the whole thing a bug. We could fix it by modeling external changes to state (INIT, SIPI, NMI) as messages queued to the vcpu, to be processed in the vcpu thread. The queue would be drained before running the vcpu or before reading state from userspace, so the message queue contents can never be observed and never lost.

Of course, we can't really implement this as a queue (SIGSTOP vcpu thread -> overflow), but a word is sufficient. INIT writes the word; everything else uses compare-and-swap or set_bit to raise events, e.g. for SIPI:

    do {
        oldq = vcpu->queue;
        newq = (oldq & ~SIPI_MASK) | sipi_vector | RUNNING;
    } while (!cas(&vcpu->queue, oldq, newq));
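
To make the one-word idea concrete, here is a minimal userspace sketch using C11 atomics. It is only an illustration of the scheme described above, not KVM or qemu code: the bit layout, the EV_INIT and EV_NMI flags, struct vcpu and all function names are assumptions made up for the example. Only SIPI_MASK, RUNNING and the queue field come from the pseudocode.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout of the per-vcpu event word (an assumption). */
#define SIPI_MASK 0x000000ffu  /* bits 0-7: most recent SIPI vector */
#define RUNNING   0x00000100u  /* set by SIPI: vcpu should start running */
#define EV_NMI    0x00000200u  /* an NMI is pending */
#define EV_INIT   0x00000400u  /* an INIT is pending */

struct vcpu {
    _Atomic uint32_t queue;    /* the one-word "message queue" */
};

/* INIT overwrites whatever was pending, so a plain atomic store suffices. */
static void raise_init(struct vcpu *v)
{
    atomic_store(&v->queue, EV_INIT);
}

/* NMI is a single flag: the userspace analogue of set_bit. */
static void raise_nmi(struct vcpu *v)
{
    atomic_fetch_or(&v->queue, EV_NMI);
}

/* SIPI replaces the vector field, so it needs a compare-and-swap loop. */
static void raise_sipi(struct vcpu *v, uint8_t sipi_vector)
{
    uint32_t oldq = atomic_load(&v->queue);
    uint32_t newq;

    do {
        newq = (oldq & ~SIPI_MASK) | sipi_vector | RUNNING;
        /* On failure, oldq is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(&v->queue, &oldq, newq));
}

/*
 * Meant to be called only from the vcpu thread, before running the vcpu
 * or before state is read from userspace: fetch and clear all pending
 * events in one step, so none can be observed half-applied or lost.
 */
static uint32_t drain_events(struct vcpu *v)
{
    return atomic_exchange(&v->queue, 0);
}
```

The vcpu thread would call drain_events() and apply the returned bits to its own state before each entry. Because the drain is a single exchange, an event raised concurrently either lands in this drain or stays in the word for the next one.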

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.
