On 02/08/2010 12:09 AM, Anthony Liguori wrote:
On 02/07/2010 10:31 AM, Avi Kivity wrote:
Only insofar as you don't have to deal with getting at the VM fd.
You can avoid the problem by having the kvm ioctl interface take a
pid or something.
That's a racy interface.
The mechanism itself is racy. That said, pids don't recycle very
quickly, so the chance of running into a practical issue is quite small.
While a low probability of a race is acceptable for a test tool, it
isn't for a kernel interface.
Well, we need to provide a reasonable alternative.
I think this is the sort of thing that really needs to be a utility
that lives outside of qemu. I'm absolutely in favor of exposing
enough internals to let people do interesting things provided it's
reasonably correct.
I agree that's desirable. However, in light of the changeable
gpa->hva->hpa mappings, this may not be feasible.
One might be to use -mempath (which is hacky by itself, but so far we
have no alternative) and use an external tool on the memory object to
poison it. An advantage is that you can use it independently of kvm.
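A rough sketch of what such an external tool might look like, assuming
the memory object is an ordinary file that can be mmap()ed and that the
host kernel has CONFIG_MEMORY_FAILURE. It uses madvise(MADV_HWPOISON),
which needs CAP_SYS_ADMIN. The names poison_page() and page_start() are
made up for illustration, and a hugetlbfs-backed file would need the
huge page size for alignment rather than mmap.PAGESIZE:

```python
import mmap
import os

# Linux value of MADV_HWPOISON; fall back to it if this Python build
# does not expose the constant (assumption: generic Linux ABI).
MADV_HWPOISON = getattr(mmap, "MADV_HWPOISON", 100)


def page_start(offset, page_size):
    """Round a byte offset down to the start of its page."""
    return offset - (offset % page_size)


def poison_page(path, offset):
    """Inject a memory failure on the page of `path` containing `offset`.

    Destructive: the page is marked hardware-poisoned on the host.
    Requires CAP_SYS_ADMIN and Python 3.8+ (for mmap.madvise).
    """
    start = page_start(offset, mmap.PAGESIZE)
    fd = os.open(path, os.O_RDWR)
    try:
        # Map the whole file shared, so the poisoned page is the same one
        # any other process mapping the object sees.
        with mmap.mmap(fd, 0, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as mm:
            mm.madvise(MADV_HWPOISON, start, mmap.PAGESIZE)
    finally:
        os.close(fd)
    return start
```

Nothing here touches kvm, which is the point: the same tool works on any
process that maps the object.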
It would help if the actual requirements were spelled out a bit more.
What exactly needs validating? Do we need to validate that
poisoning a host physical address results in a very particular guest
page getting poisoned?
Is it not enough to just choose a random anonymous memory area within
the qemu process, generate an MCE to that location, and see whether
qemu gets a SIGBUS? If it doesn't, validate that an MCE has been
received in the guest?
/proc/pid/pagemap may help, though that's racy too. If you pick the
largest vma (or use -mempath) you're pretty much guaranteed to hit on
the guest memory area.
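To make that concrete, here is a minimal sketch of the lookup, assuming
4 KiB pages and the documented pagemap entry layout (bit 63 = present,
bit 62 = swapped, bits 0-54 = PFN). The function names are illustrative
only. Newer kernels report a zero PFN to callers without CAP_SYS_ADMIN,
and the result can go stale the moment it is read:

```python
import struct

PAGE_SIZE = 4096  # assumption: 4 KiB base pages on the host


def largest_vma(maps_text):
    """Return (start, end) of the largest mapping in /proc/<pid>/maps text."""
    best = None
    for line in maps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if best is None or end - start > best[1] - best[0]:
            best = (start, end)
    return best


def decode_pagemap_entry(raw):
    """Decode one 8-byte pagemap entry into (present, swapped, pfn)."""
    (value,) = struct.unpack("=Q", raw)
    present = bool((value >> 63) & 1)
    swapped = bool((value >> 62) & 1)
    # The PFN field is only meaningful for pages resident in RAM.
    pfn = value & ((1 << 55) - 1) if present and not swapped else None
    return present, swapped, pfn


def lookup_pfn(pid, vaddr):
    """Read the pagemap entry covering `vaddr` in process `pid`."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * 8)  # one 64-bit entry per page
        return decode_pagemap_entry(f.read(8))
```

A caller would feed the contents of /proc/<pid>/maps to largest_vma()
and then call lookup_pfn() on an address inside the returned range.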
But FWIW, I think a set of per-VM directories in sysfs could be very
useful for this sort of debugging.
Maybe we should consider having the equivalent of a QMP-for-debugging
session. This would be a special QMP session, there specifically for
debugging, for which we provide basically no compatibility or even
sanity guarantees. I would expect it to be disabled in any production
build (perhaps even by default in the general build).
We have 'info cpus' that shows the vcpu->thread mappings, allowing
management to pin cpus. Why not have 'info memory' that shows guest
numa nodes and host virtual addresses? The migrate_pages() syscall
takes a pid so it can be used by qemu's controller to load-balance a
numa machine, and this can also be used by the poisoner to do its work.
--
error compiling committee.c: too many arguments to function