On Sun, 2009-04-12 at 02:59 -0400, Christoffer Dall wrote:
> Hi Hollis.
> 
> We are about to begin integration between QEMU and our KVM module for
> ARM, but we have a few architectural questions, for which we would
> very much appreciate your view based on your experience with the PPC
> implementation.

Please also CC a mailing list or two so that more people can help you,
or correct or elaborate on my answers...

By the way, if you have working kernel and userspace code (even
incomplete), you should consider posting it now, both to get feedback
and to attract other developers or users. Right now it doesn't look like
there's any KVM ARM development going on at all, so potential users
might look elsewhere and never come back.

> First, it's our impression that we only need to implement the MMU
> functionality in the KVM module and QEMU takes care of the rest.
> However, we can see that the decrementer unit is implemented in KVM
> and not in QEMU. Is it not possible to let a timer reside in QEMU and
> inject an interrupt into the guest through KVM_INTERRUPT when needed?

It is possible. However, in PowerPC, the decrementer timer is part of
the core. I believe on other architectures it's common for it to be an
off-core device, so there it would make sense to use qemu's emulation.
x86 KVM does this with the PIT, for example. 
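
For the off-core case, the qemu side could look roughly like the sketch
below. KVM_INTERRUPT and struct kvm_interrupt come from <linux/kvm.h>;
the helper names and TIMER_IRQ are made up for illustration, and the
meaning of the irq number is architecture-specific, so treat this as an
outline under those assumptions rather than what any existing port does.

```c
#include <assert.h>
#include <errno.h>       /* callers inspect errno when -1 is returned */
#include <linux/kvm.h>   /* KVM_INTERRUPT, struct kvm_interrupt */
#include <sys/ioctl.h>

/* Hypothetical interrupt number for the emulated timer; the real value
 * and its interpretation depend on the architecture. */
#define TIMER_IRQ 1u

/* Ask KVM to deliver interrupt `irq` to the vcpu behind `vcpu_fd`.
 * Returns 0 on success, or -1 with errno set on failure. */
int inject_guest_irq(int vcpu_fd, unsigned int irq)
{
    struct kvm_interrupt intr = { .irq = irq };
    return ioctl(vcpu_fd, KVM_INTERRUPT, &intr);
}

/* Hypothetical callback that a userspace timer model would invoke
 * when its virtual timer expires. */
int on_virtual_timer_expired(int vcpu_fd)
{
    return inject_guest_irq(vcpu_fd, TIMER_IRQ);
}
```

The vcpu file descriptor would be the one returned by KVM_CREATE_VCPU.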

(To complicate things slightly, x86 KVM emulates a few devices in the
kernel instead of using Qemu's emulation, in particular the PIT and the
APIC. They do this because those devices are accessed relatively
frequently, and they get better performance that way. This is not
entirely uncontroversial though.)

> Second, regarding interrupts in general, it seems that the QEMU
> architecture for delivering / receiving interrupts is generally
> adapted to work with KVM (through the kvm_arch_pre_run function) and
> thus from a high architectural point of view, this 'flow of things'
> does not need to be modified. Is this correct?

Right, you can see the interesting parts in kvm_cpu_exec(), which I
think is pretty easy to read. Basically qemu's KVM code polls qemu state
for pending interrupts. We could add a KVM hook into the qemu "set irq"
path, but instead we check for pending interrupts after the fact. I
think that's to minimize KVM-specific code changes to qemu.
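
To make the "check after the fact" idea concrete, here is a small
self-contained model. Every name in it is hypothetical (it is not
qemu's code), and model_inject() merely stands in for the real
KVM_INTERRUPT ioctl. It also clears the pending flag itself, which is
a simplification: a real device model would lower its own line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU state that device emulation
 * updates through the "set irq" path. */
struct cpu_model {
    bool irq_pending;            /* a device has raised an interrupt */
    bool guest_irqs_on;          /* guest can accept an interrupt now */
    unsigned int injected_count; /* how many injections we performed */
};

/* Stand-in for ioctl(vcpu_fd, KVM_INTERRUPT, ...). */
static void model_inject(struct cpu_model *cpu)
{
    cpu->injected_count++;
}

/* Called just before (re)entering the guest: poll the emulated state
 * and inject if something is pending. Returns true if it injected. */
bool pre_run_check(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    if (cpu->irq_pending && cpu->guest_irqs_on) {
        model_inject(cpu);
        cpu->irq_pending = false; /* simplification, see note above */
        return true;
    }
    return false;
}
```

The point of this shape is that the device models only set a flag and
need no knowledge of KVM; all KVM-specific work stays in the pre-run
hook.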

> As far as we have gathered interrupts are sent to QEMU by signalling
> the process, and on each trap to the host, KVM will detect a pending
> signal, resume the QEMU process, which will deal with the interrupt in
> the main_loop_wait(...) function. Is this far off track?

That's right, I think you've got the basic idea. Qemu sets its file
descriptors to generate signals when host data is available, so that's
how the signals are generated in the first place. (Of course, host data
doesn't *necessarily* cause guest interrupts.)

So with -nographic, typing into the qemu tty could cause qemu's UART
emulation to raise an interrupt to the guest indicating there's stuff to
read from the virtual UART FIFO.
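
For reference, the usual Linux way to make a file descriptor raise a
signal when data arrives is O_ASYNC plus F_SETOWN, which delivers SIGIO.
I'm assuming that is the mechanism in play here; the sketch below is a
minimal standalone version with made-up function names, not qemu's
actual code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes O_ASYNC under strict -std= modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set from the signal handler; only async-signal-safe work happens there. */
static volatile sig_atomic_t io_ready = 0;

static void on_sigio(int sig)
{
    (void)sig;
    io_ready = 1;
}

/* Returns nonzero once SIGIO has been seen. */
int sigio_seen(void)
{
    return io_ready != 0;
}

/* Arrange for SIGIO to be sent to this process when `fd` becomes
 * readable. Returns 0 on success, -1 on failure (errno is set). */
int enable_sigio(int fd)
{
    struct sigaction sa;
    int flags;

    assert(fd >= 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) < 0)
        return -1;
    if (fcntl(fd, F_SETOWN, getpid()) < 0)  /* who receives the signal */
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_ASYNC) < 0)
        return -1;
    return 0;
}
```

A main loop would then notice the flag (or the interrupted system call)
and go read the descriptor; whether that data ends up raising a guest
interrupt is up to the device emulation.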

> Third, it seems that the PPC implementation only transfers a subset of
> the CPUPPCState fields to the kvm_regs struct and subsequently only a
> subset of the kvm_vcpu_arch struct. Is it possible in short to
> summarize how to determine what it is necessary to transfer?

The PowerPC implementation should transfer more state than it currently
does. What we do now is functional, but we completely omit FPU state and
a number of SPRs. If we want to do full-guest debugging with the Qemu
debugger, we'd need to fill in the rest. This is somewhat complicated by
the differing supervisor register sets found in different PowerPC
implementations (e.g. 440 vs e500 vs 970).

-- 
Hollis Blanchard
IBM Linux Technology Center