On Mon, 2007-06-25 at 21:54 +0800, Dong, Eddie wrote:
>
> >-----Original Message-----
> >From: Gregory Haskins [mailto:[EMAIL PROTECTED]
> >Sent: June 25, 2007 21:43
> >To: [EMAIL PROTECTED]
> >Cc: Dong, Eddie; kvm-devel@lists.sourceforge.net
> >Subject: Re: [kvm-devel] In kernel PIC support: kernel patch
> >
> >On Sat, 2007-06-23 at 20:41 +0300, Avi Kivity wrote:
> >> Dong, Eddie wrote:
> >> > 3: IOAPIC position
> >> > Though it looks neutral to have this one in user or kernel space,
> >> > I'd like to suggest we only support one model. Considering the
> >> > future VT-d case, the hypervisor needs to inject an IRQ directly in
> >> > KVM (still through the IOAPIC) without going to user level, so
> >> > moving the IOAPIC to the kernel is probably good on this point.
> >> >
> >>
> >> Even paravirt device drivers will want to inject IRQs via the ioapic
> >> (when the guest is not paravirt_ops enabled, like older Linux and
> >> Windows).
> >
> >Note that it's probably not a requirement to do so.  The IOAPIC
> >essentially provides a standard mechanism to map input "irq pins" to
> >APIC messages.  A pv driver could conceivably call kvm_apicbus_send()
> >directly if it knew its vector instead of calling ioapic_setpin() and it
> >would achieve the same goal.
>
> Basically we are talking about the same thing. Calling kvm_apicbus_send
> or raising a virtual ioapic pin is basically the same. The minor
> difference is who does the translation from pin to APIC message: the
> device model or the IOAPIC model.
>
> Using an IOAPIC pin probably makes things a little easier for the PV
> driver writer and may be reused by other VMMs easily in the future.

I fully agree with everything you say here.  I just wanted to point out
that pv devices don't have a hard requirement for an IOAPIC connection, in
case it makes something easier for someone. ;)

>
> >
> >What is nice about using an IOAPIC is that the irq routing can be done
> >for you automatically by any ACPI compliant OS.  The device just has to
> >say "raise my output pin" and the IOAPIC translates that into a vector.
> >If you forgo the IOAPIC in favor of direct APIC messages, you have to
> >solve the problem of irq-routing a different way.
> >
> >Note that there is another standard that would allow us to use built-in
> >routing without ACPI/IOAPIC: MSI.  I know the battle over
> >virtual-bus-rendering is still raging w.r.t. PCI or not-PCI.  But I
> >will point out that if we do use PCI, setting the PV devices up as MSI
> >capable in the config space potentially eliminates the need to also wire
> >them to an IOAPIC.  They will get their vector data from the MSI setup
> >and can then send APIC messages directly.  (I think there is some
> >confusion about how we can do MSI later in this thread...I will reply to
> >those mails separately)
>
> Yes, all you mentioned here is correct, but it doesn't impact the
> necessity of moving the IOAPIC to the kernel, see below.
>
> >
> >Thoughts?
> >
> >
> >> It's probably okay to implement the device side of a block
> >> device in qemu, but more difficult for networking.  If we have device
> >> implementations in the kernel then we'll need an ioapic in the kernel.
> >>
> >> Also, if we end up sharing interrupts between the kernel and userspace,
> >> we'll need the kernel to perform the OR between the level specified by
> >> the kernel devices and the level specified by userspace.
> >
> >I'm not sure I fully understand what you are getting at here.  It sounds
> >like you are talking about splitting a single IOAPIC into both a user
> >and kernel space device?
> >If we do end up needing the IOAPIC in the
>
> No, the IOAPIC is in the kernel, but some devices such as IDE are in
> user space while others, such as the pv NIC, are in the kernel.
>
> Due to this, a kernel pv driver will ask for IOAPIC service, so we have
> to keep it in the kernel; otherwise we have to go back to user space when
> a kernel device fires, which is very slow.
>
> But, like you mentioned, [...] it helps, but then 1: we have to implement
> MSI now, 2: it is a little bit more complicated to write the pv driver.

Ok, I think I understand what Avi was saying now.  The "or" operation is
in reference to a shared irq line coming into the IOAPIC, i.e. we need to
maintain a logical OR against the current level of a particular line.  I
thought he was talking about maintaining internal IOAPIC state between
user/kern, which is probably messier than two IOAPICs. ;)

>
> >kernel and userspace, I would suggest using two IOAPICs (we just need to
> >update the ACPI tables to say there are two) so that they can operate
> >independently.  This should be supportable from a virtual system
>
> It can, but it is a little bit more complicated, and live migration
> between KVM and Qemu would probably not be possible since Qemu only has
> one IOAPIC.

Good point, but is that a design requirement?

>
> >infrastructure standpoint, and is much cleaner IMHO.  If you were
> >talking about a different scenario, please clarify.
> >
>
> Thx, eddie
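
For illustration only, here is a minimal sketch of the "logical OR" on a shared irq line discussed above. It assumes each pin keeps one bit per source (a userspace device model and an in-kernel device) and that the effective line level is the OR of those bits. Every name here (sketch_ioapic, sketch_set_source, and so on) is hypothetical; this is not the KVM code, and it does not use the signatures of kvm_apicbus_send() or ioapic_setpin().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pin count, chosen only for this sketch. */
#define SKETCH_NUM_PINS 24

/* Hypothetical sources that may drive the same shared irq line. */
enum sketch_irq_source {
    SKETCH_SRC_USER = 0,   /* e.g. IDE emulated in userspace */
    SKETCH_SRC_KERNEL = 1  /* e.g. an in-kernel pv NIC */
};

/* One bit per source, per pin. The line is high if any bit is set. */
struct sketch_ioapic {
    uint32_t asserted[SKETCH_NUM_PINS];
};

/* Effective level of a pin: logical OR over all sources. */
static bool sketch_pin_level(const struct sketch_ioapic *io, unsigned pin)
{
    assert(pin < SKETCH_NUM_PINS);
    return io->asserted[pin] != 0;
}

/*
 * Record the level requested by one source and report whether the
 * effective line went from low to high. A real level-triggered model
 * would also re-deliver after EOI while the line stays high; that is
 * left out of this sketch.
 */
static bool sketch_set_source(struct sketch_ioapic *io, unsigned pin,
                              enum sketch_irq_source src, bool level)
{
    bool was_high = sketch_pin_level(io, pin);

    if (level)
        io->asserted[pin] |= 1u << src;
    else
        io->asserted[pin] &= ~(1u << src);

    return !was_high && sketch_pin_level(io, pin);
}
```

With this layout, a userspace device lowering its level does not lower the shared line while the in-kernel device still asserts it. That is the behaviour the OR is meant to preserve.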