On 03/08/2010 07:57 PM, Cam Macdonell wrote:
Can you provide a spec that describes the device? This would be useful for
maintaining the code, writing guest drivers, and as a framework for review.
I'm not sure if you want the Qemu command-line part as part of the
spec here, but I've included it for completeness.
I meant something from the guest's point of view, so command line syntax
is less important. It should be equally applicable to a real PCI card
that works with the same driver.
See http://ozlabs.org/~rusty/virtio-spec/ for an example.
The Inter-VM Shared Memory PCI device
-----------------------------------------------------------
BARs
The device supports two BARs. BAR0 is a 256-byte MMIO region to
support registers
(but might be extended in the future)
and BAR1 is used to map the shared memory object from the host. The size of
BAR1 is specified on the command-line and must be a power of 2.
Registers
BAR0 currently supports 5 registers of 16-bits each.
Suggest making registers 32-bits, friendlier towards non-x86.
Registers are used
for synchronization between guests sharing the same memory object when
interrupts are supported (this requires using the shared memory server).
How does the driver detect whether interrupts are supported or not?
When using interrupts, VMs communicate with a shared memory server that passes
the shared memory object file descriptor using SCM_RIGHTS. The server assigns
each VM an ID number and sends this ID number to the Qemu process along with a
series of eventfd file descriptors, one per guest using the shared memory
server. These eventfds will be used to send interrupts between guests. Each
guest listens on the eventfd corresponding to its ID and may use the others
for sending interrupts to other guests.
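For readers unfamiliar with SCM_RIGHTS, here is a minimal sketch of passing a single file descriptor over a Unix domain socket, which is the mechanism the paragraph above relies on. The helper names send_one_fd() and recv_one_fd() are hypothetical, and the one-byte payload is only a placeholder: the spec does not describe the server's actual wire format (how the ID and the eventfds are framed), so this shows the mechanism, not the protocol.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send 'fd' over the Unix socket 'sock'.
 * Returns 0 on success, -1 on error. */
static int send_one_fd(int sock, int fd)
{
    char data = 0;                       /* placeholder payload byte */
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from 'sock'.
 * Returns the new fd, or -1 on error. */
static int recv_one_fd(int sock)
{
    char data;
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}
```

The received descriptor is a new fd in the receiving process that refers to the same open file as the sender's, which is what lets each Qemu process end up with the shared memory object and the eventfds.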
enum ivshmem_registers {
IntrMask = 0,
IntrStatus = 2,
Doorbell = 4,
IVPosition = 6,
IVLiveList = 8
};
The first two registers are the interrupt mask and status registers.
Interrupts are triggered when a message is received on the guest's eventfd from
another VM. Writing to the 'Doorbell' register is how synchronization messages
are sent to other VMs.
The IVPosition register is read-only and reports the guest's ID number. The
IVLiveList register is also read-only and reports a bit vector of currently
live VM IDs.
That limits the number of guests to 16.
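To make the 16-guest limit concrete, here is a small sketch of how a guest might decode a 16-bit IVLiveList value. It assumes that bit n of the register corresponds to VM ID n, which the spec above does not state explicitly; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 16-bit bit vector can describe at most 16 guest IDs (0..15). */
#define IVSHMEM_MAX_GUESTS 16u

/* Assumption: bit 'id' set means the guest with that ID is live. */
static bool ivshmem_guest_is_live(uint16_t live_list, unsigned int id)
{
    if (id >= IVSHMEM_MAX_GUESTS)
        return false;               /* ID cannot be represented at all */
    return ((live_list >> id) & 1u) != 0;
}

/* Count live guests by clearing the lowest set bit until none remain. */
static unsigned int ivshmem_live_count(uint16_t live_list)
{
    unsigned int n = 0;

    while (live_list != 0) {
        live_list &= (uint16_t)(live_list - 1);
        n++;
    }
    return n;
}
```

Any ID of 16 or above simply has no bit to live in, even though the 8-bit destination field of the Doorbell register could address it.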
The Doorbell register is 16-bits, but is treated as two 8-bit values. The
upper 8-bits are used for the destination VM ID. The lower 8-bits are the
value which will be written to the destination VM and what the guest status
register will be set to when the interrupt is triggered in the destination guest.
What happens when two interrupts are sent back-to-back to the same
guest? Will the first status value be lost?
Also, reading the status register requires a vmexit. I suggest dropping
it and requiring the application to manage this information in the
shared memory area (where it could do proper queueing of multiple messages).
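As a rough illustration of what I mean by queueing in the shared memory area, here is a minimal single-producer, single-consumer ring using C11 atomics. Everything in it is an assumption made for the sketch: the struct layout, the names, one ring per sender/receiver pair, and both guests seeing the memory coherently with the same atomic semantics. The doorbell would then only mean "look at your queue", so back-to-back messages are not lost.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MBOX_SLOTS 16u              /* must be a power of two */

/* Hypothetical mailbox placed somewhere in the shared memory (BAR1). */
struct mbox {
    _Atomic uint32_t head;          /* next slot the consumer will read  */
    _Atomic uint32_t tail;          /* next slot the producer will write */
    uint8_t slot[MBOX_SLOTS];
};

/* Producer side: returns false if the ring is full. */
static bool mbox_push(struct mbox *m, uint8_t v)
{
    uint32_t tail = atomic_load_explicit(&m->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&m->head, memory_order_acquire);

    if (tail - head == MBOX_SLOTS)
        return false;
    m->slot[tail % MBOX_SLOTS] = v;
    /* Publish the slot contents before the new tail becomes visible. */
    atomic_store_explicit(&m->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool mbox_pop(struct mbox *m, uint8_t *v)
{
    uint32_t head = atomic_load_explicit(&m->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&m->tail, memory_order_acquire);

    if (head == tail)
        return false;
    *v = m->slot[head % MBOX_SLOTS];
    /* Release the slot back to the producer. */
    atomic_store_explicit(&m->head, head + 1, memory_order_release);
    return true;
}
```

The sender would call mbox_push() and then ring the doorbell; the receiver's interrupt handler drains the ring with mbox_pop() until it returns false, with no status register read and so no extra vmexit.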
A value of 255 in the upper 8-bits will trigger a broadcast where the message
will be sent to all other guests.
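The Doorbell encoding described above (destination ID in the upper 8 bits, value in the lower 8 bits, 255 for broadcast) can be sketched as follows. The helper names and the 'bar0' pointer are hypothetical: 'bar0' is assumed to be BAR0 already mapped by the driver, and a real driver would use its platform's MMIO accessors rather than a plain volatile store.

```c
#include <stdint.h>

#define IVSHMEM_DOORBELL_OFF 4      /* 'Doorbell = 4' in the register enum */
#define IVSHMEM_BROADCAST_ID 0xff   /* destination 255 means all other guests */

/* Pack destination VM ID (upper 8 bits) and value (lower 8 bits). */
static inline uint16_t ivshmem_doorbell_value(uint8_t dest_id, uint8_t value)
{
    return (uint16_t)(((unsigned int)dest_id << 8) | value);
}

/* Unpack helpers, e.g. for a device model decoding a guest's write. */
static inline uint8_t ivshmem_doorbell_dest(uint16_t db)
{
    return (uint8_t)(db >> 8);
}

static inline uint8_t ivshmem_doorbell_msg(uint16_t db)
{
    return (uint8_t)(db & 0xff);
}

/* Hypothetical guest-side helper: write the packed value to Doorbell. */
static inline void ivshmem_ring(volatile uint8_t *bar0, uint8_t dest_id,
                                uint8_t value)
{
    volatile uint16_t *doorbell =
        (volatile uint16_t *)(bar0 + IVSHMEM_DOORBELL_OFF);

    *doorbell = ivshmem_doorbell_value(dest_id, value);
}
```

For example, ivshmem_ring(bar0, 3, 0x42) writes 0x0342 to the Doorbell register, and ivshmem_ring(bar0, IVSHMEM_BROADCAST_ID, 1) sends the value 1 to every other guest.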
Please consider adding:
- MSI support
- interrupt on a guest attaching/detaching to the shared memory device
With MSI you could also have the doorbell specify both guest ID and
vector number, which may be useful.
Thanks for this - it definitely makes reviewing easier.
--
error compiling committee.c: too many arguments to function