ron minnich wrote:
> Avoiding too much detail, in the plan 9 world, read and write of data
> to a disk is via file read and write system calls.

For low-speed devices, I think paravirtualization doesn't make a lot of 
sense unless it's absolutely required.  I don't know enough about s390 
to know whether it supports things like UARTs, but if so, emulating a 
UART would in my mind make a lot more sense than a PV console device.

>  Same for a network.
> Same for the mouse, the window system, the serial port, the console,
> USB, and so on. Please see this note from IBM on what is
> possible: http://domino.watson.ibm.com/library/CyberDig.nsf/0/c6c779bbf1650fa4852570670054f3ca?OpenDocument
> or http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf
> Different resources, same interface. In the hypervisor world, you
> build one shared memory queue as a basic abstraction. On top of that
> queue, you run 9P. The provider (network, block device, etc.) provides
> certain resources to you, the guest domain. The resources have names. A
> network can look like this, to a kvm guest (this command from a Plan 9
> system):
> cpu% ls /net/ether0
> /net/ether0/0
> /net/ether0/1
> /net/ether0/2
> /net/ether0/addr
> /net/ether0/clone
> /net/ether0/ifstats
> /net/ether0/stats
>   
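
(As an aside, for anyone unfamiliar with the Plan 9 idiom ron is 
showing: gaining a conversation on that interface is plain file I/O. 
The clone/data convention below is genuine Plan 9 practice, but the 
exact paths and the error handling are illustrative, not lifted from 
the manuals.)

/* Minimal sketch: opening a fresh conversation on a Plan 9-style
 * /net/ether0 interface using nothing but file operations. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buf[32], path[64];
    int n, ctlfd, datafd;

    /* Reading "clone" allocates a conversation and yields its number. */
    ctlfd = open("/net/ether0/clone", O_RDWR);
    if (ctlfd < 0) { perror("clone"); return 1; }
    n = read(ctlfd, buf, sizeof buf - 1);
    if (n <= 0) { perror("read clone"); return 1; }
    buf[n] = '\0';

    /* The conversation's "data" file carries the actual frames. */
    snprintf(path, sizeof path, "/net/ether0/%d/data", atoi(buf));
    datafd = open(path, O_RDWR);
    if (datafd < 0) { perror("data"); return 1; }

    /* read()/write() on datafd now move ethernet frames. */
    close(datafd);
    close(ctlfd);
    return 0;
}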

This sort of filesystem interface smells a bit like XenStore, which I 
think most will agree was an unmitigated disaster.  This sort of thing 
gets terribly complicated to deal with in the corner cases.  Atomicity 
across multiple read/write operations is difficult to express.  
Moreover, quite a lot of things are naturally expressed as a state 
machine, which is not straightforward to do in this sort of model.  
This may have all been figured out in 9P, but it's certainly not a 
simple thing to get right.

I think a general rule of thumb for a virtualized environment is that 
the closer you stick to the way hardware tends to do things, the less 
likely you are to screw yourself up, and the easier it will be for 
other platforms to support your devices.  Implementing a full 9P client 
just to get console access in something like mini-os would be 
unfortunate.  At least the posted s390 console driver behaves roughly 
like a UART, so it's pretty obvious that it will be easy to implement 
in any OS that already supports UARTs.
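
To put a point on that: a polled UART console write is a handful of 
lines in any OS.  A rough sketch follows, using the conventional PC 
16550 port and status bit (0x3f8 and THRE) -- the s390 driver 
presumably uses different registers; the point is only how little is 
required:

/* Rough sketch of a polled console write for a 16550-style UART.
 * 0x3f8 and the THRE bit are the conventional PC values; they are
 * used here only to show how small a UART console driver is. */
#include <stdint.h>

#define UART_BASE 0x3f8
#define UART_LSR  (UART_BASE + 5)   /* line status register */
#define LSR_THRE  0x20              /* transmit holding reg empty */

static inline uint8_t inb(uint16_t port)
{
    uint8_t v;
    __asm__ volatile ("inb %1, %0" : "=a"(v) : "Nd"(port));
    return v;
}

static inline void outb(uint16_t port, uint8_t v)
{
    __asm__ volatile ("outb %0, %1" : : "a"(v), "Nd"(port));
}

void console_putc(char c)
{
    /* Spin until the UART can take another byte, then send it. */
    while (!(inb(UART_LSR) & LSR_THRE))
        ;
    outb(UART_BASE, (uint8_t)c);
}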

Regards,

Anthony Liguori

> To get network stats, or to do I/O, one simply gains access to the
> appropriate ring buffer by finding its name, then performs sends and
> receives over shared-memory queues. The I/O operations can be very
> efficient.
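
For concreteness, the shared-memory queue underneath is presumably the 
familiar single-producer/single-consumer ring.  A bare-bones sketch of 
the send side follows; every name in it is made up, and the 
notification/doorbell side is glossed over entirely:

/* Bare-bones single-producer/single-consumer ring in shared memory.
 * Illustrative only: a real design also needs a notification
 * mechanism and more careful memory-ordering discipline. */
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 256                  /* must be a power of two */

struct slot {
    uint32_t len;
    uint8_t  data[2048];
};

struct ring {
    volatile uint32_t prod;             /* advanced by producer only */
    volatile uint32_t cons;             /* advanced by consumer only */
    struct slot slots[RING_SLOTS];
};

/* Returns 0 on success, -1 if the ring is full or the buffer too big. */
int ring_send(struct ring *r, const void *buf, uint32_t len)
{
    uint32_t p = r->prod;

    if (p - r->cons == RING_SLOTS || len > sizeof r->slots[0].data)
        return -1;

    struct slot *s = &r->slots[p & (RING_SLOTS - 1)];
    memcpy(s->data, buf, len);
    s->len = len;
    __sync_synchronize();               /* publish data before index */
    r->prod = p + 1;
    return 0;
}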
>
> Disk looks like this:
> cpu% ls -l /dev/sdC0
> --rw-r----- S 0 bootes bootes   104857600 Jan 22 15:49 /dev/sdC0/9fat
> --rw-r----- S 0 bootes bootes 65361213440 Jan 22 15:49 /dev/sdC0/arenas
> --rw-r----- S 0 bootes bootes           0 Jan 22 15:49 /dev/sdC0/ctl
> --rw-r----- S 0 bootes bootes 82348277760 Jan 22 15:49 /dev/sdC0/data
> --rw-r----- S 0 bootes bootes 13072242688 Jan 22 15:49 /dev/sdC0/fossil
> --rw-r----- S 0 bootes bootes  3268060672 Jan 22 15:49 /dev/sdC0/isect
> --rw-r----- S 0 bootes bootes         512 Jan 22 15:49 /dev/sdC0/nvram
> --rw-r----- S 0 bootes bootes 82343245824 Jan 22 15:49 /dev/sdC0/plan9
> -lrw------- S 0 bootes bootes           0 Jan 22 15:49 /dev/sdC0/raw
> --rw-r----- S 0 bootes bootes   536870912 Jan 22 15:49 /dev/sdC0/swap
> cpu%
>
> So the disk partitions are "files", with the "data" file being the
> whole disk. Again, on a hypervisor system, to do I/O, software could
> create a connection to the "file" and establish the in-memory ring
> buffer for that partition. This I/O can be very efficient; IBM
> research is working on zero-copy mechanisms for moving data between
> domains.
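
From the guest's side, then, a partition read really is just a read.  
A sketch -- the path comes from the listing above, everything else is 
illustrative, and in the hypervisor case the bytes would ride the 
shared-memory queue rather than hit a local driver:

/* Sketch: reading one 4k block from the "plan9" partition file
 * shown in the listing. The offset and block size are arbitrary. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char block[4096];
    int fd = open("/dev/sdC0/plan9", O_RDONLY);

    if (fd < 0) { perror("open"); return 1; }

    /* Offsets are relative to the partition, not the whole disk. */
    if (pread(fd, block, sizeof block, 8 * 4096L) < 0) {
        perror("pread");
        return 1;
    }
    close(fd);
    return 0;
}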
>
> The result is a single, consistent mechanism for accessing all
> resources from a guest domain. The resources have names, and it is
> easy to examine the status -- binary interfaces can be minimized. The
> resources can be provided by in-kernel servers -- Linux drivers -- or
> out-of-kernel servers -- processes. Same interface, and yet the
> implementation of the provider of the resource can be utterly
> different.
>
> We had hoped to get something like this into Xen. On Xen, for example,
> the block device and ethernet device interfaces are as different as
> one could imagine. Disk I/O does not steal pages from the guest. The
> network does. Disk I/O is in 4k chunks, period, with a bitmap
> describing which of the 8 512-byte subunits are being sent. The enet
> device, on read, returns a page containing your packet, but potentially
> bits of other domains' packets too. The interfaces are as
> dissimilar as they can be, and I see no reason for such a huge
> variance between what are basically read/write devices.
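
For reference, the 4k-granular block request ron is describing would 
look something like this as a ring descriptor.  This is a paraphrase 
of his description, not the literal Xen blkif ABI:

/* Illustration of a 4k-chunk block request: one page per segment,
 * with a mask selecting which of the eight 512-byte subunits in
 * that page are actually meant. Field names are invented. */
#include <stdint.h>

struct blk_segment {
    uint64_t page;          /* grant reference or pfn of the data page */
    uint8_t  sector_mask;   /* which of the 8 512-byte subunits count */
};

struct blk_request {
    uint8_t  operation;     /* read or write */
    uint8_t  nr_segments;
    uint64_t sector;        /* starting sector on the device */
    struct blk_segment seg[11];
};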
>
> Another issue is that kvm, in its current form (-24), is beautifully
> simple. These additions seem to detract from that beauty a bit. Might
> it be worth taking a little time to consider these ideas in order to
> preserve the basic elegance of KVM?
>
> So, before we go too far down the Xen-like paravirtualized device
> route, can we discuss the way this ought to look a bit?
>
> thanks
>
> ron
>

