ron minnich wrote:
> Avoiding too much detail, in the plan 9 world, read and write of data
> to a disk is via file read and write system calls.

For low speed devices, I think paravirtualization doesn't make a lot of
sense unless it's absolutely required. I don't know enough about s390 to
know if it supports things like uarts but if so, then emulating a uart
would in my mind make a lot more sense than a PV console device.

> Same for a network.
> Same for the mouse, the window system, the serial port, the console,
> USB, and so on. Please see this note from IBM on what is
> possible:
> http://domino.watson.ibm.com/library/CyberDig.nsf/0/c6c779bbf1650fa4852570670054f3ca?OpenDocument
> or http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf
>
> Different resources, same interface. In the hypervisor world, you
> build one shared memory queue as a basic abstraction. On top of that
> queue, you run 9P. The provider (network, block device, etc.) provides
> certain resources to you, the guest domain. The resources have names. A
> network can look like this, to a kvm guest (this command from a Plan 9
> system):
> cpu% ls /net/ether0
> /net/ether0/0
> /net/ether0/1
> /net/ether0/2
> /net/ether0/addr
> /net/ether0/clone
> /net/ether0/ifstats
> /net/ether0/stats
>

This smells a bit like XenStore which I think most will agree was an
unmitigated disaster. This sort of thing gets terribly complicated to
deal with in the corner cases. Making a sequence of several read/write
operations atomic is difficult to express. Moreover, quite a lot of
things are naturally expressed as a state machine, which is not
straightforward to do in this sort of model. This may have been all
figured out in 9P but it's certainly not a simple thing to get right.

I think a general rule of thumb for a virtualized environment is that
the closer you stick to the way hardware tends to do things, the less
likely you are to screw yourself up and the easier it will be for other
platforms to support your devices. Implementing a full 9P client just to
get console access in something like mini-os would be unfortunate.
At least the posted s390 console driver behaves roughly like a uart so
it's pretty obvious that it will be easy to implement in any OS that
supports uarts already.

Regards,

Anthony Liguori

> To get network stats, or do I/O, one simply gains access to the
> appropriate ring buffer, by finding the name, and does the ring buffer
> sends and receives via shared memory queues. The I/O operations can be
> very efficient.
>
> Disk looks like this:
> cpu% ls -l /dev/sdC0
> --rw-r----- S 0 bootes bootes 104857600 Jan 22 15:49 /dev/sdC0/9fat
> --rw-r----- S 0 bootes bootes 65361213440 Jan 22 15:49 /dev/sdC0/arenas
> --rw-r----- S 0 bootes bootes 0 Jan 22 15:49 /dev/sdC0/ctl
> --rw-r----- S 0 bootes bootes 82348277760 Jan 22 15:49 /dev/sdC0/data
> --rw-r----- S 0 bootes bootes 13072242688 Jan 22 15:49 /dev/sdC0/fossil
> --rw-r----- S 0 bootes bootes 3268060672 Jan 22 15:49 /dev/sdC0/isect
> --rw-r----- S 0 bootes bootes 512 Jan 22 15:49 /dev/sdC0/nvram
> --rw-r----- S 0 bootes bootes 82343245824 Jan 22 15:49 /dev/sdC0/plan9
> -lrw------- S 0 bootes bootes 0 Jan 22 15:49 /dev/sdC0/raw
> --rw-r----- S 0 bootes bootes 536870912 Jan 22 15:49 /dev/sdC0/swap
> cpu%
>
> So the disk partitions are "files", with the "data" file being the
> whole disk. Again, on a hypervisor system, to do I/O, software could
> create a connection to the "file" and establish the in-memory ring
> buffer, for that partition. This I/O can be very efficient; IBM
> research is working on zero-copy mechanisms for moving data between
> domains.
>
> The result is a single, consistent mechanism for accessing all
> resources from a guest domain. The resources have names, and it is
> easy to examine the status -- binary interfaces can be minimized. The
> resources can be provided by in-kernel servers -- Linux drivers -- or
> out-of-kernel servers -- processes. Same interface, and yet the
> implementation of the provider of the resource can be utterly
> different.
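
[Illustration of the "ring buffer over a shared memory queue" idea in the
quoted text above. This is only a rough sketch under stated assumptions:
the Ring class, its fixed-size slots, and the send/recv names are
hypothetical and are not taken from Plan 9, 9P, Xen, or KVM. A plain
bytearray stands in for memory that would really be shared between the
guest and the provider.]

```python
class Ring:
    """Single-producer, single-consumer ring of fixed-size slots (hypothetical)."""

    def __init__(self, slots, slot_size):
        self.slots = slots
        self.slot_size = slot_size
        # Stand-in for a region shared between guest and provider.
        self.buf = bytearray(slots * slot_size)
        self.lengths = [0] * slots
        self.head = 0  # total messages written; advanced only by the producer
        self.tail = 0  # total messages read; advanced only by the consumer

    def send(self, payload):
        """Copy payload into the next free slot; return False if the ring is full."""
        if len(payload) > self.slot_size:
            raise ValueError("payload larger than one slot")
        if self.head - self.tail == self.slots:
            return False
        i = self.head % self.slots
        off = i * self.slot_size
        self.buf[off:off + len(payload)] = payload
        self.lengths[i] = len(payload)
        self.head += 1  # publish the slot only after the data is in place
        return True

    def recv(self):
        """Return the oldest unread payload, or None if the ring is empty."""
        if self.tail == self.head:
            return None
        i = self.tail % self.slots
        off = i * self.slot_size
        data = bytes(self.buf[off:off + self.lengths[i]])
        self.tail += 1  # release the slot back to the producer
        return data
```

[Because each index is written by only one side, the producer and the
consumer never update the same counter, which is the usual reason this
shape is chosen for cross-domain queues. Memory barriers and
notification, which a real implementation would need, are left out.]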
>
> We had hoped to get something like this into Xen. On Xen, for example,
> the block device and ethernet device interfaces are as different as
> one could imagine. Disk I/O does not steal pages from the guest. The
> network does. Disk I/O is in 4k chunks, period, with a bitmap
> describing which of the 8 512-byte subunits are being sent. The enet
> device, on read, returns a page with your packet, but also potentially
> containing bits of other domains' packets too. The interfaces are as
> dissimilar as they can be, and I see no reason for such a huge
> variance between what are basically read/write devices.
>
> Another issue is that kvm, in its current form (-24) is beautifully
> simple. These additions seem to detract from the beauty a bit. Might
> it be worth taking a little time to consider these ideas in order to
> preserve the basic elegance of KVM?
>
> So, before we go too far down the Xen-like paravirtualized device
> route, can we discuss the way this ought to look a bit?
>
> thanks
>
> ron
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> kvm-devel mailing list
> kvm-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/kvm-devel
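
[Illustration of the block-request shape described in the quoted text: a
4k chunk with a bitmap marking which of its 8 512-byte subunits are being
sent. This is a sketch of that description only, not the Xen block
interface itself. The function names are hypothetical, and the choice
that bit i stands for subunit i is an assumption.]

```python
SECTOR = 512
SECTORS_PER_CHUNK = 8  # 8 * 512 bytes = one 4k chunk


def sector_bitmap(offset, length):
    """Return an 8-bit mask of the subunits covered by [offset, offset + length).

    offset and length are byte counts within a single 4k chunk and must be
    multiples of 512. Bit i set means subunit i is part of the transfer
    (assumed ordering).
    """
    if offset % SECTOR or length % SECTOR:
        raise ValueError("offset and length must be multiples of 512")
    if offset < 0 or length <= 0 or offset + length > SECTOR * SECTORS_PER_CHUNK:
        raise ValueError("range must lie inside one 4k chunk")
    first = offset // SECTOR
    count = length // SECTOR
    return ((1 << count) - 1) << first


def sectors_in(mask):
    """List the subunit indices selected by an 8-bit mask."""
    return [i for i in range(SECTORS_PER_CHUNK) if mask & (1 << i)]
```

[For example, a 1024-byte transfer starting 512 bytes into the chunk
selects subunits 1 and 2, and a full 4k transfer sets all eight bits.]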