Hi,

This set of patches enables QEMU to boot VM images from gluster volumes. This is achieved by adding gluster as a new block backend driver in QEMU. It is already possible to boot VM images on gluster volumes using a FUSE mount, but this patchset provides the ability to boot VM images from gluster volumes by bypassing the FUSE layer in gluster. In case the image is present on the local system, it is possible to even bypass the client and server translators, and hence the RPC overhead.
QEMU with gluster backend support takes the volume file on the command line and links against the libglusterfs library to perform I/O on the image residing on a gluster volume. block/gluster-helpers.c has the bare minimum gluster code necessary for QEMU to boot and work with an image on a gluster volume. I have implemented routines like gluster_create, gluster_open, gluster_aio_readv etc., which will eventually become unnecessary once we have the equivalent routines in libglusterfsclient working. While I have this implementation here, we are also actively working on resurrecting libglusterfsclient and using QEMU with it. In addition to the posix routines, block/gluster-helpers.c has some elaborate lookup code, which will also become redundant with libglusterfsclient.

The patches are experimental in nature and I have only verified that I can boot an image from a gluster volume using these patches in fuse-bypass and rpc-bypass modes. I haven't tested with a full-blown volume file (as generated by the gluster CLI), but have always used hand-crafted volume files with just the posix translator in them.

How to use this patchset
========================

1. Compiling GlusterFS

- Get the GlusterFS source from git://git.gluster.com/glusterfs.git
- Compile and install:
  # ./autogen.sh; ./configure; make; make install
- Copy a few required header files and libraries:
  # mkdir /usr/local/include/glusterfs/
  # cp glusterfs/libglusterfs/src/*.h /usr/local/include/glusterfs/
  # cp glusterfs/config.h /usr/local/include/glusterfs/
  # cp glusterfs/contrib/uuid/uuid.h /usr/local/include/glusterfs/

2. Compiling QEMU

- Get the QEMU sources.
- Apply the patches from this patchset.
- Configure:
  # ./configure --disable-werror --target-list=x86_64-softmmu --enable-glusterfs --enable-uuid
- # make; make install

Note: I had to resort to --disable-werror mainly to get past the warnings in block/gluster-helpers.c. I didn't spend too much effort on cleaning this up, since this code will be gone once we have a working libglusterfsclient.
3. Starting the GlusterFS server

# glusterfsd -f s-qemu.vol
# cat s-qemu.vol
volume vm
type storage/posix
option directory /vm
end-volume

volume server
type protocol/server
subvolumes vm
option transport-type tcp
option auth.addr.vm.allow *
end-volume

Here /vm is the directory exported by the server. Ensure that this directory is present before the GlusterFS server is started.

4. Creating the VM image

# qemu-img create -f gluster gluster:c-qemu.vol:/F16 5G
# cat c-qemu.vol
volume client
type protocol/client
option transport-type tcp
option remote-host bharata
option remote-subvolume vm
end-volume

5. Install a distro (say Fedora 16) on the VM image

# qemu-system-x86_64 --enable-kvm -smp 4 -m 1024 -drive file=gluster:c-qemu.vol:/F16,format=gluster -cdrom Fedora-16-x86_64-DVD.iso

After this, follow the normal F16 installation. From next time onwards, the following QEMU command can be used to start the VM directly.

6. Start the VM (fuse-bypass)

# qemu-system-x86_64 --enable-kvm --nographic -smp 4 -m 1024 -drive file=gluster:c-qemu.vol:/F16,format=gluster

6a. Booting the VM in rpc-bypass mode

# cat c-qemu-rpcbypass.vol
volume vm
type storage/posix
option directory /vm
end-volume

# qemu-system-x86_64 --enable-kvm --nographic -smp 4 -m 1024 -drive file=gluster:c-qemu-rpcbypass.vol:/F16,format=gluster

Note that in this case it's not necessary to run a gluster server.

Tests
=====

I have done some initial tests using fio.
Here are the details:

Environment
-----------
Dual core x86_64 laptop
QEMU (f8687bab919672ccd)
GlusterFS (c40b73fc453caf12)
Guest: Fedora 16 (kernel 3.1.0-7.fc16.x86_64)
Host: Fedora 16 (kernel 3.4)
fio-HEAD-47ea504

fio jobfile
-----------
# cat aio-read-direct-seq
; Read 4 files with aio at different depths
[global]
ioengine=libaio
direct=1
rw=read
bs=128k
size=512m
directory=/data1
[file1]
iodepth=4
[file2]
iodepth=32
[file3]
iodepth=8
[file4]
iodepth=16

Base
----
QEMU: qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=/vm/dir1/F16,cache=none

Fuse mount
----------
Server: glusterfsd -f s-qemu.vol
Client: glusterfs -f c-qemu.vol /mnt
QEMU: qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=/mnt/dir1/F16,cache=none

Fuse bypass
-----------
Server: glusterfsd -f s-qemu.vol
QEMU: qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=gluster:/c-qemu.vol:/dir1/F16,format=gluster,cache=none

RPC bypass
----------
QEMU: qemu-system-x86_64 --enable-kvm --nographic -m 1024 -smp 4 -drive file=gluster:/c-qemu-rpcbypass.vol:/dir1/F16,format=gluster,cache=none

Numbers (aggrb, minb and maxb in kB/s; mint and maxt in msec)
-------
              aggrb   minb    maxb    mint    maxt
Base          72916   18229   18945   27673   28761
Fuse mount     8211    2052    3094  169433  255396
Fuse bypass   66591   16647   17806   29444   31493
RPC bypass    70940   17735   18782   27914   29562

Note that these are just indicative numbers and I haven't really tuned QEMU, GlusterFS or fio to achieve the best results. However, with this test we can see that the Fuse mount case is not ideal, and that Fuse bypass and RPC bypass help.

Regards,
Bharata.