Wow great explanation! Thank you Eric!

On Sat, 5 Aug 2017 at 14:59 Eric Green <eric.lee.gr...@gmail.com> wrote:
> qcow2 performance has been historically bad regardless of the underlying storage (it is an absolutely terrible storage format), which is why most OpenStack Kilo and later installations instead usually use managed LVM and present LVM volumes as iSCSI volumes to QEMU, because using raw LVM volumes directly works quite a bit better (especially since you can do "thick" volumes, which get you the best performance, without having to zero out a large file on disk). But Cloudstack doesn't use that paradigm. Still, you can get much better performance with qcow2 regardless:
>
> 1) Create a disk offering that creates 'sparse' qcow2 volumes (the 'sparse' provisioning type). Otherwise every write is actually multiple writes -- one to extend the previous qcow2 file, one to update the inode with the new file size, one to update the qcow2 file's own notion of how long it is and what all of its sections are, and one to write the actual data. And these are all *small* random writes, which SSDs have historically been bad at due to write zones. Note that if you look at a freshly provisioned 'sparse' file in the actual data store, it might look like it's taking up 2 TB of space, but it's actually taking up only a few blocks.
>
> 2) In that disk offering, if you care more about performance than about reliability, set the caching mode to 'writeback'. (The default is 'none'.) This will result in larger writes to the SSD, which it'll do at higher speeds than small writes. The downside is that your hardware and OS had better be *ultra* reliable, with battery backup and clean shutdown in case of power failure, or the data in question is toast if something crashes or the power goes out. So consider how important the data is before selecting this option.
>
> 3) If you have a lot of time and want to pre-provision your disks in full, set the provisioning type in that disk offering to 'fat'. This will pre-zero a qcow2 file of the full size that you selected. Be aware that Cloudstack does this zeroing of a volume commissioned with this offering type *when you attach it to a virtual machine*, not when you create it. So attach it to a "trash" virtual machine first before you attach it to your "real" virtual machine, unless you want a lot of downtime waiting for it to zero. But assuming you have a host filesystem that properly allocates files on a per-extent basis, and the extents match up with the underlying SSD write block size well, you should be able to get within 5% of hardware performance with 'fat' qcow2. (With 'thin' you can still come within 10% of that, which is why 'thin' might be the best for most workloads that require performance; 'thin' doesn't waste space on blocks that have never been written and doesn't tie up your storage system for hours zeroing out a 2 TB qcow2 file, so consider that if you're thinking 'fat'.)
>
> 4) USE XFS AS THE HOST FILESYSTEM FOR THE DATASTORE. ext4 will be *terrible*. I'm not sure what causes the bad interaction between ext4 on the storage host and qcow2, but I've seen it multiple times in my own testing of raw libvirt (no CloudStack). As for btrfs, it will be terrible with regular 'thin' qcow2. There is an interaction between its write cycles and qcow2's write patterns that, as with ext4, causes very slow performance.
> I have not tested sparse qcow2 with btrfs because I don't trust btrfs; it has many design decisions reminiscent of ReiserFS, which ate many Linux filesystems back in the day. I have not tested ZFS. The ZFS on Linux implementation generally has good but not great performance; it was written for reliability, not performance, so it seemed a waste of my time to test it. I may do that this weekend, however, just to see. I inherited a PCIe M.2 SSD, you see, and want to see what having that as the write cache device will do for performance....
>
> 5) For the guest filesystem it really depends on your workload and the guest OS. I love ext4 for reliability inside a virtual machine, because you can't just lose an entire ext4 filesystem (it's based on ext2/ext3, which in turn were created when hardware was much less reliable than today and thus have a lot of features to keep you from losing an entire filesystem just because a few blocks went AWOL), but it's not a very fast filesystem. XFS in my testing has the best performance for virtually all workloads. Generally, I use ext4 for root volumes, and make decisions for data volumes based upon how the performance versus reliability equation works out for me. I have a lot of ext4 filesystems hanging around for data that basically sits there in place without many writes but which I don't want to lose.
>
> For best performance of all, manage this SSD storage *outside* of Cloudstack as a bunch of LVM volumes which are exported to virtual machine guests via LIO (iSCSI). Even 'sparse' LVM volumes perform better than qcow2 'thin' volumes. If you choose to do that, there are some LIO settings that'll make things faster for a write-heavy load. But you'll be managing your volumes manually yourself rather than having Cloudstack do it. Which is OK if you're provisioning a database server under Cloudstack and don't intend to offer this expensive SSD storage to all customers, but obviously doesn't scale to ISP public cloud levels. For that you'll need to figure out how to integrate Cloudstack with something like Cinder, which can do this exporting in an automated fashion.
>
> On Aug 5, 2017, at 09:29, Rodrigo Baldasso <rodr...@loophost.com.br> wrote:
>
> > Yes.. mounting an LVM volume inside the host works great, ~500 MB/s write speed.. inside the guest I'm using ext4 but the speed is around 30 MB/s.
> >
> > - - - - - - - - - - - - - - - - - - -
> > Rodrigo Baldasso - LHOST
> > (51) 9 8419-9861
> > - - - - - - - - - - - - - - - - - - -
> >
> > On 05/08/2017 13:26:00, Ivan Kudryavtsev <kudryavtsev...@bw-sw.com> wrote:
> > Rodrigo, does your fio testing show good results? What filesystem are you using? KVM is known to work very badly over BTRFS.
> >
> > On 5 Aug 2017 at 23:16, "Rodrigo Baldasso" <rodr...@loophost.com.br> wrote:
> >
> > Hi Ivan,
> >
> > In fact I'm testing using local storage.. but on NFS I was getting similar results also.
> >
> > Thanks!
> >
> > - - - - - - - - - - - - - - - - - - -
> > Rodrigo Baldasso - LHOST
> > (51) 9 8419-9861
> > - - - - - - - - - - - - - - - - - - -
> >
> > On 05/08/2017 13:03:24, Ivan Kudryavtsev wrote:
> > Hi, Rodrigo. It looks strange. Check your NFS configuration and network for errors and loss. It should work great.
> > On 5 Aug 2017 at 22:22, "Rodrigo Baldasso" <rodr...@loophost.com.br> wrote:
> >
> > Hi everyone,
> >
> > I'm having trouble achieving a good I/O rate using Cloudstack qcow2 with any type of caching (or even with caching disabled).
> >
> > We have some RAID-5e SSD arrays which give us very good rates directly on the node/host, but in the guest the speed is terrible.
> >
> > Does anyone know a solution/workaround for this? I've never used qcow2 (only raw+LVM), so I don't know what to do to solve this.
> >
> > Thanks!
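For anyone who wants to see what Eric's points 1 and 3 translate to underneath, the Cloudstack provisioning types map (as far as I understand the KVM plugin; worth verifying against your version) to qcow2 preallocation modes roughly as follows. The paths and the 2T size here are just placeholders:

    # 'thin'   -> no preallocation: the file grows on demand, and every
    #             extension costs the extra metadata writes Eric describes
    qemu-img create -f qcow2 -o preallocation=off      /mnt/primary/vol-thin.qcow2   2T

    # 'sparse' -> qcow2 metadata is laid out up front; the file shows its
    #             full virtual size but only occupies a few blocks on disk
    qemu-img create -f qcow2 -o preallocation=metadata /mnt/primary/vol-sparse.qcow2 2T

    # 'fat'    -> data is fully allocated and zeroed, which is why attaching
    #             a large 'fat' volume takes so long
    qemu-img create -f qcow2 -o preallocation=full     /mnt/primary/vol-fat.qcow2    2T

Comparing du -h against ls -lh on the resulting files shows the difference between allocated and apparent size that Eric mentions for 'sparse'.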
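Point 2, the cache mode, ends up as the 'cache' attribute on the disk driver in the guest's libvirt domain XML, which you can check with virsh dumpxml. A minimal sketch of what to look for (the source file and target device are made up):

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/mnt/primary/vol-sparse.qcow2'/>
      <target dev='vdb' bus='virtio'/>
    </disk>

cache='none' (the default Eric mentions) writes to the backing file with O_DIRECT, while cache='writeback' lets the host page cache absorb and merge the small writes, which is where both the speedup and the power-loss risk come from.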
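Point 4 is just a mkfs decision for the primary storage mount. A minimal example, assuming /dev/sdb1 is the SSD array and /mnt/primary is the local datastore path (both placeholders):

    mkfs.xfs /dev/sdb1
    mount -o noatime /dev/sdb1 /mnt/primary
    xfs_info /mnt/primary    # sanity-check block and extent sizing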
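The "manage it outside Cloudstack" option at the end boils down to carving out LVs and exporting them with targetcli. Eric doesn't name the specific LIO tunables he has in mind for write-heavy loads, so this only sketches the basic export; the volume group, LV name, and IQNs below are all invented for the example:

    # thick LV on the SSD volume group
    lvcreate -L 200G -n db_vol vg_ssd

    # export it over iSCSI with LIO
    targetcli /backstores/block create name=db_vol dev=/dev/vg_ssd/db_vol
    targetcli /iscsi create iqn.2017-08.br.com.loophost:db-vol
    targetcli /iscsi/iqn.2017-08.br.com.loophost:db-vol/tpg1/luns create /backstores/block/db_vol
    targetcli /iscsi/iqn.2017-08.br.com.loophost:db-vol/tpg1/acls create iqn.2017-08.br.com.loophost:guest01
    targetcli saveconfig

The guest (or the host, if you prefer to attach the LUN there and pass it through) then connects with the usual iscsiadm discovery and login.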
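And since Ivan asked about fio: running the same job once against the host datastore and once inside the guest is the quickest way to quantify the gap Rodrigo is describing. The filename is a placeholder; point it at whichever filesystem you want to measure:

    fio --name=randwrite --filename=/mnt/primary/fio.test --size=4G \
        --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio \
        --direct=1 --runtime=60 --time_based --group_reporting

Keeping the job identical on both sides isolates how much the qcow2 format and cache settings are costing, as opposed to the underlying array.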