> On Aug 5, 2017, at 21:03, Ivan Kudryavtsev <kudryavtsev...@bw-sw.com> wrote:
>
> Hi, I think Eric's comments are too tough. E.g. I have 11xSSD 1TB with
> linux soft raid 5 and Ext4 and it works like a charm without special
> tunning.
>
> Qcow2 also not so bad. LVM2 does it better of course (if not being
> snapshotted). Our users have different workloads and nobody claims disk
> performance is a problem. Read/write 100 MB/sec over 10G connection is not
> a problem at all for the setup specified above.

100 MB/sec is the speed of a single vintage 2010 5200 RPM SATA-2 drive. For many people, that is not a problem. For some, it is. For example, I have a 12x-SSD RAID10 for a database. This RAID10 is on a SAS2 bus with 4 channels, and is thus capable of 2.4 gigaBYTES per second raw throughput. Yes, I have validated that the SAS2 bus is the limit on throughput for my SSD array. If I provided a qcow2 volume to the database instance that only managed 100 MB/sec, my database people would howl.

I have many virtual machines that run quite happily with thin qcow2 volumes on 12-disk RAID6 XFS datastores (spinning storage) with no problem, because they don't care about disk throughput. They are there to process data, or provide services like DNS or a Wiki knowledge base, or otherwise do things that aren't particularly time-critical in our environment. So it's all about your customer and his needs. For maximum throughput, qcow2 on an ext4 soft RAID capable of doing 100 MB/sec is very... 2010 spinning storage... and people who need more than that, like database people, will be extremely dissatisfied.

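For anyone who wants to check the 2.4 gigabytes per second figure, here is a rough back-of-the-envelope sketch. It assumes the usual SAS2 numbers, which are not stated above: a 6 Gbit/sec line rate per channel and 8b/10b encoding (8 payload bits for every 10 bits on the wire). The function name is made up for illustration.

```python
# Assumed SAS2 figures (not from the message above): 6 Gbit/sec line rate
# per channel, with 8b/10b encoding on the wire.
SAS2_LINE_RATE_MBIT = 6000   # megabits per second, per channel
PAYLOAD_BITS = 8             # 8b/10b: 8 payload bits...
WIRE_BITS = 10               # ...carried in every 10 bits transmitted
CHANNELS = 4


def sas_throughput_mb_per_sec(channels=CHANNELS, line_rate_mbit=SAS2_LINE_RATE_MBIT):
    """Return the raw payload throughput of a multi-channel SAS link in MB/sec."""
    payload_mbit = line_rate_mbit * PAYLOAD_BITS // WIRE_BITS  # per channel
    per_channel_mb = payload_mbit // 8                         # bits -> bytes
    return per_channel_mb * channels


if __name__ == "__main__":
    total = sas_throughput_mb_per_sec()
    print(f"4-channel SAS2 link: {total} MB/sec")
    print(f"That is {total // 100}x a 100 MB/sec volume")
```

Under those assumptions each channel carries 600 MB/sec, so four channels give 2400 MB/sec, or 24 times what a 100 MB/sec volume delivers.
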
Thus my suggestions for improving performance by providing a custom disk offering for those cases where disk performance, and specifically write performance, is a problem.

First, switch to 'sparse' rather than 'thin' as the provisioning mechanism. This greatly speeds writes, since now only the filesystem's block allocation mechanisms get invoked rather than qcow2's, and qcow2 now has only a single allocation zone, which greatly speeds its own lookups.

Second, use a different underlying filesystem that has proven to have more consistent performance. xfs isn't much faster than ext4 under most scenarios, but it doesn't have the lengthy dropouts in performance that come with lots of writes on ext4.

Third, possibly flip on async caching in the disk offering if data integrity isn't a problem. For an Elasticsearch instance, for example, the data is all replicated across multiple nodes on multiple datastores anyhow, so if I lose an Elasticsearch node's data, so what? I just destroy that instance and create a new one to join the cluster!

And of course there's always the option of simply avoiding qcow2 altogether and providing the data via iSCSI or NFS directly to the instance, which may be what you need to do for something like a database that has some very specific performance and throughput requirements.
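To make the 'thin' versus 'sparse' distinction a bit more concrete: at the qemu-img level, a qcow2 image can be created with or without its metadata preallocated. The sketch below only builds the command line; it does not run anything. It assumes that 'thin' corresponds to preallocation=off and 'sparse' to preallocation=metadata, which you should verify against your CloudStack version. The helper name and the example path are hypothetical.

```python
# Assumption: the 'thin' and 'sparse' provisioning types correspond to
# qemu-img's preallocation=off and preallocation=metadata for qcow2 images.
# Verify this against your CloudStack version before relying on it.
PREALLOCATION_BY_PROVISIONING = {
    "thin": "off",
    "sparse": "metadata",
}


def qemu_img_create_argv(path, size, provisioning="sparse"):
    """Build (but do not run) a qemu-img command that creates a qcow2 image."""
    prealloc = PREALLOCATION_BY_PROVISIONING[provisioning]
    return [
        "qemu-img", "create",
        "-f", "qcow2",
        "-o", f"preallocation={prealloc}",
        path, size,
    ]


if __name__ == "__main__":
    # Hypothetical path and size, for illustration only.
    print(" ".join(qemu_img_create_argv("/tmp/example.qcow2", "10G")))
```

On a host with qemu-img installed, the returned list can be handed to subprocess.run(argv, check=True) to actually create the image.
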