Re: [ovirt-users] Multi-node cluster with local storage

Sahina Bose Fri, 04 Mar 2016 05:40:25 -0800


On 03/04/2016 05:30 PM, Pavel Gashev wrote:

On 04/03/16 13:50, "Sahina Bose" <sab...@redhat.com> wrote:

On 03/04/2016 04:13 PM, Pavel Gashev wrote:

On 04/03/16 12:22, "Sahina Bose" <sab...@redhat.com> wrote:

On 03/04/2016 02:14 AM, Pavel Gashev wrote:

Unfortunately, oVirt doesn't support multi-node local storage clusters.
And Gluster/Ceph don't work well over a 1Gb network. It looks like
the only way to use oVirt in a three-node cluster is to share local
storage over NFS. At least that makes it possible to migrate VMs and move
disks among hardware nodes.

Do you know of reported problems with Gluster over a 1Gb network? I think
10Gb is recommended, but 1Gb can also be used for gluster.
(We use it in our lab setup, and haven't encountered any issues so far,
but of course the workload may be different - hence the question.)

Let's calculate. If I have a three-node replicated gluster volume, each block
written on a node is copied to the other two nodes. Thus, maximum write
performance can't be above 50MB/s. Even if that's acceptable for my workload,
things get worse in a failure recovery scenario. Gluster works with files. When
a node fails and then recovers (even if it's just a plain reboot), gluster
copies the whole file over the network if the file was changed during the node
outage. So if I have a 100GB VM disk, and the guest system has written a
512-byte block to the disk, the whole 100GB will be copied during recovery. It
might take 20 minutes for 100GB, and 3 hours for 1TB. And the network will be
100% busy during recovery, so VMs on other nodes will wait for I/O most of the
time. In other words, a plain reboot of a node would put the datacenter out of
service for several hours.
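
The estimate above can be reproduced with a short back-of-envelope sketch. It is only an illustration under stated assumptions: the 100 MB/s usable link rate is an assumed figure for a 1Gb network, the function names are made up for this note, and the model assumes the writing node sends each block to both other replicas over a single link.

```python
# Back-of-envelope model of the estimate above. All constants are assumptions,
# not measurements, and the function names are hypothetical.

LINK_MB_PER_S = 100.0  # assumed usable throughput of a 1 Gb/s link (~125 MB/s raw)


def max_write_mb_per_s(replicas: int, link_mb_per_s: float = LINK_MB_PER_S) -> float:
    """Write ceiling when the writer sends each block to every other replica."""
    remote_copies = replicas - 1  # one copy stays on the local node
    return link_mb_per_s / remote_copies


def full_copy_minutes(file_gb: float, link_mb_per_s: float = LINK_MB_PER_S) -> float:
    """Minutes needed to copy a whole file over the link (1 GB = 1000 MB)."""
    return file_gb * 1000 / link_mb_per_s / 60


if __name__ == "__main__":
    print(f"replica 3 write ceiling: {max_write_mb_per_s(3):.0f} MB/s")
    print(f"100 GB full copy: {full_copy_minutes(100):.0f} min")
    print(f"1 TB full copy: {full_copy_minutes(1000) / 60:.1f} h")
```

With these assumed numbers the model gives a 50 MB/s write ceiling, about 17 minutes for 100GB and just under 3 hours for 1TB, which is in line with the rounded figures quoted in the message.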

Things might be better if you have a distributed+replicated gluster volume. It
requires at least six nodes. But things are still bad when you try to rebalance
the volume after adding new bricks, or when a node has really failed and been
replaced.

Thus, a 1Gb network is OK for a lab, but it's not OK for production. IMHO.

Most of the problems that you outline here, related to healing and
replacing, are addressed with the sharding translator. Sharding breaks
the large image file into smaller files, so that the entire file does
not have to be copied. More details here -
http://blog.gluster.org/2015/12/introducing-shard-translator/

Sure, I meant the same by mentioning distributed+replicated volumes. Actually, 
distributed+striped+replicated - 
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/sect-User_Guide-Setting_Volumes-Distributed_Striped_Replicated.html

Ok. Sharding is not the same as striped volumes in gluster. With
striping, like you mentioned, you would require more nodes to form
the striped set in addition to the replica set (so 6 nodes, since
you need replica 3). Sharding can however work with 3 nodes - so on
the replica 3 gluster volume that you create, you can set the volume
option "features.shard on" to turn on this feature.
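
To illustrate why sharding helps the recovery case discussed earlier in the thread, here is a rough sketch of how much data would need to be re-copied after an outage. It is a simplified model, not Gluster's actual heal logic: it assumes every shard touched during the outage is copied in full, and the 64 MiB shard size and the function names are assumptions chosen for the example.

```python
# Simplified model: data to re-copy after an outage, with and without sharding.
# Assumes every shard touched during the outage is copied in full.
# SHARD_BYTES and the function names are hypothetical, for illustration only.

SHARD_BYTES = 64 * 1024 * 1024  # assumed shard size; the real value is a volume setting


def shards_touched(writes, shard_bytes=SHARD_BYTES):
    """Return the set of shard indexes hit by (offset, length) writes, length > 0."""
    touched = set()
    for offset, length in writes:
        first = offset // shard_bytes
        last = (offset + length - 1) // shard_bytes
        touched.update(range(first, last + 1))
    return touched


def heal_bytes_sharded(writes, shard_bytes=SHARD_BYTES):
    """Bytes to copy if only the touched shards are healed."""
    return len(shards_touched(writes, shard_bytes)) * shard_bytes


def heal_bytes_whole_file(file_bytes):
    """Bytes to copy if any change forces a copy of the whole file."""
    return file_bytes


if __name__ == "__main__":
    disk = 100 * 1000**3          # the 100GB VM disk from the example
    writes = [(4096, 512)]        # a single 512-byte write during the outage
    print("whole file:", heal_bytes_whole_file(disk), "bytes")
    print("sharded:   ", heal_bytes_sharded(writes), "bytes")
```

Under these assumptions, the single 512-byte write from the earlier example touches one shard, so recovery copies one shard rather than the full 100GB image.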


_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
