Hello,

in our computing centre we run an infrastructure built around a GlusterFS 
volume made of two bricks in replicated mode:


Volume Name: VmDir
Type: Replicate
Volume ID: 9aab85df-505c-460a-9e5b-381b1bf3c030
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: one-san-01:/bricks/VmDir01
Brick2: one-san-02:/bricks/VmDir02
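
For the record, such a two-brick replica would have been created with something 
along these lines (just a sketch of the commands, not necessarily the exact 
ones we typed):

  gluster volume create VmDir replica 2 transport tcp \
    one-san-01:/bricks/VmDir01 one-san-02:/bricks/VmDir02
  gluster volume start VmDir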


We are using this volume to store the disk images of some running KVM virtual 
machines, hoping to benefit from the replicated storage both for robustness and 
for the ability to live-migrate VMs.

Our GlusterFS volume VmDir is mounted on several (three at the moment) 
hypervisors.
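
For completeness, a GlusterFS mount on a hypervisor looks roughly like this 
(the mount point below is just an example, not our real path):

  mount -t glusterfs one-san-01:/VmDir /mnt/VmDir

  # or, equivalently, as an /etc/fstab entry:
  one-san-01:/VmDir  /mnt/VmDir  glusterfs  defaults,_netdev  0 0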

However, in many cases (hard to reproduce; the best way is to stress VM I/O), 
either when one brick becomes unavailable for some reason or when we perform 
live migrations, the virtual machines remount the filesystems on their virtual 
disks read-only. At the same time, on the hypervisors mounting the GlusterFS 
volume, we spot kernel messages like:


  INFO: task kvm:13560 blocked for more than 120 seconds.
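
In case it is useful, the full stack trace of the blocked task can be pulled 
from the kernel log on the hypervisor, e.g.:

  # show the hung-task warning together with its stack trace
  dmesg | grep -B 2 -A 20 'blocked for more than 120 seconds'

  # 120 s is only the warning threshold of the hung-task detector,
  # not a timeout that aborts the I/O
  cat /proc/sys/kernel/hung_task_timeout_secs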


By googling around I have found some "workarounds" to mitigate this problem, 
such as mounting the filesystems inside the virtual machines with barrier=0:

  http://invalidlogic.com/2012/04/28/ubuntu-precise-on-xenserver-disk-errors/

but I am afraid of damaging my virtual machine disks by doing such a thing!
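
For the record, that workaround boils down to something like this in the 
guest's /etc/fstab (device and mount point are just an example):

  # disable write barriers on an ext4 filesystem inside the guest
  /dev/vda1  /  ext4  defaults,barrier=0  1 1

As far as I understand, disabling barriers is only safe if the underlying 
storage never loses cached writes on a power failure, which is precisely what I 
cannot guarantee here.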

AFAIK, since GlusterFS v3.3 self-healing should be performed server-side (no 
self-healing is done by the clients anymore, and big files are healed with 
granular locking). When I connect to my GlusterFS pool and monitor the 
self-healing status continuously:

watch -n1 'gluster volume heal VmDir info'

I obtain an output like:


Heal operation on volume VmDir has been successful

Brick one-san-01:/bricks/VmDir01
Number of entries: 2
/1814/images/disk.0
/1816/images/disk.0

Brick one-san-02:/bricks/VmDir02
Number of entries: 2
/1816/images/disk.0
/1814/images/disk.0


i.e. a list of virtual machine disks being healed by GlusterFS. These and other 
files continuously appear and disappear from the list.
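
For what it is worth, I understand the other heal-info views should help tell 
transient entries apart from real trouble:

  gluster volume heal VmDir info healed       # entries healed recently
  gluster volume heal VmDir info heal-failed  # entries that could not be healed
  gluster volume heal VmDir info split-brain  # entries in split-brain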

This is a behaviour I don't understand at all: does it mean that those files 
continuously get marked and healed simply as a natural part of the replication 
process? Or is some kind of corruption actually happening on our virtual disks 
for some reason? Is this related to the "remount read-only" problem?
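
In case it helps with the diagnosis, my understanding is that the pending-heal 
state of a file can be checked directly on the bricks via its AFR changelog 
extended attributes (path taken from the listing above):

  # run as root on one of the storage servers, directly on the brick;
  # non-zero trusted.afr.* values should mean changes still pending
  # towards the other replica
  getfattr -d -m . -e hex /bricks/VmDir01/1814/images/disk.0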

A more general question would perhaps be: is GlusterFS v3.3 ready for storing 
running virtual machines, and are there any special configuration options 
needed on the volumes and clients for that?
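
The only volume tuning I have seen suggested so far for VM image storage is 
along these lines (option names as of 3.3; I have not applied any of this yet, 
so please correct me if it is wrong or useless):

  gluster volume set VmDir cluster.eager-lock enable
  gluster volume set VmDir cluster.quorum-type auto
  gluster volume set VmDir performance.quick-read off
  gluster volume set VmDir performance.io-cache off
  gluster volume set VmDir performance.read-ahead off
  gluster volume set VmDir performance.stat-prefetch off

  # possibly a shorter ping timeout, so that a dead brick is given up
  # before the guests' own disk timeouts expire
  gluster volume set VmDir network.ping-timeout 10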

Thank you in advance for shedding some light...

Regards,
--
: Dario Berzano
: CERN PH-SFT & Università di Torino (Italy)
: Wiki: http://newton.ph.unito.it/~berzano
: GPG: http://newton.ph.unito.it/~berzano/gpg
