[Gluster-users] Restore a node in a replicating Gluster setup after data loss
Hi

We have a Replica 2 + Arbiter Gluster setup with three nodes, Server1, Server2 and Server3, where Server3 is the arbiter node. There are several Gluster volumes on top of that setup. They all look a bit like this:

gluster volume info gv-tier1-vm-01
[...]
Number of Bricks: 1 x (2 + 1) = 3
[...]
Bricks:
Brick1: Server1:/var/data/lv-vm-01
Brick2: Server2:/var/data/lv-vm-01
Brick3: Server3:/var/data/lv-vm-01/brick (arbiter)
[...]
cluster.data-self-heal-algorithm: full
[...]

We took Server2 down because we needed to do maintenance on its storage. During the maintenance work we ended up having to completely rebuild the storage on Server2. This means that "/var/data/lv-vm-01" on Server2 is now empty. However, all the Gluster metadata in "/var/lib/glusterd/" is still intact. Gluster has not been started on Server2.

Here is what our sample Gluster volume currently looks like on the still-active nodes:

gluster volume status gv-tier1-vm-01
Status of volume: gv-tier1-vm-01
Gluster process                             TCP Port  RDMA Port  Online  Pid
----------------------------------------------------------------------------
Brick Server1:/var/data/lv-vm-01            49204     0          Y       22775
Brick Server3:/var/data/lv-vm-01/brick      49161     0          Y       15334
Self-heal Daemon on localhost               N/A       N/A        Y       19233
Self-heal Daemon on Server3                 N/A       N/A        Y       20839

We would now like to rebuild the data on Server2 from the still-intact data on Server1. That is to say, we hope to start Gluster on Server2 in such a way that it syncs the data back from Server1. If at all possible, the Gluster cluster should stay up during this process and access to the Gluster volumes should not be interrupted.

What is the correct / recommended way of doing this?

Greetings
Niklaus Hofer
--
stepping stone GmbH
Neufeldstrasse 9
CH-3012 Bern

Telefon: +41 31 332 53 63
www.stepping-stone.ch
niklaus.ho...@stepping-stone.ch
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
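For readers finding this thread: the scenario above (brick data lost, glusterd metadata intact, peer still in the trusted pool) is typically handled with Gluster's reset-brick command, which re-initialises a brick at the same path and triggers healing from the surviving replica. A minimal sketch, using the volume and paths from the mail above — the cluster stays online throughout, though verify the exact syntax against the docs for your Gluster version (reset-brick was added in 3.9):

```shell
# On Server2: recreate the (now empty) brick directory and start glusterd.
mkdir -p /var/data/lv-vm-01
systemctl start glusterd

# From any node: take the dead brick offline, then re-add the same path.
# "commit force" writes a fresh volume-id xattr and restarts the brick process.
gluster volume reset-brick gv-tier1-vm-01 Server2:/var/data/lv-vm-01 start
gluster volume reset-brick gv-tier1-vm-01 \
    Server2:/var/data/lv-vm-01 Server2:/var/data/lv-vm-01 commit force

# Trigger a full self-heal (the volume already has
# cluster.data-self-heal-algorithm: full) and watch progress.
gluster volume heal gv-tier1-vm-01 full
gluster volume heal gv-tier1-vm-01 info
```

Clients keep reading from Server1 while the self-heal daemon copies data back to Server2 in the background, so no downtime is expected; repeat the reset-brick step for each affected volume.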
[Gluster-users] Expected behaviour of hypervisor on Gluster node loss
Hi

I have a question concerning the 'correct' behaviour of GlusterFS.

We have a nice Gluster setup up and running, and most things are working nicely. Our setup is as follows:
 - Storage is a 2+1 Gluster setup (two replicating hosts + one arbiter) with a volume for virtual machines.
 - Two virtualisation hosts running libvirt / qemu / kvm.

Now the question is: what is supposed to happen when we unplug one of the storage nodes (i.e. a power outage in one of our data centers)? Initially we were hoping that the virtualisation hosts would automatically switch over to the second storage node and keep all VMs running. However, during our tests we found that this is not the case. Instead, when we unplug one of the storage nodes, the virtual machines run into all sorts of problems: they become unable to read/write, applications crash, and filesystems even get corrupted. That is of course not acceptable.

Reading the documentation again, we now think that we have misunderstood what we're supposed to be doing. To our understanding, what should happen is this:
 - If the virtualisation host is connected to the storage node which is still running:
   - everything is fine and the VM keeps running
 - If the virtualisation host was connected to the storage node which is now absent:
   - qemu is supposed to 'pause' / 'freeze' the VM
   - the virtualisation host waits for the ping timeout
   - the virtualisation host switches over to the other storage node
   - qemu 'unpauses' the VMs
   - the VM is fully operational again

Does my description match the 'optimal' GlusterFS behaviour?

Greets
Niklaus Hofer
--
stepping stone GmbH
Neufeldstrasse 9
CH-3012 Bern

Telefon: +41 31 332 53 63
www.stepping-stone.ch
niklaus.ho...@stepping-stone.ch
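For readers finding this thread: the ping timeout the mail refers to is Gluster's network.ping-timeout volume option, which defaults to 42 seconds — the client blocks all I/O on the volume for that long before declaring a brick dead, which is the freeze the VMs experience. A sketch of the commonly recommended tuning for VM-image volumes, assuming a volume name of gv-vm (placeholder; the mail does not name the volume) — the exact option set applied by the "virt" group varies by Gluster version and packaging:

```shell
# Apply the predefined option group for virtual-machine workloads
# (sets sharding, eager-lock, quorum and related options in one step).
gluster volume set gv-vm group virt

# Optionally shorten the client-side freeze on node loss; the default
# of 42 seconds is deliberately long to avoid spurious failovers.
gluster volume set gv-vm network.ping-timeout 10
```

With quorum-safe settings like these, the expected behaviour on losing one replica is indeed a pause of at most the ping timeout, after which I/O resumes against the surviving brick; the described corruption suggests the clients were writing without quorum or without these options applied.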