[Gluster-users] Restore a node in a replicating Gluster setup after data loss

2017-06-01 Thread Niklaus Hofer

Hi

We have a Replica 2 + Arbiter Gluster setup with three nodes, Server1, 
Server2 and Server3, where Server3 is the arbiter node. There are several 
Gluster volumes on top of that setup. They all look a bit like this:


gluster volume info gv-tier1-vm-01

[...]
Number of Bricks: 1 x (2 + 1) = 3
[...]
Bricks:
Brick1: Server1:/var/data/lv-vm-01
Brick2: Server2:/var/data/lv-vm-01
Brick3: Server3:/var/data/lv-vm-01/brick (arbiter)
[...]
cluster.data-self-heal-algorithm: full
[...]

We took down Server2 because we needed to do maintenance on that 
server's storage. During the maintenance work, we ended up having to 
completely rebuild the storage on Server2. This means that 
"/var/data/lv-vm-01" on Server2 is now empty. However, all the Gluster 
metadata in "/var/lib/glusterd/" is still intact. Gluster has not been 
started on Server2.
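
For reference, this is roughly how that state shows up on Server2 
(commands for illustration only; getfattr comes from the attr package, 
and trusted.glusterfs.volume-id is the xattr Gluster normally keeps on a 
brick root, as far as we know):

# Brick directory exists but is empty and carries no Gluster xattrs any more
ls -A /var/data/lv-vm-01
getfattr -d -m . -e hex /var/data/lv-vm-01

# The volume configuration is still there and glusterd is stopped
ls /var/lib/glusterd/vols/gv-tier1-vm-01/
systemctl status glusterd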


Here is what our sample Gluster volume currently looks like on the 
still-active nodes:


gluster volume status gv-tier1-vm-01

Status of volume: gv-tier1-vm-01
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick Server1:/var/data/lv-vm-01            49204     0          Y       22775
Brick Server3:/var/data/lv-vm-01/brick      49161     0          Y       15334
Self-heal Daemon on localhost               N/A       N/A        Y       19233
Self-heal Daemon on Server3                 N/A       N/A        Y       20839



Now we would like to rebuild the data on Server2 from the still-intact 
data on Server1. That is to say, we hope to start Gluster on Server2 
in such a way that it syncs the data back from Server1. If at all 
possible, the Gluster cluster should stay up during this process and 
access to the Gluster volumes should not be interrupted.
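
For reference, here is roughly what we have in mind, assuming the usual 
"recreate a lost brick and let self-heal repopulate it" procedure applies 
(we have not run any of this yet; the volume-id placeholder would come 
from /var/lib/glusterd/vols/gv-tier1-vm-01/info):

# On Server2: make sure the brick path exists, then start glusterd so the
# node rejoins the trusted pool (its config under /var/lib/glusterd is intact)
mkdir -p /var/data/lv-vm-01
systemctl start glusterd

# The brick process refuses to start while trusted.glusterfs.volume-id is
# missing on the brick root, so restore it by hand and force-start the volume
setfattr -n trusted.glusterfs.volume-id -v 0x<volume-id-without-dashes> /var/data/lv-vm-01
gluster volume start gv-tier1-vm-01 force

# Alternative on GlusterFS >= 3.9: re-register the same brick path instead
# of touching xattrs by hand
#   gluster volume reset-brick gv-tier1-vm-01 Server2:/var/data/lv-vm-01 \
#       Server2:/var/data/lv-vm-01 commit force

# Finally, trigger a full heal from Server1 and watch its progress
gluster volume heal gv-tier1-vm-01 full
gluster volume heal gv-tier1-vm-01 info

If there is a less hands-on way (for example simply starting glusterd and 
letting self-heal take care of everything), we would of course prefer that.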


What is the correct / recommended way of doing this?

Greetings
Niklaus Hofer
--
stepping stone GmbH
Neufeldstrasse 9
CH-3012 Bern

Telefon: +41 31 332 53 63
www.stepping-stone.ch
niklaus.ho...@stepping-stone.ch
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Expected behaviour of hypervisor on Gluster node loss

2017-01-30 Thread Niklaus Hofer

Hi

I have a question concerning the 'correct' behaviour of GlusterFS:

We have a nice Gluster setup up and running. Most things are working nicely. 
Our setup is as follows:
 - Storage is a 2+1 Gluster setup (2 replicating hosts + 1 arbiter) 
with a volume for virtual machines (volume tuning sketched after this list).

 - Two virtualisation hosts running libvirt / qemu / kvm.
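
For context, this is the kind of tuning we assume matters for a VM volume 
(the 'virt' option group ships with GlusterFS; commands are illustrative, 
and gv-tier1-vm-01 stands in for our VM volume):

# Apply the option group GlusterFS ships for VM image workloads
gluster volume set gv-tier1-vm-01 group virt

# Show what is currently set on the volume (on releases that support it)
gluster volume get gv-tier1-vm-01 all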

Now the question is: what is supposed to happen when we unplug one of 
the storage nodes (e.g. a power outage in one of our data centers)?
Initially we were hoping that the virtualisation hosts would 
automatically switch over to the second storage node and keep all VMs 
running.


However, during our tests, we have found that this is not the case. 
Instead, when we unplug one of the storage nodes, the virtual machines 
run into all sorts of problems: they become unable to read or write, 
applications crash, and in some cases the filesystem gets corrupted. 
That is of course not acceptable.
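
For what it's worth, these are the settings we assume govern that 
behaviour (read-only queries; 42 seconds is the GlusterFS default ping 
timeout as far as we know, and gv-tier1-vm-01 again stands in for the 
VM volume):

# How long a client keeps waiting for an unreachable brick before giving up
# on it (default 42 seconds); all I/O from that client hangs for this window
gluster volume get gv-tier1-vm-01 network.ping-timeout

# Client- and server-side quorum; with replica 2 + arbiter, writes should be
# allowed to continue as long as two of the three bricks stay reachable
gluster volume get gv-tier1-vm-01 cluster.quorum-type
gluster volume get gv-tier1-vm-01 cluster.server-quorum-type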


Reading the documentation again, we now think that we have misunderstood 
what we're supposed to be doing. To our understanding, what should 
happen is this:
 - If the virtualisation host is connected to the storage node which is 
still running:

   - everything is fine and the VM keeps running
 - If the virtualisation host was connected to the storage node which 
is now absent:

   - qemu is supposed to 'pause' / 'freeze' the VM (see the libvirt snippet after this list)
   - Virtualisation host waits for ping timeout
   - Virtualisation host switches over to the other storage node
   - qemu 'unpauses' the VMs
   - The VM is fully operational again
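
Regarding the 'pause' / 'unpause' steps, this is how we understand the 
hypervisor side is meant to be set up ('vm-01' is a placeholder domain 
name; error_policy='stop' is the libvirt knob that makes qemu pause a 
guest on I/O errors instead of passing them through):

# On the virtualisation host: the disk <driver> element of the domain XML
# should carry error_policy='stop' so the guest is paused on I/O errors
virsh dumpxml vm-01 | grep error_policy
#   <driver name='qemu' type='qcow2' cache='none' error_policy='stop'/>

# Once the storage is reachable again, a paused guest can be resumed with
virsh resume vm-01

Whether that unpause is supposed to happen automatically or needs a 
management layer to call resume is exactly the kind of thing we are 
unsure about.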

Does my description match the 'optimal' GlusterFS behaviour?


Greets
Niklaus Hofer
--
stepping stone GmbH
Neufeldstrasse 9
CH-3012 Bern

Telefon: +41 31 332 53 63
www.stepping-stone.ch
niklaus.ho...@stepping-stone.ch
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users