Re: [Gluster-users] Failed file system

2016-08-03 Thread Andres E. Moya
Does anyone else have input? 

We are currently running off only one node; the other node in the replicated 
brick is offline. 

We are not experiencing any downtime because that one node is up. 

I do not understand which is the best way to bring up a second node. 

Do we just re-create a file system on the node that is down, along with the 
mount points, and allow gluster to heal? (My concern with this is whether the 
node that is down will somehow take precedence and wipe out the data on the 
healthy node instead of vice versa.) 

Or do we fully wipe out the config on the node that is down, re-create the file 
system, re-add the node that is down into gluster using the add-brick command 
with replica 3, wait for it to heal, and then run the remove-brick command for 
the failed brick? 

Which would be the safest and easiest to accomplish? 

Thanks for any input 
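For reference, the add-brick route described above might look roughly like the 
following sketch. All names here are hypothetical (volume "gv0", healthy node 
"node1", failed node "node2", replacement "node3", brick path "/gfs/b1/gv0"); 
substitute your own.

```shell
# From a healthy node: probe the rebuilt peer and add its brick,
# raising the replica count from 2 to 3.
gluster peer probe node3
gluster volume add-brick gv0 replica 3 node3:/gfs/b1/gv0

# Trigger a full self-heal and watch progress before touching the failed brick.
gluster volume heal gv0 full
gluster volume heal gv0 info

# Only once heal info shows zero pending entries, drop the dead brick,
# returning the volume to replica 2.
gluster volume remove-brick gv0 replica 2 node2:/gfs/b1/gv0 force
```

Because the brick is only removed after the heal completes, the healthy node's 
data remains the source throughout.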




From: "Leno Vo"  
To: "Andres E. Moya"  
Cc: "gluster-users"  
Sent: Tuesday, August 2, 2016 6:45:27 PM 
Subject: Re: [Gluster-users] Failed file system 

If you don't want any downtime (in the case that your node 2 really is dead), 
you have to create a new gluster SAN (if you have the resources, of course; 
aim for 3 nodes this time) and then just migrate your VMs (or files). That way 
there is no downtime, but you have to cross your fingers that the one remaining 
node will not die too. Also, without sharding, VM migration (especially an RDP 
one) will mean slow access for users until it has migrated. 

You should start testing sharding; it's fast and cool... 
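Enabling sharding is a per-volume option (volume name "gv0" and block size are 
illustrative; note that only files created after enabling are sharded):

```shell
# Turn on the shard translator and pick a shard size for new files.
gluster volume set gv0 features.shard on
gluster volume set gv0 features.shard-block-size 64MB
```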




On Tuesday, August 2, 2016 2:51 PM, Andres E. Moya  
wrote: 


Couldn't we just add a new server by: 

gluster peer probe 
gluster volume add-brick replica 3 (will this command succeed with 1 current 
failed brick?) 

let it heal, then 

gluster volume remove-brick 

From: "Leno Vo"  
To: "Andres E. Moya" , "gluster-users" 
 
Sent: Tuesday, August 2, 2016 1:26:42 PM 
Subject: Re: [Gluster-users] Failed file system 

You need to have downtime to recreate the second node. Two nodes is actually 
not good for production, and you should have used RAID 1 or RAID 5 for your 
gluster storage. When you recreate the second node you might try keeping the 
VMs that need to be up running and shutting the rest down, but stop all backups 
and, if you have replication, stop it too. If you have a 1G NIC, 2 CPUs, and 
less than 8G RAM, then I suggest turning off all the VMs during recreation of 
the second node. Someone said that if you have sharding with 3.7.x, maybe some 
VIP VMs can stay up... 

If it is just a filesystem, then just turn off the backup service until you 
recreate the second node. Depending on your resources and how big your storage 
is, recreating it might take hours or even days... 

Here's my process for recreating the second or third node (copied and modified 
from the net): 

#make sure partition is already added 
This procedure is for replacing a failed server, IF your newly installed server 
has the same hostname as the failed one. 

(If your new server will have a different hostname, see this article instead.) 

For the purposes of this example, the server that crashed will be server3 and 
the other servers will be server1 and server2. 

On both server1 and server2, make sure hostname server3 resolves to the correct 
IP address of the new replacement server. 
#On either server1 or server2, do 
grep server3 /var/lib/glusterd/peers/* 

This will return a uuid followed by ":hostname1=server3" 

#On server3, make sure glusterd is stopped, then do 
echo UUID={uuid from previous step} > /var/lib/glusterd/glusterd.info 
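Put together, the identity-replacement step might look like this sketch (the 
UUID shown is the one from the testing example in this thread; substitute the 
value you found with grep on a healthy peer, and use `service glusterd stop` 
instead of systemctl on older distributions):

```shell
# glusterd must not be running while its identity file is rewritten.
systemctl stop glusterd

# Give the replacement server the failed server's peer UUID.
echo "UUID=4b9d153c-5958-4dbe-8f91-7b5002882aac" > /var/lib/glusterd/glusterd.info

systemctl start glusterd
```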

#actual testing below, 
[root@node1 ~]# cat /var/lib/glusterd/glusterd.info 
UUID=4b9d153c-5958-4dbe-8f91-7b5002882aac 
operating-version=30710 
#the second line is new. maybe not needed... 

On server3: 
make sure that all brick directories are created/mounted 
start glusterd 
peer probe one of the existing servers 

#restart glusterd, check that full peer list has been populated using 
gluster peer status 

(if peers are missing, probe them explicitly, then restart glusterd again) 
#check that full volume configuration has been populated using 
gluster volume info 

if volume configuration is missing, do 
#on the other node 
gluster volume sync "replace-node" all 

#on the node to be replaced 
setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/v1/info | cut -d= -f2 | sed 's/-//g') /gfs/b1/v1 
setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/v2/info | cut -d= -f2 | sed 's/-//g') /gfs/b2/v2 
setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/config/info | cut -d= -f2 | sed 's/-//g') /gfs/b1/config/c1 

mount -t glusterfs localhost:config /data/data1 
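One way to sanity-check the volume-id step above (paths match the setfattr 
commands; adjust for your bricks):

```shell
# The UUID recorded in glusterd's volume info...
grep volume-id /var/lib/glusterd/vols/v1/info

# ...should match the xattr now set on the brick root,
# shown as hex with the dashes removed.
getfattr -n trusted.glusterfs.volume-id -e hex /gfs/b1/v1
```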

#install ctdb if not yet installed and put it back online, use the step on 
creating the ctdb config but 
#use your common sense not to delete or modify the current one. 

gluster vol heal v1 full 
gluster vol heal v2 full 
gluster vol heal config full 


[Gluster-users] Failed file system

2016-08-02 Thread Andres E. Moya
Hi, we have a 2-node replica setup.
On one of the nodes, the file system that had the brick on it failed (not the OS).
Can we re-create a file system and mount the bricks on the same mount point?

What will happen: will the data from the other node sync over, or will the
failed node wipe out the data on the other node?

What would be the correct process?

Thanks in advance for any help
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users