Thank you for responding, Heiko. I am in the process of going through the differences between our two scripts. The first thing I noticed is that your notes state the nodes "need to be defined in the /etc/hosts". Would using the IP addresses directly be a problem?
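Just so I am comparing the right things: I assume the /etc/hosts entries your howto expects look something like this (the IPs and hostnames below are made-up placeholders):

    10.0.0.11   gluster-node-1
    10.0.0.12   gluster-node-2
    10.0.0.13   gluster-node-3

Our script skips that step and uses the IP addresses directly, both when probing peers and in the brick definitions, roughly like:

    gluster peer probe 10.0.0.12
    gluster volume create <volname> replica 3 transport tcp \
        10.0.0.11:/data/brick 10.0.0.12:/data/brick 10.0.0.13:/data/brick force

As far as I know gluster accepts both forms, so I mainly want to rule out that the missing hostname mapping is related to our heal problem. I have also pasted a stripped-down sketch of our current command ordering below the quoted thread.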
On Tue, Jun 21, 2016 at 2:10 PM, Heiko L. <hei...@fh-lausitz.de> wrote:
> On Tue, 21.06.2016 at 19:22, Danny Lee wrote:
> > Hello,
> >
> > We are currently figuring out how to add GlusterFS to our system to make
> > our systems highly available using scripts. We are using Gluster 3.7.11.
> >
> > Problem:
> > Trying to migrate from a non-clustered system to a 3-node GlusterFS
> > replicated cluster using scripts. Tried various things to make this work,
> > but it sometimes leaves us in an undesirable state where, if you call
> > "gluster volume heal <volname> full", we get the error message, "Launching
> > heal operation to perform full self heal on volume <volname> has been
> > unsuccessful on bricks that are down. Please check if all brick processes
> > are running." All the brick processes are running according to the
> > command "gluster volume status volname".
> >
> > Things we have tried (in order of preference):
> > 1. Create volume with 3 filesystems with the same data
> > 2. Create volume with 2 empty filesystems and one with the data
> > 3. Create volume with only one filesystem with data and then use the
> >    "add-brick" command to add the other two empty filesystems
> > 4. Create volume with one empty filesystem, mount it, copy the data over
> >    to it, and then finally use the "add-brick" command to add the other
> >    two empty filesystems
> - should be working
> - read each file on /mnt/gvol, to trigger replication [2]
> > 5. Create volume with 3 empty filesystems, mount it, and then copy the
> >    data over
> - my favorite
> >
> > Other things to note:
> > A few minutes after the volume is created and started successfully, our
> > application server starts up against it, so reads and writes may happen
> > pretty quickly after the volume has started. But there is only about 50MB
> > of data.
> >
> > Steps to reproduce (all in a script):
> > # This is run by the primary node with the IP address <server-ip-1>, which has the data
> > systemctl restart glusterd
> > gluster peer probe <server-ip-2>
> > gluster peer probe <server-ip-3>
> > Wait for "gluster peer status" to report "Peer in Cluster" for all peers
> > gluster volume create <volname> replica 3 transport tcp ${BRICKS[0]} ${BRICKS[1]} ${BRICKS[2]} force
> > gluster volume set <volname> nfs.disable true
> > gluster volume start <volname>
> > mkdir -p $MOUNT_POINT
> > mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT
> > find $MOUNT_POINT | xargs stat
>
> I have written a script for 2 nodes. [1]
> but should be at least 3 nodes.
>
> I hope it helps you
> regards Heiko
>
> > Note that, when we added sleeps around the gluster commands, there was a
> > higher probability of success, but not 100%.
> >
> > # Once the volume is started, all the clients/servers mount the gluster
> > # filesystem by polling "mountpoint -q $MOUNT_POINT":
> > mkdir -p $MOUNT_POINT
> > mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT
> >
> > Logs:
> > *etc-glusterfs-glusterd.vol.log* in *server-ip-1*
> >
> > [2016-06-21 14:10:38.285234] I [MSGID: 106533]
> > [glusterd-volume-ops.c:857:__glusterd_handle_cli_heal_volume] 0-management:
> > Received heal vol req for volume volname
> > [2016-06-21 14:10:38.296801] E [MSGID: 106153]
> > [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Commit failed on
> > <server-ip-2>. Please check log file for details.
> >
> > *usr-local-volname-data-mirrored-data.log* in *server-ip-1*
> >
> > [2016-06-21 14:14:39.233366] E [MSGID: 114058]
> > [client-handshake.c:1524:client_query_portmap_cbk] 0-volname-client-0:
> > failed to get the port number for remote subvolume. Please run 'gluster
> > volume status' on server to see if brick process is running.
> > *I think this is caused by the self heal daemon*
> >
> > *cmd_history.log* in *server-ip-1*
> >
> > [2016-06-21 14:10:38.298800] : volume heal volname full : FAILED : Commit
> > failed on <server-ip-2>. Please check log file for details.
>
> [1] http://www2.fh-lausitz.de/launic/comp/net/glusterfs/130620.glusterfs.create_brick_vol.howto.txt
> - old, limit 2 nodes
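To make the comparison with your script easier, this is roughly the ordering ours uses today, stripped down to the gluster-related parts (same placeholders as above; the wait loop shown here is a simplified stand-in for what we actually do):

    # run on the primary node, <server-ip-1>, which already has the data
    systemctl restart glusterd

    gluster peer probe <server-ip-2>
    gluster peer probe <server-ip-3>

    # wait until both peers report "Peer in Cluster"
    until [ "$(gluster peer status | grep -c 'Peer in Cluster')" -eq 2 ]; do
        sleep 2
    done

    gluster volume create <volname> replica 3 transport tcp \
        ${BRICKS[0]} ${BRICKS[1]} ${BRICKS[2]} force
    gluster volume set <volname> nfs.disable true
    gluster volume start <volname>

    mkdir -p $MOUNT_POINT
    mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT

    # read every file once so replication to the empty bricks gets triggered
    find $MOUNT_POINT | xargs stat > /dev/null

The other nodes and clients then mount the volume themselves, along the lines of:

    mkdir -p $MOUNT_POINT
    until mountpoint -q $MOUNT_POINT; do
        mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT || true
        sleep 1
    done

If there is something else we should be waiting on before "gluster volume create" or before calling the heal, that is probably where our scripts differ.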
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users