No, I still haven't heard anything from the community. For now I've just 
removed the ssh keys for the broken systems so they don't try to start up 
the "bad" replication configs, which is an incredibly ugly workaround. 
Someday soon I'm planning to build a test cluster to experiment on, though, 
and I'll follow up if I figure out a solution.
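
(Concretely, "removing the ssh keys" meant deleting the geo-rep public key 
from root's authorized_keys on each slave. I'm assuming secret.pem.pub is the 
.pub companion to the secret.pem path shown in the config output below and 
that it was installed verbatim on the slave; the /tmp staging path and 
hostnames are illustrative:)

master# scp /var/lib/glusterd/geo-replication/secret.pem.pub slave_73:/tmp/
slave# grep -vF "$(cat /tmp/secret.pem.pub)" /root/.ssh/authorized_keys \
    > /root/.ssh/authorized_keys.new
slave# mv /root/.ssh/authorized_keys.new /root/.ssh/authorized_keys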

--Danny

Steve Dainard <sdain...@miovision.com> wrote:

>Hi Danny,
>
>Did you get anywhere with this geo-rep issue? I have a similar problem on 
>CentOS 6.5: anything other than 'start' fails with geo-rep.
>
>Thanks,
>
>Steve 
>On Tue, Feb 25, 2014 at 9:45 AM, Danny Sauer <da...@dannysauer.com> wrote:
>
>I have the current gluster 3.4 running on some RHEL6 systems.  For some 
>reason, every geo-replication command that changes a config file (start, 
>stop, config) returns failure.  Despite this, "start" actually starts 
>replication.  I'd mostly be OK with that if "stop" also actually stopped it, 
>but it does not.  The "command failed" behavior is consistent across all 
>nodes.  The binaries came from downloading the source RPM and rebuilding it 
>with "rpmbuild --rebuild", since the packages on the download server still 
>don't install on anything but the latest RHEL6 (that ssl library dependency 
>thing); I didn't change anything, just rebuilt directly from the source 
>package.  I have working ssh between the systems, and files do propagate 
>over; I can see in the logs that ssh connects and starts up gsyncd.  I just 
>have several test configs that I'd like to not have running now, but they 
>won't stay dead. :)
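>
>(For reference, the rebuild was nothing more exotic than this; the exact 
>source RPM filename is from memory, so substitute whatever 3.4.x package you 
>actually fetched:)
>
>$ rpmbuild --rebuild glusterfs-3.4.2-1.el6.src.rpm
>$ sudo yum localinstall ~/rpmbuild/RPMS/x86_64/glusterfs*.rpm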
>
>Is there a way to forcibly remove several geo-replication configs outside of 
>the shell tool?  I tried editing the config file to change the ssh command 
>path for one of them, and my changes kept getting overwritten by metadata from 
>the other nodes (yes, time is in sync on all nodes using ntp against the same 
>server), so I'm assuming that deleting the relevant block from the config file 
>won't do it?
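>
>(For context, here is roughly what that editing attempt looked like.  The 
>gsyncd.conf location is my reading of the layout under 
>/var/lib/glusterd/geo-replication, the [peers ...] section name is a guess, 
>and presumably the same edit would have to land on every node at once:)
>
>$ sudo service glusterd stop
>$ sudo kill "$(cat /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.pid)"
>$ sudo vi /var/lib/glusterd/geo-replication/gsyncd.conf  # remove the session's [peers ...] block
>$ sudo service glusterd start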
>
>The really weird thing is that other volume management tasks work fine; I can 
>add/remove bricks from volumes, create, start and stop regular volumes, etc.  
>It's just the geo-replication management part that fails.
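>
>(For example, all of this succeeds on the very same node; the volume name 
>and brick path are illustrative:)
>
>$ sudo gluster volume create test_vol gluster1:/bricks/test_vol
>$ sudo gluster volume start test_vol
>$ sudo gluster volume stop test_vol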
>
>Thanks for any input you can provide. :)  Some example output (with username, 
>IP, and hostnames changed to protect the innocent) is below.
>
>--Danny
>
>
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 stop
> 
>geo-replication command failed
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 config
>gluster_log_file: /var/log/glusterfs/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.gluster.log
>ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
>session_owner: ace6b109-ba88-4c2e-9381-f2fc31aa36b5
>remote_gsyncd: /usr/libexec/glusterfs/gsyncd
>socketdir: /var/run
>state_file: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.status
>state_socket_unencoded: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.socket
>gluster_command_dir: /usr/sbin/
>pid_file: /var/lib/glusterd/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.pid
>log_file: /var/log/glusterfs/geo-replication/sec/ssh%3A%2F%2Froot%401.2.3.4%3Agluster%3A%2F%2F127.0.0.1%3Ageo_sec_73.log
>gluster_params: xlator-option=*-dht.assert-no-child-down=true
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 status
>NODE                 MASTER               SLAVE                                STATUS
>--------------------------------------------------------------------------------------
>gluster1             sec                  ssh://slave_73::geo_sec_73           faulty
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 stop
> 
>geo-replication command failed
>user@gluster1 [/home/user]
>$ sudo gluster v geo sec ssh://slave_73::geo_sec_73 status
>NODE                 MASTER               SLAVE                                STATUS
>--------------------------------------------------------------------------------------
>gluster1             sec                  ssh://slave_73::geo_sec_73           faulty
> 
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
