A much simpler answer is to assign a hostname to multiple IP addresses
(round-robin DNS). When gethostbyname() returns multiple entries, the
client will try them all until one succeeds.
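
To illustrate the try-each-address behaviour described above, here is a
minimal Python sketch. It is not GlusterFS code; it only shows the general
pattern a client can follow when one name resolves to several addresses.
The function names and the injectable `connect` parameter are invented for
this example, and port 24007 is simply the glusterd port seen later in this
thread.

```python
import socket


def resolve_all(hostname, port):
    """Return every distinct IP address the resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def connect_first_available(addresses, port, timeout=3.0,
                            connect=socket.create_connection):
    """Try each address in order and return (ip, connection) for the first
    one that accepts the connection. `connect` is injectable (an assumption
    made for this sketch) so the logic can be exercised without a network."""
    last_error = None
    for ip in addresses:
        try:
            return ip, connect((ip, port), timeout)
        except OSError as exc:
            # This address is unreachable; remember why and try the next one.
            last_error = exc
    raise ConnectionError(f"no address reachable: {last_error}")
```

With a hypothetical name such as gluster.example.com carrying one A record
per node, usage would look like
`connect_first_available(resolve_all("gluster.example.com", 24007), 24007)`.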
On 11/24/2014 06:23 PM, Paul Robert Marino wrote:
This is simple and can be handled in many ways.
Some background first.
The mount source is a single IP or host name. The only thing the client
uses it for is to download a volfile describing all the bricks in the
cluster. Next, it opens connections to all the nodes containing
bricks for that volume.
So the answer is to tell the client to connect to a virtual IP address.
I personally use keepalived for this, but you can use any one of the
many IPVS or other tools that manage IPs. I assign the VIP
to a primary node, then have each node monitor the cluster processes; if
they die on a node, it goes into a faulted state and cannot own the VIP.
As long as the client is connecting to a running host in the cluster,
you are fine, even if that host doesn't own bricks in the volume but is
aware of them as part of the cluster.
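
The per-node monitoring described above could be sketched roughly as
follows. This is an assumption-laden example, not the actual setup: it
treats "glusterd accepts TCP connections on port 24007" as the health
signal, and the function names and the injectable `connect` parameter are
invented here. Tracking scripts of the kind keepalived runs generally treat
exit status 0 as healthy and non-zero as failed, which is what `main()`
returns.

```python
import socket

# glusterd management port, as seen in the lsof output in this thread.
GLUSTERD_PORT = 24007


def glusterd_is_listening(host="127.0.0.1", port=GLUSTERD_PORT, timeout=2.0,
                          connect=socket.create_connection):
    """Return True if a TCP connection to glusterd can be opened.
    `connect` is injectable (an assumption of this sketch) for testing."""
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Refused, timed out, or unreachable: report the node as unhealthy.
        return False
    conn.close()
    return True


def main(connect=socket.create_connection):
    """Return a shell-style exit status: 0 if healthy, 1 if not."""
    return 0 if glusterd_is_listening(connect=connect) else 1
```

A small wrapper script would call `sys.exit(main())`, and the VIP manager
would be pointed at that script so a node whose check fails gives up the VIP.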
------------------------------------------------------------------------
On Nov 24, 2014 8:07 PM, Eric Ewanco <eric.ewa...@genband.com> wrote:
Hi all,
We’re trying to use gluster as a replicated volume. It works OK when
both peers are up but when one peer is down and the other reboots, the
“surviving” peer does not automount glusterfs. However, once the
boot sequence is complete, it can be mounted without issue. It
automounts fine when the peer is up during startup. I tried to google
this and while I found some similar issues, I haven’t found any
solutions to my problem. Any insight would be appreciated. Thanks.
gluster volume info output (after startup):
Volume Name: rel-vol
Type: Replicate
Volume ID: 90cbe313-e9f9-42d9-a947-802315ab72b0
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.250.1.1:/export/brick1
Brick2: 10.250.1.2:/export/brick1
gluster peer status output (after startup):
Number of Peers: 1
Hostname: 10.250.1.2
Uuid: 8d49b929-4660-4b1e-821b-bfcd6291f516
State: Peer in Cluster (Disconnected)
Original volume create command:
gluster volume create rel-vol rep 2 transport tcp
10.250.1.1:/export/brick1 10.250.1.2:/export/brick1
I am running Gluster 3.4.5 on OpenSuSE 12.2.
gluster --version:
glusterfs 3.4.5 built on Jul 25 2014 08:31:19
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU
General Public License.
The fstab line is:
localhost:/rel-vol /home glusterfs defaults,_netdev 0 0
lsof -i :24007-24100:
COMMAND    PID   USER  FD   TYPE  DEVICE  SIZE/OFF  NODE  NAME
glusterd   4073  root  6u   IPv4  82170   0t0       TCP   s1:24007->s1:1023 (ESTABLISHED)
glusterd   4073  root  9u   IPv4  13816   0t0       TCP   *:24007 (LISTEN)
glusterd   4073  root  10u  IPv4  88106   0t0       TCP   s1:exp2->s2:24007 (SYN_SENT)
glusterfs  4097  root  8u   IPv4  16751   0t0       TCP   s1:1023->s1:24007 (ESTABLISHED)
This is shorter than it is when it works, but maybe that’s because the
mount spawns some more processes.
Some ports are down:
root@q50-s1:/root> telnet localhost 24007
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
telnet> close
Connection closed.
root@q50-s1:/root> telnet localhost 24009
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
ps axww | fgrep glu:
4073 ? Ssl 0:10 /usr/sbin/glusterd -p /run/glusterd.pid
4097 ? Ssl 0:00 /usr/sbin/glusterfsd -s 10.250.1.1
--volfile-id rel-vol.10.250.1.1.export-brick1 -p
/var/lib/glusterd/vols/rel-vol/run/10.250.1.1-export-brick1.pid -S
/var/run/89ba432ed09e07e107723b4b266e18f9.socket --brick-name
/export/brick1 -l /var/log/glusterfs/bricks/export-brick1.log
--xlator-option
*-posix.glusterd-uuid=3b02a581-8fb9-4c6a-8323-9463262f23bc
--brick-port 49152 --xlator-option rel-vol-server.listen-port=49152
5949 ttyS0 S+ 0:00 fgrep glu
These are the error messages I see in /var/log/gluster/home.log (/home
is the mountpoint):
+------------------------------------------------------------------------------+
[2014-11-24 13:51:27.932285] E
[client-handshake.c:1742:client_query_portmap_cbk] 0-rel-vol-client-0:
failed to get the port number for remote subvolume. Please run
'gluster volume status' on server to see if brick process is running.
[2014-11-24 13:51:27.932373] W [socket.c:514:__socket_rwv]
0-rel-vol-client-0: readv failed (No data available)
[2014-11-24 13:51:27.932405] I [client.c:2098:client_rpc_notify]
0-rel-vol-client-0: disconnected
[2014-11-24 13:51:30.818281] E [socket.c:2157:socket_connect_finish]
0-rel-vol-client-1: connection to 10.250.1.2:24007 failed (No route to
host)
[2014-11-24 13:51:30.818313] E [afr-common.c:3735:afr_notify]
0-rel-vol-replicate-0: All subvolumes are down. Going offline until
atleast one of them comes back up.
[2014-11-24 13:51:30.822189] I [fuse-bridge.c:4771:fuse_graph_setup]
0-fuse: switched to graph 0
[2014-11-24 13:51:30.822245] W [socket.c:514:__socket_rwv]
0-rel-vol-client-1: readv failed (No data available)
[2014-11-24 13:51:30.822312] I [fuse-bridge.c:3726:fuse_init]
0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13
kernel 7.18
[2014-11-24 13:51:30.822562] W [fuse-bridge.c:705:fuse_attr_cbk]
0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not
connected)
[2014-11-24 13:51:30.835120] I [fuse-bridge.c:4630:fuse_thread_proc]
0-fuse: unmounting /home
[2014-11-24 13:51:30.835397] W [glusterfsd.c:1002:cleanup_and_exit]
(-->/lib64/libc.so.6(clone+0x6d) [0x7f00f0f682bd]
(-->/lib64/libpthread.so.0(+0x7e0e) [0x7f00f1603e0e]
(-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xc5)
[0x4075f5]))) 0-: received signum (15), shutting down
[2014-11-24 13:51:30.835416] I [fuse-bridge.c:5262:fini] 0-fuse:
Unmounting '/home'.
Relevant section from /var/log/glusterfs/etc-glusterfs-glusterd.vol.log:
[2014-11-24 13:51:27.552371] I [glusterfsd.c:1910:main]
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.4.5
(/usr/sbin/glusterd -p /run/glusterd.pid)
[2014-11-24 13:51:27.574553] I [glusterd.c:961:init] 0-management:
Using /var/lib/glusterd as working directory
[2014-11-24 13:51:27.577734] I [socket.c:3480:socket_init]
0-socket.management: SSL support is NOT enabled
[2014-11-24 13:51:27.577756] I [socket.c:3495:socket_init]
0-socket.management: using system polling thread
[2014-11-24 13:51:27.577834] E
[rpc-transport.c:253:rpc_transport_load] 0-rpc-transport:
/usr/lib64/glusterfs/3.4.5/rpc-transport/rdma.so: cannot open shared
object file: No such file or directory
[2014-11-24 13:51:27.577849] W
[rpc-transport.c:257:rpc_transport_load] 0-rpc-transport: volume
'rdma.management': transport-type 'rdma' is not valid or not found on
this machine
[2014-11-24 13:51:27.577858] W [rpcsvc.c:1389:rpcsvc_transport_create]
0-rpc-service: cannot create listener, initing the transport failed
[2014-11-24 13:51:27.578697] I
[glusterd.c:354:glusterd_check_gsync_present] 0-glusterd:
geo-replication module not installed in the system
[2014-11-24 13:51:27.598907] I
[glusterd-store.c:1339:glusterd_restore_op_version] 0-glusterd:
retrieved op-version: 2
[2014-11-24 13:51:27.607802] E
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown
key: brick-0
[2014-11-24 13:51:27.607837] E
[glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown
key: brick-1
[2014-11-24 13:51:27.809027] I
[glusterd-handler.c:2818:glusterd_friend_add] 0-management: connect
returned 0
[2014-11-24 13:51:27.809098] I
[rpc-clnt.c:962:rpc_clnt_connection_init] 0-management: setting
frame-timeout to 600
[2014-11-24 13:51:27.809150] I [socket.c:3480:socket_init]
0-management: SSL support is NOT enabled
[2014-11-24 13:51:27.809162] I [socket.c:3495:socket_init]
0-management: using system polling thread
[2014-11-24 13:51:27.813801] I [glusterd.c:125:glusterd_uuid_init]
0-management: retrieved UUID: 3b02a581-8fb9-4c6a-8323-9463262f23bc
Given volfile:
+------------------------------------------------------------------------------+
1: volume management
2: type mgmt/glusterd
3: option working-directory /var/lib/glusterd
4: option transport-type socket,rdma
5: option transport.socket.keepalive-time 10
6: option transport.socket.keepalive-interval 2
7: option transport.socket.read-fail-log off
8: # option base-port 49152
9: end-volume
+------------------------------------------------------------------------------+
[2014-11-24 13:51:30.818283] E [socket.c:2157:socket_connect_finish]
0-management: connection to 10.250.1.2:24007 failed (No route to host)
[2014-11-24 13:51:30.820254] I
[rpc-clnt.c:962:rpc_clnt_connection_init] 0-management: setting
frame-timeout to 600
[2014-11-24 13:51:30.820316] I [socket.c:3480:socket_init]
0-management: SSL support is NOT enabled
[2014-11-24 13:51:30.820327] I [socket.c:3495:socket_init]
0-management: using system polling thread
[2014-11-24 13:51:30.820378] W [socket.c:514:__socket_rwv]
0-management: readv failed (No data available)
[2014-11-24 13:51:30.821243] I
[glusterd-utils.c:1079:glusterd_volume_brickinfo_get] 0-management:
Found brick
[2014-11-24 13:51:30.821268] I [socket.c:2236:socket_event_handler]
0-transport: disconnecting now
[2014-11-24 13:51:30.822036] I
[glusterd-utils.c:1079:glusterd_volume_brickinfo_get] 0-management:
Found brick
[2014-11-24 13:51:30.863454] I
[glusterd-pmap.c:227:pmap_registry_bind] 0-pmap: adding brick
/export/brick1 on port 49152
[2014-11-24 13:51:33.824274] W [socket.c:514:__socket_rwv]
0-management: readv failed (No data available)
[2014-11-24 13:51:34.817560] I
[glusterd-utils.c:1079:glusterd_volume_brickinfo_get] 0-management:
Found brick
[2014-11-24 13:51:39.824281] W [socket.c:514:__socket_rwv]
0-management: readv failed (No data available)
[2014-11-24 13:51:42.830260] W [socket.c:514:__socket_rwv]
0-management: readv failed (No data available)
[2014-11-24 13:51:48.832276] W [socket.c:514:__socket_rwv]
0-management: readv failed (No data available)
[ad nauseam...]
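
Since the log excerpts above are hard-wrapped by the mail client, a small
helper can make them easier to scan. The following Python sketch rejoins
wrapped entries and keeps only the error-level ("E") ones. It assumes the
layout seen in this thread, `[timestamp] LEVEL [file:line:function] message`,
and that the input contains only log entries and their continuation lines
(banner lines such as the volfile dump would need to be stripped first).
The function names are invented for this example.

```python
import re

# A new entry starts with a bracketed timestamp such as
# [2014-11-24 13:51:27.932285]; anything else is a wrapped continuation.
ENTRY_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\]")

# Full entry once rejoined: [time] LEVEL [source] rest-of-message
ENTRY = re.compile(
    r"^\[(?P<time>[^\]]+)\] (?P<level>[A-Z]) \[(?P<source>[^\]]+)\] (?P<rest>.*)$"
)


def join_wrapped(lines):
    """Merge hard-wrapped continuation lines back into single log entries."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line):
            entries.append(line)
        elif entries:
            # Continuation of the previous entry; rejoin with one space.
            entries[-1] += " " + line
    return entries


def errors_only(lines):
    """Return (time, source, message) tuples for error-level entries."""
    result = []
    for entry in join_wrapped(lines):
        match = ENTRY.match(entry)
        if match and match.group("level") == "E":
            result.append(
                (match.group("time"), match.group("source"), match.group("rest"))
            )
    return result
```

Feeding it the lines of home.log, for example via
`errors_only(open(path).read().splitlines())`, would surface entries such as
the client_query_portmap_cbk and socket_connect_finish failures without the
surrounding informational noise.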
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users