Ok, now I'm intrigued.

Btw, when I read your initial email I was on my phone. I only got as far as the selinux error before my ADHD got the better of me and I thought, "well, it says what the problem is right there." Sorry about that; otherwise I would have answered at the time.

As it turns out, reading further, the error you're seeing comes from glusterfsd.service (not glusterd.service), which shouldn't even be enabled unless you're trying to use legacy volfiles from 3.0. The "parsing the volfile failed" message was spurious, as you discovered.

As for your current problem...

Are your two machines perhaps connected via crossover cable?

The question comes down to this: when you're on 192.168.253.1 and shut down 192.168.253.2, what prevents .1 from being reached? Is it, perhaps, because it's gone offline? Check dmesg. With .2 down, see if you can ping the .1 address and if you can telnet to port 24007 on .1.
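
If telnet isn't installed on the firewall nodes, the same port check can be sketched in a few lines of Python. This is only a rough equivalent of "telnet 192.168.253.1 24007"; the function name can_connect and the 3 second timeout are my own assumptions, not anything Gluster provides.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    can_connect is a hypothetical helper name; it only proves that something
    accepts TCP connections on that port, not that glusterd is healthy.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, and timeouts.
        return False


# Example call, using the address and port mentioned in this thread:
#   can_connect("192.168.253.1", 24007)
```

Run it on .1 while .2 is shut down. If it returns False against .1's own address, the problem is below Gluster (interface or link state), which is what dmesg should confirm.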

On 07/11/2013 09:46 AM, Greg Scott wrote:
When you first mount your volume, look in the client log and see if it's 
connecting to both bricks. I suspect it's not and that the failure is 
related to firewall settings.
Logs from both nodes below.  For this test, first I did "umount /firewall-scripts" from both 
nodes.   Then I did "mount -av" using the default parameters in my fstab file.  I did **not** turn on the 
backupvolfile-server=<secondary server> for this test.   And then in another window, I did 
"tail /var/log/glusterfs/firewall-scripts.log -f" and you can see the spot where I mounted 
my file system back up again.

Note that everything works as expected when both nodes are online, so this suggests 
everyone can see everyone else when things are steady-state.   Also note that 
backupvolfile-server=<secondary server> changed the behavior - I documented 
this in an earlier post.

...the failure is related to firewall settings.
No way.   I’m wide open on the interface I’m using for heartbeat and glusterfs. 
 In my application, I take node fw1 offline by inserting a firewall rule and 
then getting rid of it a few seconds later.   For testing right now, I just 
insert the rule by hand, look at a bunch of stuff, then get rid of it later.    
But since you brought it up, I cleaned out all firewall rules before doing and 
logging the mounts below.  Near as I can tell, it looks like everyone can see 
everyone else.  And the logs look the same to my eye as they did before I 
dropped all (not relevant) firewall rules.

Log from fw1:

[root@chicago-fw1 ~]#
[root@chicago-fw1 ~]# tail /var/log/glusterfs/firewall-scripts.log -f
[2013-07-11 15:51:54.423508] I [client-handshake.c:1456:client_setvolume_cbk] 
0-firewall-scripts-client-1: Connected to 192.168.253.2:49152, attached to 
remote volume '/gluster-fw2'.
[2013-07-11 15:51:54.423576] I [client-handshake.c:1468:client_setvolume_cbk] 
0-firewall-scripts-client-1: Server and Client lk-version numbers are not same, 
reopening the fds
[2013-07-11 15:51:54.440124] I [fuse-bridge.c:4723:fuse_graph_setup] 0-fuse: 
switched to graph 0
[2013-07-11 15:51:54.440660] I 
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-1: 
Server lk version = 1
[2013-07-11 15:51:54.440886] I [fuse-bridge.c:3680:fuse_init] 0-glusterfs-fuse: 
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.21
[2013-07-11 15:51:54.442235] I 
[afr-common.c:2057:afr_set_root_inode_on_first_lookup] 
0-firewall-scripts-replicate-0: added root inode
[2013-07-11 15:51:54.443451] I [afr-common.c:2120:afr_discovery_cbk] 
0-firewall-scripts-replicate-0: selecting local read_child 
firewall-scripts-client-0
[2013-07-11 16:21:22.729423] I [fuse-bridge.c:4583:fuse_thread_proc] 0-fuse: 
unmounting /firewall-scripts
[2013-07-11 16:21:22.730976] W [glusterfsd.c:970:cleanup_and_exit] 
(-->/usr/lib64/libc.so.6(clone+0x6d) [0x7f7a69fee13d] 
(-->/usr/lib64/libpthread.so.0(+0x33c1607c53) [0x7f7a6a684c53] 
(-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7f7a6b372e35]))) 0-: received 
signum (15), shutting down
[2013-07-11 16:21:22.731040] I [fuse-bridge.c:5212:fini] 0-fuse: Unmounting 
'/firewall-scripts'.


Blank space - mount -av below.

[2013-07-11 16:39:36.625696] I [glusterfsd.c:1878:main] 0-/usr/sbin/glusterfs: 
Started running /usr/sbin/glusterfs version 3.4.0beta3 (/usr/sbin/glusterfs 
--volfile-id=/firewall-scripts --volfile-server=192.168.253.1 /firewall-scripts)
[2013-07-11 16:39:36.640661] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
support is NOT enabled
[2013-07-11 16:39:36.640800] I [socket.c:3495:socket_init] 0-glusterfs: using 
system polling thread
[2013-07-11 16:39:36.672416] I [socket.c:3480:socket_init] 
0-firewall-scripts-client-1: SSL support is NOT enabled
[2013-07-11 16:39:36.672539] I [socket.c:3495:socket_init] 
0-firewall-scripts-client-1: using system polling thread
[2013-07-11 16:39:36.674545] I [socket.c:3480:socket_init] 
0-firewall-scripts-client-0: SSL support is NOT enabled
[2013-07-11 16:39:36.674667] I [socket.c:3495:socket_init] 
0-firewall-scripts-client-0: using system polling thread
[2013-07-11 16:39:36.675015] I [client.c:2154:notify] 
0-firewall-scripts-client-0: parent translators are ready, attempting connect 
on transport
[2013-07-11 16:39:36.686253] I [client.c:2154:notify] 
0-firewall-scripts-client-1: parent translators are ready, attempting connect 
on transport
Given volfile:
+------------------------------------------------------------------------------+
   1: volume firewall-scripts-client-0
   2:     type protocol/client
   3:     option password fb3955b7-a6ca-49bb-b886-d4b6609392f8
   4:     option username de6eacd1-31bc-4bdb-a049-776cd840059e
   5:     option transport-type tcp
   6:     option remote-subvolume /gluster-fw1
   7:     option remote-host 192.168.253.1
   8: end-volume
   9:
  10: volume firewall-scripts-client-1
  11:     type protocol/client
  12:     option password fb3955b7-a6ca-49bb-b886-d4b6609392f8
  13:     option username de6eacd1-31bc-4bdb-a049-776cd840059e
  14:     option transport-type tcp
  15:     option remote-subvolume /gluster-fw2
  16:     option remote-host 192.168.253.2
  17: end-volume
  18:
  19: volume firewall-scripts-replicate-0
  20:     type cluster/replicate
  21:     subvolumes firewall-scripts-client-0 firewall-scripts-client-1
  22: end-volume
  23:
  24: volume firewall-scripts-dht
  25:     type cluster/distribute
  26:     subvolumes firewall-scripts-replicate-0
  27: end-volume
  28:
  29: volume firewall-scripts-write-behind
  30:     type performance/write-behind
  31:     subvolumes firewall-scripts-dht
  32: end-volume
  33:
  34: volume firewall-scripts-read-ahead
  35:     type performance/read-ahead
  36:     subvolumes firewall-scripts-write-behind
  37: end-volume
  38:
  39: volume firewall-scripts-io-cache
  40:     type performance/io-cache
  41:     subvolumes firewall-scripts-read-ahead
  42: end-volume
  43:
  44: volume firewall-scripts-quick-read
  45:     type performance/quick-read
  46:     subvolumes firewall-scripts-io-cache
  47: end-volume
  48:
  49: volume firewall-scripts-open-behind
  50:     type performance/open-behind
  51:     subvolumes firewall-scripts-quick-read
  52: end-volume
  53:
  54: volume firewall-scripts-md-cache
  55:     type performance/md-cache
  56:     subvolumes firewall-scripts-open-behind
  57: end-volume
  58:
  59: volume firewall-scripts
  60:     type debug/io-stats
  61:     option count-fop-hits off
  62:     option latency-measurement off
  63:     subvolumes firewall-scripts-md-cache
  64: end-volume

+------------------------------------------------------------------------------+
[2013-07-11 16:39:36.698740] I [rpc-clnt.c:1648:rpc_clnt_reconfig] 
0-firewall-scripts-client-0: changing port to 49152 (from 0)
[2013-07-11 16:39:36.698974] W [socket.c:514:__socket_rwv] 
0-firewall-scripts-client-0: readv failed (No data available)
[2013-07-11 16:39:36.711537] I [rpc-clnt.c:1648:rpc_clnt_reconfig] 
0-firewall-scripts-client-1: changing port to 49152 (from 0)
[2013-07-11 16:39:36.711717] W [socket.c:514:__socket_rwv] 
0-firewall-scripts-client-1: readv failed (No data available)
[2013-07-11 16:39:36.723116] I 
[client-handshake.c:1658:select_server_supported_programs] 
0-firewall-scripts-client-0: Using Program GlusterFS 3.3, Num (1298437), 
Version (330)
[2013-07-11 16:39:36.723521] I 
[client-handshake.c:1658:select_server_supported_programs] 
0-firewall-scripts-client-1: Using Program GlusterFS 3.3, Num (1298437), 
Version (330)
[2013-07-11 16:39:36.723913] I [client-handshake.c:1456:client_setvolume_cbk] 
0-firewall-scripts-client-0: Connected to 192.168.253.1:49152, attached to 
remote volume '/gluster-fw1'.
[2013-07-11 16:39:36.723995] I [client-handshake.c:1468:client_setvolume_cbk] 
0-firewall-scripts-client-0: Server and Client lk-version numbers are not same, 
reopening the fds
[2013-07-11 16:39:36.724390] I [afr-common.c:3698:afr_notify] 
0-firewall-scripts-replicate-0: Subvolume 'firewall-scripts-client-0' came back 
up; going online.
[2013-07-11 16:39:36.724601] I 
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-0: 
Server lk version = 1
[2013-07-11 16:39:36.724730] I [client-handshake.c:1456:client_setvolume_cbk] 
0-firewall-scripts-client-1: Connected to 192.168.253.2:49152, attached to 
remote volume '/gluster-fw2'.
[2013-07-11 16:39:36.724788] I [client-handshake.c:1468:client_setvolume_cbk] 
0-firewall-scripts-client-1: Server and Client lk-version numbers are not same, 
reopening the fds
[2013-07-11 16:39:36.737359] I [fuse-bridge.c:4723:fuse_graph_setup] 0-fuse: 
switched to graph 0
[2013-07-11 16:39:36.739297] I 
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-1: 
Server lk version = 1
[2013-07-11 16:39:36.739486] I [fuse-bridge.c:3680:fuse_init] 0-glusterfs-fuse: 
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.21
[2013-07-11 16:39:36.740672] I 
[afr-common.c:2057:afr_set_root_inode_on_first_lookup] 
0-firewall-scripts-replicate-0: added root inode
[2013-07-11 16:39:36.741820] I [afr-common.c:2120:afr_discovery_cbk] 
0-firewall-scripts-replicate-0: selecting local read_child 
firewall-scripts-client-0

And from fw2:

[root@chicago-fw2 ~]# tail /var/log/glusterfs/firewall-scripts.log -f
[2013-07-11 15:51:45.499012] I [client-handshake.c:1468:client_setvolume_cbk] 
0-firewall-scripts-client-1: Server and Client lk-version numbers are not same, 
reopening the fds
[2013-07-11 15:51:45.512667] I [fuse-bridge.c:4723:fuse_graph_setup] 0-fuse: 
switched to graph 0
[2013-07-11 15:51:45.513211] I 
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-0: 
Server lk version = 1
[2013-07-11 15:51:45.513416] I 
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-1: 
Server lk version = 1
[2013-07-11 15:51:45.513538] I [fuse-bridge.c:3680:fuse_init] 0-glusterfs-fuse: 
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.21
[2013-07-11 15:51:45.515208] I 
[afr-common.c:2057:afr_set_root_inode_on_first_lookup] 
0-firewall-scripts-replicate-0: added root inode
[2013-07-11 15:51:45.516512] I [afr-common.c:2120:afr_discovery_cbk] 
0-firewall-scripts-replicate-0: selecting local read_child 
firewall-scripts-client-1
[2013-07-11 16:21:28.150710] I [fuse-bridge.c:4583:fuse_thread_proc] 0-fuse: 
unmounting /firewall-scripts
[2013-07-11 16:21:28.154455] W [glusterfsd.c:970:cleanup_and_exit] 
(-->/usr/lib64/libc.so.6(clone+0x6d) [0x7fa599ad613d] 
(-->/usr/lib64/libpthread.so.0(+0x3c1b407c53) [0x7fa59a16cc53] 
(-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7fa59ae5ae35]))) 0-: received 
signum (15), shutting down
[2013-07-11 16:21:28.154503] I [fuse-bridge.c:5212:fini] 0-fuse: Unmounting 
'/firewall-scripts'.


Blank space - this is where I did mount -av

[2013-07-11 16:39:35.100584] I [glusterfsd.c:1878:main] 0-/usr/sbin/glusterfs: 
Started running /usr/sbin/glusterfs version 3.4.0beta3 (/usr/sbin/glusterfs 
--volfile-id=/firewall-scripts --volfile-server=192.168.253.2 /firewall-scripts)
[2013-07-11 16:39:35.113481] I [socket.c:3480:socket_init] 0-glusterfs: SSL 
support is NOT enabled
[2013-07-11 16:39:35.113614] I [socket.c:3495:socket_init] 0-glusterfs: using 
system polling thread
[2013-07-11 16:39:35.147118] I [socket.c:3480:socket_init] 
0-firewall-scripts-client-1: SSL support is NOT enabled
[2013-07-11 16:39:35.147313] I [socket.c:3495:socket_init] 
0-firewall-scripts-client-1: using system polling thread
[2013-07-11 16:39:35.149112] I [socket.c:3480:socket_init] 
0-firewall-scripts-client-0: SSL support is NOT enabled
[2013-07-11 16:39:35.149268] I [socket.c:3495:socket_init] 
0-firewall-scripts-client-0: using system polling thread
[2013-07-11 16:39:35.149390] I [client.c:2154:notify] 
0-firewall-scripts-client-0: parent translators are ready, attempting connect 
on transport
[2013-07-11 16:39:35.160491] I [client.c:2154:notify] 
0-firewall-scripts-client-1: parent translators are ready, attempting connect 
on transport
Given volfile:
+------------------------------------------------------------------------------+
   1: volume firewall-scripts-client-0
   2:     type protocol/client
   3:     option password fb3955b7-a6ca-49bb-b886-d4b6609392f8
   4:     option username de6eacd1-31bc-4bdb-a049-776cd840059e
   5:     option transport-type tcp
   6:     option remote-subvolume /gluster-fw1
   7:     option remote-host 192.168.253.1
   8: end-volume
   9:
  10: volume firewall-scripts-client-1
  11:     type protocol/client
  12:     option password fb3955b7-a6ca-49bb-b886-d4b6609392f8
  13:     option username de6eacd1-31bc-4bdb-a049-776cd840059e
  14:     option transport-type tcp
  15:     option remote-subvolume /gluster-fw2
  16:     option remote-host 192.168.253.2
  17: end-volume
  18:
  19: volume firewall-scripts-replicate-0
  20:     type cluster/replicate
  21:     subvolumes firewall-scripts-client-0 firewall-scripts-client-1
  22: end-volume
  23:
  24: volume firewall-scripts-dht
  25:     type cluster/distribute
  26:     subvolumes firewall-scripts-replicate-0
  27: end-volume
  28:
  29: volume firewall-scripts-write-behind
  30:     type performance/write-behind
  31:     subvolumes firewall-scripts-dht
  32: end-volume
  33:
  34: volume firewall-scripts-read-ahead
  35:     type performance/read-ahead
  36:     subvolumes firewall-scripts-write-behind
  37: end-volume
  38:
  39: volume firewall-scripts-io-cache
  40:     type performance/io-cache
  41:     subvolumes firewall-scripts-read-ahead
  42: end-volume
  43:
  44: volume firewall-scripts-quick-read
  45:     type performance/quick-read
  46:     subvolumes firewall-scripts-io-cache
  47: end-volume
  48:
  49: volume firewall-scripts-open-behind
  50:     type performance/open-behind
  51:     subvolumes firewall-scripts-quick-read
  52: end-volume
  53:
  54: volume firewall-scripts-md-cache
  55:     type performance/md-cache
  56:     subvolumes firewall-scripts-open-behind
  57: end-volume
  58:
  59: volume firewall-scripts
  60:     type debug/io-stats
  61:     option count-fop-hits off
  62:     option latency-measurement off
  63:     subvolumes firewall-scripts-md-cache
  64: end-volume

+------------------------------------------------------------------------------+
[2013-07-11 16:39:35.173867] I [rpc-clnt.c:1648:rpc_clnt_reconfig] 
0-firewall-scripts-client-0: changing port to 49152 (from 0)
[2013-07-11 16:39:35.174065] I [rpc-clnt.c:1648:rpc_clnt_reconfig] 
0-firewall-scripts-client-1: changing port to 49152 (from 0)
[2013-07-11 16:39:35.174377] W [socket.c:514:__socket_rwv] 
0-firewall-scripts-client-0: readv failed (No data available)
[2013-07-11 16:39:35.185807] W [socket.c:514:__socket_rwv] 
0-firewall-scripts-client-1: readv failed (No data available)
[2013-07-11 16:39:35.197485] I 
[client-handshake.c:1658:select_server_supported_programs] 
0-firewall-scripts-client-0: Using Program GlusterFS 3.3, Num (1298437), 
Version (330)
[2013-07-11 16:39:35.197740] I 
[client-handshake.c:1658:select_server_supported_programs] 
0-firewall-scripts-client-1: Using Program GlusterFS 3.3, Num (1298437), 
Version (330)
[2013-07-11 16:39:35.198257] I [client-handshake.c:1456:client_setvolume_cbk] 
0-firewall-scripts-client-0: Connected to 192.168.253.1:49152, attached to 
remote volume '/gluster-fw1'.
[2013-07-11 16:39:35.198346] I [client-handshake.c:1468:client_setvolume_cbk] 
0-firewall-scripts-client-0: Server and Client lk-version numbers are not same, 
reopening the fds
[2013-07-11 16:39:35.198546] I [afr-common.c:3698:afr_notify] 
0-firewall-scripts-replicate-0: Subvolume 'firewall-scripts-client-0' came back 
up; going online.
[2013-07-11 16:39:35.198759] I [client-handshake.c:1456:client_setvolume_cbk] 
0-firewall-scripts-client-1: Connected to 192.168.253.2:49152, attached to 
remote volume '/gluster-fw2'.
[2013-07-11 16:39:35.198810] I [client-handshake.c:1468:client_setvolume_cbk] 
0-firewall-scripts-client-1: Server and Client lk-version numbers are not same, 
reopening the fds
[2013-07-11 16:39:35.211534] I [fuse-bridge.c:4723:fuse_graph_setup] 0-fuse: 
switched to graph 0
[2013-07-11 16:39:35.211921] I 
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-1: 
Server lk version = 1
[2013-07-11 16:39:35.212098] I 
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-0: 
Server lk version = 1
[2013-07-11 16:39:35.212234] I [fuse-bridge.c:3680:fuse_init] 0-glusterfs-fuse: 
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.21
[2013-07-11 16:39:35.213421] I 
[afr-common.c:2057:afr_set_root_inode_on_first_lookup] 
0-firewall-scripts-replicate-0: added root inode
[2013-07-11 16:39:35.214372] I [afr-common.c:2120:afr_discovery_cbk] 
0-firewall-scripts-replicate-0: selecting local read_child 
firewall-scripts-client-1
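
For anyone wanting to check a client log for the "is it connecting to both bricks" question without reading it by eye, here is a minimal sketch in Python. It assumes the "Connected to <host>:<port>, attached to remote volume '<brick>'" message format seen in the logs above; the function name connected_bricks and the regular expression are illustrative assumptions, not part of GlusterFS.

```python
import re

# Pattern based on the client_setvolume_cbk lines shown in the logs above.
# The leading "0-" style graph prefix is skipped; the rest is the client name.
CONNECTED = re.compile(
    r"\d+-(?P<client>\S+?): Connected to (?P<host>[\d.]+):(?P<port>\d+), "
    r"attached to remote volume '(?P<brick>[^']+)'"
)


def connected_bricks(log_text: str) -> dict:
    """Map each client translator name to the (host, port, brick) it reached."""
    # Logs pasted into email are hard-wrapped, so collapse all whitespace
    # first; a message split across lines then matches as one string.
    flat = " ".join(log_text.split())
    return {
        m["client"]: (m["host"], int(m["port"]), m["brick"])
        for m in CONNECTED.finditer(flat)
    }
```

On a replica 2 volume like this one, a result with two entries (one per brick) means the client reached both bricks at mount time; a single entry would point at a connectivity problem to the missing host.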


_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users