Ok, now I'm intrigued.
btw... when I read your initial email I was on my phone. I only got as
far as the selinux error before my ADHD got the better of me and I
thought, "well it says what the problem is right there." Sorry, or I
would have answered at that time.
As it turns out, reading further, the error you're seeing comes from
glusterfsd.service (not glusterd.service), which shouldn't even be
enabled unless you're trying to use old legacy volfiles from 3.0. The
"parsing the volfile failed" message was spurious, as you discovered.
As for your current problem...
Are your two machines perhaps connected via crossover cable?
The question comes down to this: when you're on 192.168.253.1 and shut down
192.168.253.2, what prevents .1 from being reached? Is it, perhaps,
because its interface has gone offline? Check dmesg. See if you can ping the
.1 address (when .2 is down) and see if you can telnet to port 24007 on .1.
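
If telnet isn't installed on the firewall nodes, the same port check can be
done with a few lines of Python. This is only a rough sketch: the function
name port_is_reachable and the 3 second timeout are my own choices for
illustration, not anything that ships with gluster. It only tells you whether
a TCP connection to the port can be opened, nothing more.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host, connects, and raises OSError
        # (refused, unreachable, timed out) if the connection cannot be made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example, run on .1 while .2 is shut down:
#   port_is_reachable("192.168.253.1", 24007)
```

A False result while .2 is down would point at the interface or address
going away on .1 rather than at gluster itself.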
On 07/11/2013 09:46 AM, Greg Scott wrote:
When you first mount your volume, look in the client log and see if it's
connecting to both bricks.
I suspect it's not and that the failure is related to firewall settings.
Logs from both nodes below. For this test, first I did "umount /firewall-scripts" from both
nodes. Then I did "mount -av" using the default parameters in my fstab file. I did **not** turn on the
backupvolfile-server=<secondary server> option for this test. And then in another window, I did
"tail /var/log/glusterfs/firewall-scripts.log -f" and you can see the spot where I mounted
my file system back up again.
Note that everything works as expected when both nodes are online, so this suggests
everyone can see everyone else when things are steady-state. Also note that
backupvolfile-server=<secondary server> changed the behavior - I documented
this in an earlier post.
...the failure is related to firewall settings.
No way. I’m wide open on the interface I’m using for heartbeat and glusterfs.
In my application, I take node fw1 offline by inserting a firewall rule and
then getting rid of it a few seconds later. For testing right now, I just
insert the rule by hand, look at a bunch of stuff, then get rid of it later.
But since you brought it up, I cleaned out all firewall rules before doing and
logging the mounts below. Near as I can tell, it looks like everyone can see
everyone else. And the logs look the same to my eye as they did before I
dropped all (not relevant) firewall rules.
Log from fw1:
[root@chicago-fw1 ~]#
[root@chicago-fw1 ~]# tail /var/log/glusterfs/firewall-scripts.log -f
[2013-07-11 15:51:54.423508] I [client-handshake.c:1456:client_setvolume_cbk]
0-firewall-scripts-client-1: Connected to 192.168.253.2:49152, attached to
remote volume '/gluster-fw2'.
[2013-07-11 15:51:54.423576] I [client-handshake.c:1468:client_setvolume_cbk]
0-firewall-scripts-client-1: Server and Client lk-version numbers are not same,
reopening the fds
[2013-07-11 15:51:54.440124] I [fuse-bridge.c:4723:fuse_graph_setup] 0-fuse:
switched to graph 0
[2013-07-11 15:51:54.440660] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-1:
Server lk version = 1
[2013-07-11 15:51:54.440886] I [fuse-bridge.c:3680:fuse_init] 0-glusterfs-fuse:
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.21
[2013-07-11 15:51:54.442235] I
[afr-common.c:2057:afr_set_root_inode_on_first_lookup]
0-firewall-scripts-replicate-0: added root inode
[2013-07-11 15:51:54.443451] I [afr-common.c:2120:afr_discovery_cbk]
0-firewall-scripts-replicate-0: selecting local read_child
firewall-scripts-client-0
[2013-07-11 16:21:22.729423] I [fuse-bridge.c:4583:fuse_thread_proc] 0-fuse:
unmounting /firewall-scripts
[2013-07-11 16:21:22.730976] W [glusterfsd.c:970:cleanup_and_exit]
(-->/usr/lib64/libc.so.6(clone+0x6d) [0x7f7a69fee13d]
(-->/usr/lib64/libpthread.so.0(+0x33c1607c53) [0x7f7a6a684c53]
(-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7f7a6b372e35]))) 0-: received
signum (15), shutting down
[2013-07-11 16:21:22.731040] I [fuse-bridge.c:5212:fini] 0-fuse: Unmounting
'/firewall-scripts'.
Blank space - mount -av below.
[2013-07-11 16:39:36.625696] I [glusterfsd.c:1878:main] 0-/usr/sbin/glusterfs:
Started running /usr/sbin/glusterfs version 3.4.0beta3 (/usr/sbin/glusterfs
--volfile-id=/firewall-scripts --volfile-server=192.168.253.1 /firewall-scripts)
[2013-07-11 16:39:36.640661] I [socket.c:3480:socket_init] 0-glusterfs: SSL
support is NOT enabled
[2013-07-11 16:39:36.640800] I [socket.c:3495:socket_init] 0-glusterfs: using
system polling thread
[2013-07-11 16:39:36.672416] I [socket.c:3480:socket_init]
0-firewall-scripts-client-1: SSL support is NOT enabled
[2013-07-11 16:39:36.672539] I [socket.c:3495:socket_init]
0-firewall-scripts-client-1: using system polling thread
[2013-07-11 16:39:36.674545] I [socket.c:3480:socket_init]
0-firewall-scripts-client-0: SSL support is NOT enabled
[2013-07-11 16:39:36.674667] I [socket.c:3495:socket_init]
0-firewall-scripts-client-0: using system polling thread
[2013-07-11 16:39:36.675015] I [client.c:2154:notify]
0-firewall-scripts-client-0: parent translators are ready, attempting connect
on transport
[2013-07-11 16:39:36.686253] I [client.c:2154:notify]
0-firewall-scripts-client-1: parent translators are ready, attempting connect
on transport
Given volfile:
+------------------------------------------------------------------------------+
1: volume firewall-scripts-client-0
2: type protocol/client
3: option password fb3955b7-a6ca-49bb-b886-d4b6609392f8
4: option username de6eacd1-31bc-4bdb-a049-776cd840059e
5: option transport-type tcp
6: option remote-subvolume /gluster-fw1
7: option remote-host 192.168.253.1
8: end-volume
9:
10: volume firewall-scripts-client-1
11: type protocol/client
12: option password fb3955b7-a6ca-49bb-b886-d4b6609392f8
13: option username de6eacd1-31bc-4bdb-a049-776cd840059e
14: option transport-type tcp
15: option remote-subvolume /gluster-fw2
16: option remote-host 192.168.253.2
17: end-volume
18:
19: volume firewall-scripts-replicate-0
20: type cluster/replicate
21: subvolumes firewall-scripts-client-0 firewall-scripts-client-1
22: end-volume
23:
24: volume firewall-scripts-dht
25: type cluster/distribute
26: subvolumes firewall-scripts-replicate-0
27: end-volume
28:
29: volume firewall-scripts-write-behind
30: type performance/write-behind
31: subvolumes firewall-scripts-dht
32: end-volume
33:
34: volume firewall-scripts-read-ahead
35: type performance/read-ahead
36: subvolumes firewall-scripts-write-behind
37: end-volume
38:
39: volume firewall-scripts-io-cache
40: type performance/io-cache
41: subvolumes firewall-scripts-read-ahead
42: end-volume
43:
44: volume firewall-scripts-quick-read
45: type performance/quick-read
46: subvolumes firewall-scripts-io-cache
47: end-volume
48:
49: volume firewall-scripts-open-behind
50: type performance/open-behind
51: subvolumes firewall-scripts-quick-read
52: end-volume
53:
54: volume firewall-scripts-md-cache
55: type performance/md-cache
56: subvolumes firewall-scripts-open-behind
57: end-volume
58:
59: volume firewall-scripts
60: type debug/io-stats
61: option count-fop-hits off
62: option latency-measurement off
63: subvolumes firewall-scripts-md-cache
64: end-volume
+------------------------------------------------------------------------------+
[2013-07-11 16:39:36.698740] I [rpc-clnt.c:1648:rpc_clnt_reconfig]
0-firewall-scripts-client-0: changing port to 49152 (from 0)
[2013-07-11 16:39:36.698974] W [socket.c:514:__socket_rwv]
0-firewall-scripts-client-0: readv failed (No data available)
[2013-07-11 16:39:36.711537] I [rpc-clnt.c:1648:rpc_clnt_reconfig]
0-firewall-scripts-client-1: changing port to 49152 (from 0)
[2013-07-11 16:39:36.711717] W [socket.c:514:__socket_rwv]
0-firewall-scripts-client-1: readv failed (No data available)
[2013-07-11 16:39:36.723116] I
[client-handshake.c:1658:select_server_supported_programs]
0-firewall-scripts-client-0: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
[2013-07-11 16:39:36.723521] I
[client-handshake.c:1658:select_server_supported_programs]
0-firewall-scripts-client-1: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
[2013-07-11 16:39:36.723913] I [client-handshake.c:1456:client_setvolume_cbk]
0-firewall-scripts-client-0: Connected to 192.168.253.1:49152, attached to
remote volume '/gluster-fw1'.
[2013-07-11 16:39:36.723995] I [client-handshake.c:1468:client_setvolume_cbk]
0-firewall-scripts-client-0: Server and Client lk-version numbers are not same,
reopening the fds
[2013-07-11 16:39:36.724390] I [afr-common.c:3698:afr_notify]
0-firewall-scripts-replicate-0: Subvolume 'firewall-scripts-client-0' came back
up; going online.
[2013-07-11 16:39:36.724601] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-0:
Server lk version = 1
[2013-07-11 16:39:36.724730] I [client-handshake.c:1456:client_setvolume_cbk]
0-firewall-scripts-client-1: Connected to 192.168.253.2:49152, attached to
remote volume '/gluster-fw2'.
[2013-07-11 16:39:36.724788] I [client-handshake.c:1468:client_setvolume_cbk]
0-firewall-scripts-client-1: Server and Client lk-version numbers are not same,
reopening the fds
[2013-07-11 16:39:36.737359] I [fuse-bridge.c:4723:fuse_graph_setup] 0-fuse:
switched to graph 0
[2013-07-11 16:39:36.739297] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-1:
Server lk version = 1
[2013-07-11 16:39:36.739486] I [fuse-bridge.c:3680:fuse_init] 0-glusterfs-fuse:
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.21
[2013-07-11 16:39:36.740672] I
[afr-common.c:2057:afr_set_root_inode_on_first_lookup]
0-firewall-scripts-replicate-0: added root inode
[2013-07-11 16:39:36.741820] I [afr-common.c:2120:afr_discovery_cbk]
0-firewall-scripts-replicate-0: selecting local read_child
firewall-scripts-client-0
And from fw2:
[root@chicago-fw2 ~]# tail /var/log/glusterfs/firewall-scripts.log -f
[2013-07-11 15:51:45.499012] I [client-handshake.c:1468:client_setvolume_cbk]
0-firewall-scripts-client-1: Server and Client lk-version numbers are not same,
reopening the fds
[2013-07-11 15:51:45.512667] I [fuse-bridge.c:4723:fuse_graph_setup] 0-fuse:
switched to graph 0
[2013-07-11 15:51:45.513211] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-0:
Server lk version = 1
[2013-07-11 15:51:45.513416] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-1:
Server lk version = 1
[2013-07-11 15:51:45.513538] I [fuse-bridge.c:3680:fuse_init] 0-glusterfs-fuse:
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.21
[2013-07-11 15:51:45.515208] I
[afr-common.c:2057:afr_set_root_inode_on_first_lookup]
0-firewall-scripts-replicate-0: added root inode
[2013-07-11 15:51:45.516512] I [afr-common.c:2120:afr_discovery_cbk]
0-firewall-scripts-replicate-0: selecting local read_child
firewall-scripts-client-1
[2013-07-11 16:21:28.150710] I [fuse-bridge.c:4583:fuse_thread_proc] 0-fuse:
unmounting /firewall-scripts
[2013-07-11 16:21:28.154455] W [glusterfsd.c:970:cleanup_and_exit]
(-->/usr/lib64/libc.so.6(clone+0x6d) [0x7fa599ad613d]
(-->/usr/lib64/libpthread.so.0(+0x3c1b407c53) [0x7fa59a16cc53]
(-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xd5) [0x7fa59ae5ae35]))) 0-: received
signum (15), shutting down
[2013-07-11 16:21:28.154503] I [fuse-bridge.c:5212:fini] 0-fuse: Unmounting
'/firewall-scripts'.
Blank space - this is where I did mount -av
[2013-07-11 16:39:35.100584] I [glusterfsd.c:1878:main] 0-/usr/sbin/glusterfs:
Started running /usr/sbin/glusterfs version 3.4.0beta3 (/usr/sbin/glusterfs
--volfile-id=/firewall-scripts --volfile-server=192.168.253.2 /firewall-scripts)
[2013-07-11 16:39:35.113481] I [socket.c:3480:socket_init] 0-glusterfs: SSL
support is NOT enabled
[2013-07-11 16:39:35.113614] I [socket.c:3495:socket_init] 0-glusterfs: using
system polling thread
[2013-07-11 16:39:35.147118] I [socket.c:3480:socket_init]
0-firewall-scripts-client-1: SSL support is NOT enabled
[2013-07-11 16:39:35.147313] I [socket.c:3495:socket_init]
0-firewall-scripts-client-1: using system polling thread
[2013-07-11 16:39:35.149112] I [socket.c:3480:socket_init]
0-firewall-scripts-client-0: SSL support is NOT enabled
[2013-07-11 16:39:35.149268] I [socket.c:3495:socket_init]
0-firewall-scripts-client-0: using system polling thread
[2013-07-11 16:39:35.149390] I [client.c:2154:notify]
0-firewall-scripts-client-0: parent translators are ready, attempting connect
on transport
[2013-07-11 16:39:35.160491] I [client.c:2154:notify]
0-firewall-scripts-client-1: parent translators are ready, attempting connect
on transport
Given volfile:
+------------------------------------------------------------------------------+
1: volume firewall-scripts-client-0
2: type protocol/client
3: option password fb3955b7-a6ca-49bb-b886-d4b6609392f8
4: option username de6eacd1-31bc-4bdb-a049-776cd840059e
5: option transport-type tcp
6: option remote-subvolume /gluster-fw1
7: option remote-host 192.168.253.1
8: end-volume
9:
10: volume firewall-scripts-client-1
11: type protocol/client
12: option password fb3955b7-a6ca-49bb-b886-d4b6609392f8
13: option username de6eacd1-31bc-4bdb-a049-776cd840059e
14: option transport-type tcp
15: option remote-subvolume /gluster-fw2
16: option remote-host 192.168.253.2
17: end-volume
18:
19: volume firewall-scripts-replicate-0
20: type cluster/replicate
21: subvolumes firewall-scripts-client-0 firewall-scripts-client-1
22: end-volume
23:
24: volume firewall-scripts-dht
25: type cluster/distribute
26: subvolumes firewall-scripts-replicate-0
27: end-volume
28:
29: volume firewall-scripts-write-behind
30: type performance/write-behind
31: subvolumes firewall-scripts-dht
32: end-volume
33:
34: volume firewall-scripts-read-ahead
35: type performance/read-ahead
36: subvolumes firewall-scripts-write-behind
37: end-volume
38:
39: volume firewall-scripts-io-cache
40: type performance/io-cache
41: subvolumes firewall-scripts-read-ahead
42: end-volume
43:
44: volume firewall-scripts-quick-read
45: type performance/quick-read
46: subvolumes firewall-scripts-io-cache
47: end-volume
48:
49: volume firewall-scripts-open-behind
50: type performance/open-behind
51: subvolumes firewall-scripts-quick-read
52: end-volume
53:
54: volume firewall-scripts-md-cache
55: type performance/md-cache
56: subvolumes firewall-scripts-open-behind
57: end-volume
58:
59: volume firewall-scripts
60: type debug/io-stats
61: option count-fop-hits off
62: option latency-measurement off
63: subvolumes firewall-scripts-md-cache
64: end-volume
+------------------------------------------------------------------------------+
[2013-07-11 16:39:35.173867] I [rpc-clnt.c:1648:rpc_clnt_reconfig]
0-firewall-scripts-client-0: changing port to 49152 (from 0)
[2013-07-11 16:39:35.174065] I [rpc-clnt.c:1648:rpc_clnt_reconfig]
0-firewall-scripts-client-1: changing port to 49152 (from 0)
[2013-07-11 16:39:35.174377] W [socket.c:514:__socket_rwv]
0-firewall-scripts-client-0: readv failed (No data available)
[2013-07-11 16:39:35.185807] W [socket.c:514:__socket_rwv]
0-firewall-scripts-client-1: readv failed (No data available)
[2013-07-11 16:39:35.197485] I
[client-handshake.c:1658:select_server_supported_programs]
0-firewall-scripts-client-0: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
[2013-07-11 16:39:35.197740] I
[client-handshake.c:1658:select_server_supported_programs]
0-firewall-scripts-client-1: Using Program GlusterFS 3.3, Num (1298437),
Version (330)
[2013-07-11 16:39:35.198257] I [client-handshake.c:1456:client_setvolume_cbk]
0-firewall-scripts-client-0: Connected to 192.168.253.1:49152, attached to
remote volume '/gluster-fw1'.
[2013-07-11 16:39:35.198346] I [client-handshake.c:1468:client_setvolume_cbk]
0-firewall-scripts-client-0: Server and Client lk-version numbers are not same,
reopening the fds
[2013-07-11 16:39:35.198546] I [afr-common.c:3698:afr_notify]
0-firewall-scripts-replicate-0: Subvolume 'firewall-scripts-client-0' came back
up; going online.
[2013-07-11 16:39:35.198759] I [client-handshake.c:1456:client_setvolume_cbk]
0-firewall-scripts-client-1: Connected to 192.168.253.2:49152, attached to
remote volume '/gluster-fw2'.
[2013-07-11 16:39:35.198810] I [client-handshake.c:1468:client_setvolume_cbk]
0-firewall-scripts-client-1: Server and Client lk-version numbers are not same,
reopening the fds
[2013-07-11 16:39:35.211534] I [fuse-bridge.c:4723:fuse_graph_setup] 0-fuse:
switched to graph 0
[2013-07-11 16:39:35.211921] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-1:
Server lk version = 1
[2013-07-11 16:39:35.212098] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-firewall-scripts-client-0:
Server lk version = 1
[2013-07-11 16:39:35.212234] I [fuse-bridge.c:3680:fuse_init] 0-glusterfs-fuse:
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.21
[2013-07-11 16:39:35.213421] I
[afr-common.c:2057:afr_set_root_inode_on_first_lookup]
0-firewall-scripts-replicate-0: added root inode
[2013-07-11 16:39:35.214372] I [afr-common.c:2120:afr_discovery_cbk]
0-firewall-scripts-replicate-0: selecting local read_child
firewall-scripts-client-1
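
For anyone who wants to check "is the client connecting to both bricks"
without reading the whole log by eye, here is a small sketch that pulls the
"Connected to ... attached to remote volume" handshake lines out of a client
log. The function name connected_bricks and the regular expression are my own,
written against the message wording in the logs above; other gluster versions
may word the message differently.

```python
import re

# Matches the client_setvolume_cbk message as it appears in the logs above.
CONNECTED = re.compile(
    r"(\S+): Connected to (\d+(?:\.\d+){3}):(\d+), "
    r"attached to remote volume '([^']+)'"
)


def connected_bricks(log_text):
    """Map each client translator name to the (host, port, brick) it attached to."""
    # Log entries pasted into mail are wrapped across lines, so collapse all
    # whitespace to single spaces before matching.
    flat = " ".join(log_text.split())
    return {
        m.group(1): (m.group(2), int(m.group(3)), m.group(4))
        for m in CONNECTED.finditer(flat)
    }
```

Fed the log text from either node above, this should report one entry for
firewall-scripts-client-0 and one for firewall-scripts-client-1. A missing
entry after a fresh mount would mean that brick never completed its handshake.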