> What linux distro ?
>
> Anything special about your network configuration ?
>
> Any chance your server is taking too long to release networking and
> gluster is starting before network is ready ?
>
> Can you completely disable iptables and test again ?
Both nodes are CentOS 6.5 VMs running on VMware ESXi 5.5.0. There is
nothing special about the network configuration, just static IPs. Ping and
ssh work fine. I added "iptables -F" to /etc/rc.local. After a simultaneous
reboot, "gluster peer status" shows Connected on both nodes and replication
works fine. But "gluster volume status" reports that the NFS server and
self-heal daemon on one of them aren't running, so I need to restart
glusterd to get them running.
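A minimal sketch of how that workaround could be scripted, in case it helps anyone reproduce this. The function name offline_daemons is made up for illustration, and it assumes the columns of "gluster volume status" are separated by whitespace with Online second to last and Pid last, which matches the output on these nodes but is not guaranteed for other versions:

```shell
#!/bin/sh
# offline_daemons (hypothetical helper): reads `gluster volume status`
# output on stdin and prints the name of every "NFS Server" or
# "Self-heal Daemon" row whose Online column is not "Y".
# Assumption: columns are whitespace-separated, ordered
# <name...> <Port> <Online> <Pid>.
offline_daemons() {
    awk '
        /^(NFS Server|Self-heal Daemon) on / && NF >= 4 && $(NF-1) != "Y" {
            # Everything except the last three columns is the process name.
            name = $1
            for (i = 2; i <= NF - 3; i++) name = name " " $i
            print name
        }
    '
}

# Possible use at the end of /etc/rc.local (left commented out on purpose):
# if [ -n "$(gluster volume status | offline_daemons)" ]; then
#     service glusterd restart
# fi
```

This only automates the manual restart; it does not explain why the daemons fail to start after the reboot.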
Another issue: once everything is OK after "service glusterd restart" on
both nodes, I reboot one node and then see the following on the rebooted
node (ipset02):
[root@ipset02 etc]# gluster peer status
Number of Peers: 1
Hostname: ipset01
Uuid: 6313a4dd-f736-46ff-9836-bdf05c886ffd
State: Peer in Cluster (Connected)
[root@ipset02 etc]# gluster volume status
Status of volume: ipset-gv
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick ipset01:/usr/local/etc/ipset              49152   Y       1615
Brick ipset02:/usr/local/etc/ipset              49152   Y       2282
NFS Server on localhost                         2049    Y       2289
Self-heal Daemon on localhost                   N/A     Y       2296
NFS Server on ipset01                           2049    Y       2258
Self-heal Daemon on ipset01                     N/A     Y       2262
There are no active volume tasks
[root@ipset02 etc]# tail -17 /var/log/glusterfs/glustershd.log
[2014-03-26 07:55:48.982456] E
[client-handshake.c:1742:client_query_portmap_cbk] 0-ipset-gv-client-1:
failed to get the port number for remote subvolume. Please run 'gluster
volume status' on server to see if brick process is running.
[2014-03-26 07:55:48.982532] W [socket.c:514:__socket_rwv]
0-ipset-gv-client-1: readv failed (No data available)
[2014-03-26 07:55:48.982555] I [client.c:2097:client_rpc_notify]
0-ipset-gv-client-1: disconnected
[2014-03-26 07:55:48.982572] I [rpc-clnt.c:1676:rpc_clnt_reconfig]
0-ipset-gv-client-0: changing port to 49152 (from 0)
[2014-03-26 07:55:48.982627] W [socket.c:514:__socket_rwv]
0-ipset-gv-client-0: readv failed (No data available)
[2014-03-26 07:55:48.986252] I
[client-handshake.c:1659:select_server_supported_programs]
0-ipset-gv-client-0: Using Program GlusterFS 3.3, Num (1298437), Version
(330)
[2014-03-26 07:55:48.986551] I
[client-handshake.c:1456:client_setvolume_cbk] 0-ipset-gv-client-0:
Connected to 192.168.1.180:49152, attached to remote volume
'/usr/local/etc/ipset'.
[2014-03-26 07:55:48.986566] I
[client-handshake.c:1468:client_setvolume_cbk] 0-ipset-gv-client-0: Server
and Client lk-version numbers are not same, reopening the fds
[2014-03-26 07:55:48.986628] I [afr-common.c:3698:afr_notify]
0-ipset-gv-replicate-0: Subvolume 'ipset-gv-client-0' came back up; going
online.
[2014-03-26 07:55:48.986743] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-ipset-gv-client-0:
Server lk version = 1
[2014-03-26 07:55:52.975670] I [rpc-clnt.c:1676:rpc_clnt_reconfig]
0-ipset-gv-client-1: changing port to 49152 (from 0)
[2014-03-26 07:55:52.975717] W [socket.c:514:__socket_rwv]
0-ipset-gv-client-1: readv failed (No data available)
[2014-03-26 07:55:52.978961] I
[client-handshake.c:1659:select_server_supported_programs]
0-ipset-gv-client-1: Using Program GlusterFS 3.3, Num (1298437), Version
(330)
[2014-03-26 07:55:52.979128] I
[client-handshake.c:1456:client_setvolume_cbk] 0-ipset-gv-client-1:
Connected to 192.168.1.181:49152, attached to remote volume
'/usr/local/etc/ipset'.
[2014-03-26 07:55:52.979143] I
[client-handshake.c:1468:client_setvolume_cbk] 0-ipset-gv-client-1: Server
and Client lk-version numbers are not same, reopening the fds
[2014-03-26 07:55:52.979269] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-ipset-gv-client-1:
Server lk version = 1
[2014-03-26 07:55:52.980284] I
[afr-self-heald.c:1180:afr_dir_exclusive_crawl] 0-ipset-gv-replicate-0:
Another crawl is in progress for ipset-gv-client-1
And on the node that wasn't rebooted:
[root@ipset01 ~]# gluster peer status
Number of Peers: 1
Hostname: ipset02
Uuid: ff14ab0e-53cf-4015-9e49-fb60698c56db
State: Peer in Cluster (Disconnected)
[root@ipset01 ~]# gluster volume status
Status of volume: ipset-gv
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick ipset01:/usr/local/etc/ipset              49152   Y       1615
NFS Server on localhost                         2049    Y       2258
Self-heal Daemon on localhost                   N/A     Y       2262
There are no active volume tasks
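The mismatch above (ipset02 reports its peer as Connected while ipset01 reports Disconnected) could be caught from a script. This is only a rough sketch: the function name disconnected_peers is hypothetical, and it assumes the "Hostname:" and "State:" line layout shown in the "gluster peer status" output here:

```shell
#!/bin/sh
# disconnected_peers (hypothetical helper): reads `gluster peer status`
# output on stdin and prints the hostname of each peer whose State line
# ends in "(Disconnected)".
# Assumption: each peer is listed as a "Hostname:" line followed later
# by its "State:" line, as in the output pasted in this mail.
disconnected_peers() {
    awk '
        /^Hostname:/ { host = $2 }                        # remember current peer
        /^State:/ && /\(Disconnected\)/ { print host }    # report it if down
    '
}

# Possible use (left commented out on purpose):
# gluster peer status | disconnected_peers
```

Running it on both nodes would show whether the two sides disagree about the connection, as they do here.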
[root@ipset01 ~]# tail -3 /var/log/glusterfs/glustershd.log
[2014-03-26 07:50:28.881369] W [socket.c:514:__socket_rwv]
0-ipset-gv-client-1: readv failed (Connection reset by peer)
[2014-03-26