Re: [Gluster-users] [Gluster-devel] In a replica 2 server, file-updates on one server missing on the other server #Personal#

2015-02-20 Thread A Ghoshal
I found out the reason this happens a few days back. Just to let you
know...

It has partly to do with the way we handle reboots on our setup. When we
take down one of our replica servers (for testing/maintenance), we kill off
the glusterfsd processes to ensure that the bricks are unmounted cleanly,
short of stopping the volume and causing service disruption to the mount
clients. Let us assume that serv1 is being rebooted. When we kill off
glusterfsd, one of two things happens:

For file-systems that are normally not accessed: 

1. The ping between the mount client on serv0 and the brick's glusterfsd on
serv1 times out. In our system, this ping timeout is configured at 10
seconds (see the sketch after this list).
2. At this point, the mount client on serv0 destroys the now-defunct TCP
connection and queries the remote glusterd process for the port of the
remote brick.

3. But since serv1 is already down by this time, no response arrives, and
the local mount client retries the query until serv1 is up once more. At
that point the glusterd on serv1 responds with the newly allocated port
number for the brick, and a new connection is established.
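For reference, a ping timeout like this is normally the network.ping-timeout
volume option; a minimal sketch of how it would be set (the volume name
testvol is only a placeholder):

gluster volume set testvol network.ping-timeout 10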

For frequently accessed file-systems: 

1. One of the file operations (read/write) times out instead. This happens
much sooner than the 10-second ping timeout, and it results in the
connection being destroyed and the mount client on serv0 querying the
remote glusterd for the remote brick's port number.

2. Because this happens so quickly, glusterd on serv1 is not yet down and
is also unaware that the local brick is no longer alive. So it returns the
port number of the dead process.

3. Since the port query succeeded, the mount client on serv0 does not
attempt another one; instead it tries to connect to the stale port number
ad infinitum (one way to confirm this is sketched below).
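One rough way to confirm the mismatch (the mount-client PID placeholder and
the volume name testvol are illustrative) is to watch which port the client
keeps dialing and compare it with the brick port glusterd reports once serv1
is back up:

strace -f -e trace=connect -p <pid-of-glusterfs-mount-process>
gluster volume status testvol    # run once serv1 is back; shows the new port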

Our solution to this problem is simple: before we kill glusterfsd and
unmount the bricks, we stop glusterd:

/etc/init.d/glusterd stop

This ensures that the portmap queries from the mount client on serv0 are
never answered with a stale port; the client simply keeps retrying the query
until serv1 is back up, as in the first case.
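Put together, the order we now follow on serv1 before taking it down is
roughly this (the pkill invocation and the brick path are only illustrative
of what we actually run):

/etc/init.d/glusterd stop    # stop glusterd first, so port queries go unanswered
pkill glusterfsd             # then kill the brick process(es)
umount /bricks/brick1        # and unmount the brick file system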

Thanks,
Anirban




Re: [Gluster-users] [Gluster-devel] In a replica 2 server, file-updates on one server missing on the other server #Personal#

2015-02-04 Thread A Ghoshal
CC gluster-users.

No, there aren't any firewall rules on our servers. As I wrote in one of my
earlier emails, if I kill the mount client and remount the volume, the
problem disappears. That is to say, remounting causes the client to refresh
the remote port data, and from there everything's fine. Also, we don't use
gfapi, and bind() always succeeds.
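For what it's worth, the kill-and-remount workaround amounts to roughly the
following on the client (the volume name and mount point are placeholders):

umount /mnt/testvol                               # takes the mount client down
mount -t glusterfs serv0:/testvol /mnt/testvol    # fresh client, fresh port query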



From:   Ben England 
To: A Ghoshal 
Date:   02/05/2015 04:40 AM
Subject:Re: [Gluster-devel] [Gluster-users] In a replica 2 server, 
file-updates on one server missing on the other server #Personal#



Could it be a problem with iptables blocking connections?  Do iptables
--list and make sure the Gluster ports are allowed through at both ends.
Also, if you are using libgfapi, be sure you enable rpc-auth-allow-insecure
if you have a lot of gfapi instances, or else you'll run into problems.
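Something along these lines, for example (testvol is a placeholder, and the
glusterd.vol path may differ on your build):

iptables --list -n        # 24007 and the 49152+ brick ports should be open
gluster volume set testvol server.allow-insecure on
# plus, in /etc/glusterfs/glusterd.vol:
#   option rpc-auth-allow-insecure on
# and restart glusterd afterwards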

- Original Message -
> From: "A Ghoshal" 
> To: "Ben England" 
> Sent: Wednesday, February 4, 2015 6:07:10 PM
> Subject: Re: [Gluster-devel] [Gluster-users] In a replica 2 server,
> file-updates on one server missing on the other server #Personal#
> 
> Thanks, Ben, same here :/ I can actually get the port number for glusterfsd
> in any of three ways (examples below):
> 
> 1. gluster volume status
> 2. the command line of the glusterfsd process on the target server.
> 3. if you're really paranoid, get the glusterfsd PID and use netstat.
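> Concretely, something like this (testvol and the glusterfsd PID are just
> placeholders):
> 
> gluster volume status testvol            # 1. the port as reported by glusterd
> ps -ef | grep glusterfsd                 # 2. the port usually appears among the brick's arguments
> netstat -tlnp | grep <glusterfsd-pid>    # 3. what the process actually listens on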
> 
> Looking at the code, it seems to me that the whole thing operates on a
> statd-notify paradigm. Your local mount client registers for notification
> on all remote glusterfsd's. When a remote brick goes down and comes back
> up, the client is notified and then calls portmap to obtain the remote
> glusterfsd port.
> 
> I see here that both glusterds are up. But somehow the port number of the
> remote glusterfsd held by the mount client is now stale; I'm not sure how
> that happens. Now, the client keeps trying to connect on the stale port
> every 3 seconds. It gets a return errno of -111 (-ECONNREFUSED), which
> clearly indicates that there is no listener at this port on the remote
> host's IP.
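> A quick way to reproduce that refusal by hand, assuming nc is available
> (the address and stale port are the ones from the strace further down):
> 
> nc -zv 192.168.24.81 49175    # refused for as long as the client's port info is stale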
> 
> Design-wise, could it indicate to the mount client that the port number
> information needs to be refreshed? Would you say this is a bug of sorts?
> 
> 
> 
> 
> From:   Ben England 
> To: A Ghoshal 
> Date:   02/05/2015 03:59 AM
> Subject:Re: [Gluster-devel] [Gluster-users] In a replica 2 server,
> file-updates on one server missing on the other server #Personal#
> 
> 
> 
> I thought Gluster was based on ONC RPC, which means there are no fixed
> port numbers except for glusterd (24007).  The client connects to
> glusterd, reads the volfile, and gets the port numbers of the registered
> glusterfsd processes at that time, then it connects to glusterfsd.  Make
> sense?  What you need to know is whether glusterfsd is running or not, and
> whether glusterd is finding out the current state of glusterfsd.
> /var/log/glusterfs/bricks/*log has a log file for each glusterfsd process;
> you might be able to see from that what the glusterfsd port number is.
> /var/log/glusterfs/etc*log is glusterd's log file; it might say whether
> glusterd knows about glusterfsd.  I'm not as good at troubleshooting as
> some of the other people are, so don't take my word for it.
> -ben
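> For example, to glance at those logs (exact file names vary by brick and
> build; the glusterd log is typically the etc-*glusterd.vol.log file):
> 
> tail -n 50 /var/log/glusterfs/bricks/*.log
> tail -n 50 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log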
> 
> 
> - Original Message -
> > From: "A Ghoshal" 
> > To: gluster-de...@gluster.org
> > Cc: gluster-users@gluster.org, gluster-users-boun...@gluster.org
> > Sent: Wednesday, February 4, 2015 4:36:02 PM
> > Subject: Re: [Gluster-devel] [Gluster-users] In a replica 2 server,
> > file-updates on one server missing on the other server #Personal#
> > 
> > Sorry for spamming you guys, but this is kind of important for me to
> > debug, so if you have seen anything like this before, do let me know.
> > Here's an update:
> > 
> > It seems the mount client is attempting a connection to an invalid port
> > number: 49175 is NOT the port number of glusterfsd on serv1
> > (192.168.24.81).
> > 
> > I got me an strace:
> > 
> > [pid 31026] open("/proc/sys/net/ipv4/ip_local_reserved_ports", O_RDONLY) = -1 ENOENT (No such file or directory)
> > [pid 31026] write(4, "[2015-02-04 20:39:02.793154] W ["..., 215) = 215
> > [pid 31026] write(4, "[2015-02-04 20:39:02.793289] W ["..., 194) = 194
> > [pid 31026] bind(10, {sa_family=AF_INET, sin_port=htons(1023), sin_addr=inet_addr("192.168.24.80")}, 16) = 0
> > [pid 31026] fcntl(10, F_GETFL) = 0x2 (flags O_RDWR)
> > [pid 31026] fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> > [pid 31026] connect(10, {sa_family=AF_INET, sin_port=htons(49175), sin_addr=inet_addr("192.168.24.81")}, 16) = -1 EINPROGRESS (Operation now in progress)
> > [pid 31026] fcntl(10, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
> > [pid 31026] fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> > [pid 31026] epoll_ctl(3, EPOLL_CTL_ADD, 10, {EPOLLIN|EPOLLPRI|EPOLLOUT, {u32=10, u64=8589934602}}) = 0
> > [pid 31026] nanosleep({1, 0}, 
> > [pid 31021] <... epoll_wait resumed> {{EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP,
> >