Hi,

In a 2-node nfs-ganesha cluster setup, we have noticed that after a couple of 
iterations of failover and failback of the virtual IP between the nodes, client 
I/O gets stuck. We have observed this in RHEL 7.1 environments (not sure about 
RHEL 6). While debugging, I see that the node which takes over the virtual IP 
(after a couple of iterations) doesn't respond to (acknowledge) the client's 
TCP SYN packet. 

I found a couple of discussions about this in a few forums and tried tuning 
certain TCP parameters (tcp_timestamps, tcp_window_scaling) as suggested 
there, but that did not help. The workarounds we are currently left with (to 
resume the I/O) are to either 
* restart the nfs-ganesha service on the node which has taken over the IP, to 
clear the existing established TCP connections, or 
* fail back the IP by bringing the original node back online.
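
For anyone trying to reproduce this, here is a rough sketch of how the 
leftover ESTABLISHED connections on the takeover node could be listed before 
restarting the service. It only parses text in the /proc/net/tcp format. The 
port 2049 (the usual NFS port), the address 10.0.0.100 used in the usage 
comment, and the function names are assumptions for illustration, not taken 
from our setup, and the address decoding assumes a little-endian host such as 
x86_64.

```python
import socket
import struct

TCP_ESTABLISHED = "01"  # state code for ESTABLISHED in /proc/net/tcp


def decode_addr(hex_addr):
    """Convert 'AABBCCDD:PPPP' from /proc/net/tcp into (ip, port).

    Assumes a little-endian host, where the IPv4 address is printed
    as a byte-swapped hex word.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def established_on(proc_text, local_ip, local_port):
    """Return (ip, port) of peers with an ESTABLISHED connection to
    local_ip:local_port, given the text of /proc/net/tcp."""
    peers = []
    for line in proc_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state != TCP_ESTABLISHED:
            continue
        if decode_addr(local) == (local_ip, local_port):
            peers.append(decode_addr(remote))
    return peers


# Hypothetical usage on the node that took over the virtual IP:
#   with open("/proc/net/tcp") as f:
#       print(established_on(f.read(), "10.0.0.100", 2049))
```

If a client that is stuck sending SYNs shows up in this list, the node is 
still holding a connection for it from an earlier failover iteration.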

Any ideas on what could be the reason for a TCP ACK not being sent in response 
to a TCP SYN packet arriving on an existing connection in the ESTABLISHED 
state? Any pointers on how to fix that?

Thanks,
Soumya

------------------------------------------------------------------------------
_______________________________________________
Nfs-ganesha-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
