On 07-02-2011 17:58, Dimitri Maziuk wrote:
> Dave Dykstra wrote:
>>  From the old linux-ha.org/HaNFS page, Hint #2:
>>      If your kernel defaults to using TCP for NFS (as is the case in 2.6
>>      kernels), switch to UDP instead by using the 'udp' mount option. If
>>      you don't do this, you won't be able to quickly switch from server
>>      "A" to "B" and back to "A" because "A" will hold the TCP connection
>>      in TIME_WAIT state for 15-20 minutes and refuse to reconnect.
> This is when flipping back to "A" right away. If you don't do that, tcp
> is fine.
>
> I have NFS v3 mounts on heartbeat R1 (2.1.4) here with
> noatime,vers=3,rsize=32768,wsize=32768,hard,proto=tcp,timeo=600,retrans=2
> (all defaults) and they don't take a minute to failover. (Initial
> boot-up is another story.)
Even in the fail-back case it should be possible to mitigate that effect.
With NFSv3 our experience has been very good (fail-over closer to 20s)
with both TCP and UDP.
With NFSv4, even after adjusting the lease/grace times, the client can't
write again in less than a minute or two. :(
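
For anyone comparing setups, here is a minimal sketch for reading the
NFSv4 lease and grace timers on the server. It assumes a Linux knfsd
server whose kernel exposes them as nfsv4leasetime and nfsv4gracetime
under /proc/fs/nfsd (older kernels may only have the first one); the
helper names are made up for illustration.

```python
from pathlib import Path

# Assumption: knfsd exposes its NFSv4 timers (in seconds) in this directory.
NFSD_PROC = Path("/proc/fs/nfsd")


def read_seconds(path: Path) -> int:
    """Read a single integer number of seconds from a proc-style file."""
    return int(path.read_text().strip())


def nfsv4_timers(base: Path = NFSD_PROC) -> dict:
    """Return the lease/grace timers that exist under `base`.

    Files that are missing (e.g. on kernels without nfsv4gracetime)
    are simply skipped rather than treated as an error.
    """
    timers = {}
    for name in ("nfsv4leasetime", "nfsv4gracetime"):
        candidate = base / name
        if candidate.exists():
            timers[name] = read_seconds(candidate)
    return timers


if __name__ == "__main__":
    for name, seconds in nfsv4_timers().items():
        print(f"{name}: {seconds}s")
```

Clients cannot reclaim state or write until the grace period ends, so
these two numbers are worth including when reporting fail-over times.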
> So it's either NFS v4 or crm (resource agents?) or both.
We are inclined to blame NFS v4, but the customer is adamant on this point.
If anyone has had good experiences with NFSv4 fail-over, we would
*really* appreciate it if you could share the setup.
> (Ballpark figure for timeouts at various levels of the network stack is
> 45 sec -- or used to be back when I did my networking 101 -- and if you
> want to lower them you better know what you're doing.)
Well, there are some things that really help here, like gratuitous ARP :)
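
To illustrate what a gratuitous ARP looks like on the wire, here is a
rough sketch that builds one by hand: an ARP request, broadcast at the
Ethernet level, in which the sender and target IP are both the service
address, so neighbours refresh their ARP caches with the new MAC. The
MAC, IP and interface name below are placeholders, and send_frame() is
an assumption-laden helper (Linux only, needs root); the cluster
resource agents normally take care of this for you.

```python
import socket
import struct

ETH_P_ARP = 0x0806   # EtherType for ARP
ETH_P_IP = 0x0800    # protocol type carried in the ARP payload (IPv4)


def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build a 42-byte Ethernet frame holding a gratuitous ARP request."""
    if len(mac) != 6:
        raise ValueError("MAC address must be exactly 6 bytes")
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination (broadcast), source, EtherType.
    eth_header = broadcast + mac + struct.pack("!H", ETH_P_ARP)

    # ARP header: hardware type 1 (Ethernet), protocol IPv4,
    # address lengths 6 and 4, opcode 1 (request).
    arp_header = struct.pack("!HHBBH", 1, ETH_P_IP, 6, 4, 1)

    # Sender MAC/IP, then target MAC (zeroed) and target IP.
    # Sender IP == target IP is what makes the ARP "gratuitous".
    arp_body = mac + ip_bytes + b"\x00" * 6 + ip_bytes
    return eth_header + arp_header + arp_body


def send_frame(interface: str, frame: bytes) -> None:
    """Send a raw Ethernet frame (Linux AF_PACKET socket, requires root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((interface, 0))
        sock.send(frame)


if __name__ == "__main__":
    # Placeholder values: a locally administered MAC and a documentation IP.
    frame = build_gratuitous_arp(bytes.fromhex("020000000001"), "192.0.2.10")
    print(frame.hex())
    # send_frame("eth0", frame)  # uncomment on the node taking over the IP
```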

-- 
ServiSMART                                      Ricardo Sousa
servimos o seu negócio                          tel: +351 96 298 0989

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
