Re: [Linux-HA] NFSv4 with Heartbeat and DRBD

2011-02-07 Thread Ricardo Botelho de Sousa

On 07-02-2011 17:58, Dimitri Maziuk wrote:
> Dave Dykstra wrote:
>>  From the old linux-ha.org/HaNFS page, Hint #2:
>>  If your kernel defaults to using TCP for NFS (as is the case in 2.6
>>  kernels), switch to UDP instead by using the 'udp' mount option. If
>>  you don't do this, you won't be able to quickly switch from server
>>  "A" to "B" and back to "A" because "A" will hold the TCP connection
>>  in TIME_WAIT state for 15-20 minutes and refuse to reconnect.
> This is when flipping back to "A" right away. If you don't do that, tcp
> is fine.
>
> I have NFS v3 mounts on heartbeat R1 (2.1.4) here with
> noatime,vers=3,rsize=32768,wsize=32768,hard,proto=tcp,timeo=600,retrans=2
> (all defaults) and they don't take a minute to failover. (Initial
> boot-up is another story.)
Even in the fail-back case it should be possible to mitigate that effect.
With NFSv3 our experience has been very good (fail-over takes closer to
20 s) with both TCP and UDP.
With NFSv4, even after tuning the lease/grace times, the client can't
write for a minute or two after fail-over. :(
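
(For anyone comparing notes: a minimal sketch for checking which lease and
grace times the server is actually running with. It assumes a Linux knfsd
server that exposes nfsv4leasetime and nfsv4gracetime under /proc/fs/nfsd;
the grace-time file may be missing on older kernels. The function name and
the `base` parameter are made up for illustration.)

```python
import os

def read_nfsd_timers(base="/proc/fs/nfsd"):
    """Return the NFSv4 lease/grace times (in seconds) the server exposes."""
    timers = {}
    for name in ("nfsv4leasetime", "nfsv4gracetime"):
        path = os.path.join(base, name)
        try:
            with open(path) as fh:
                timers[name] = int(fh.read().strip())
        except (OSError, ValueError):
            # File missing (older kernel, nfsd filesystem not mounted)
            # or its content is not a plain integer.
            timers[name] = None
    return timers

if __name__ == "__main__":
    for name, value in read_nfsd_timers().items():
        print(name, value if value is not None else "not available")
```

Comparing the reported values with what was configured shows whether the
tuning took effect at all. As far as I know, the values have to be written
before nfsd is started, so a change made on a running server may not apply.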
> So it's either NFS v4 or crm (resource agents?) or both.
We are inclined to blame NFSv4, but the customer is adamant about using it.
If anyone has had good experiences with NFSv4 fail-over, we would
*really* appreciate it if you shared your setup.
> (Ballpark figure for timeouts at various levels of the network stack is
> 45 sec -- or used to be back when I did my networking 101 -- and if you
> want to lower them you better know what you're doing.)
Well, there are some things that really help here, like gratuitous ARP :)
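
(To make that concrete, here is a rough sketch of what a gratuitous ARP
request looks like on the wire: the node that takes over the service IP
broadcasts an ARP in which sender and target IP are both that address, so
neighbours refresh their ARP caches. The MAC and IP below are invented, and
the function name is hypothetical. Actually sending the frame needs a raw
socket and root privileges, which is not shown.)

```python
import socket
import struct

def gratuitous_arp_frame(mac, ip):
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    broadcast = b"\xff" * 6
    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    eth = broadcast + hw + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), lengths 6 and 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += hw + addr            # sender MAC and sender IP
    arp += b"\x00" * 6 + addr   # target MAC left zero, target IP == sender IP
    return eth + arp

# Example values only, not taken from any real setup.
frame = gratuitous_arp_frame("00:16:3e:12:34:56", "192.168.0.10")
print(len(frame))  # 14-byte Ethernet header + 28-byte ARP payload = 42
```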

-- 
ServiSMART  Ricardo Sousa
servimos o seu negócio  tel: +351 96 298 0989

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] NFSv4 with Heartbeat and DRBD

2011-02-07 Thread Dimitri Maziuk

Dave Dykstra wrote:
> From the old linux-ha.org/HaNFS page, Hint #2:
> If your kernel defaults to using TCP for NFS (as is the case in 2.6
> kernels), switch to UDP instead by using the 'udp' mount option. If
> you don't do this, you won't be able to quickly switch from server
> "A" to "B" and back to "A" because "A" will hold the TCP connection
> in TIME_WAIT state for 15-20 minutes and refuse to reconnect.

This is when flipping back to "A" right away. If you don't do that, tcp 
is fine.

I have NFS v3 mounts on heartbeat R1 (2.1.4) here with
noatime,vers=3,rsize=32768,wsize=32768,hard,proto=tcp,timeo=600,retrans=2
(all defaults) and they don't take a minute to failover. (Initial 
boot-up is another story.)
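
(A small sketch for reading the option string above. Per nfs(5), timeo is
given in tenths of a second, so timeo=600 means the client waits 60 seconds
for a reply before it retries a request. The helper names are made up for
illustration.)

```python
def parse_nfs_options(opts):
    """Split a comma-separated NFS mount option string into a dict."""
    parsed = {}
    for item in opts.split(","):
        key, sep, value = item.partition("=")
        # Flags such as 'hard' or 'noatime' carry no value.
        parsed[key] = value if sep else True
    return parsed

def timeo_seconds(parsed):
    """timeo is expressed in tenths of a second; return it in seconds."""
    return int(parsed["timeo"]) / 10

opts = "noatime,vers=3,rsize=32768,wsize=32768,hard,proto=tcp,timeo=600,retrans=2"
parsed = parse_nfs_options(opts)
print(parsed["proto"], timeo_seconds(parsed), parsed["retrans"])  # tcp 60.0 2
```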

So it's either NFS v4 or crm (resource agents?) or both.

>>   We have implemented a solution based around heartbeat v3 and DRBD. 
>> While everything seems to work very well we have some difficulty with 
>> regard to the time it takes for the NFS service to become fully available.

(Ballpark figure for timeouts at various levels of the network stack is 
45 sec -- or used to be back when I did my networking 101 -- and if you 
want to lower them you better know what you're doing.)

Dima
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu

Re: [Linux-HA] NFSv4 with Heartbeat and DRBD

2011-02-06 Thread Dave Dykstra

From the old linux-ha.org/HaNFS page, Hint #2:
If your kernel defaults to using TCP for NFS (as is the case in 2.6
kernels), switch to UDP instead by using the 'udp' mount option. If
you don't do this, you won't be able to quickly switch from server
"A" to "B" and back to "A" because "A" will hold the TCP connection
in TIME_WAIT state for 15-20 minutes and refuse to reconnect.

Are you mounting with TCP?  A minute sounds short.

- Dave

On Fri, Feb 04, 2011 at 10:45:22AM +, Ricardo Botelho de Sousa wrote:
> Hello All!
> 
>   We have implemented a solution based around heartbeat v3 and DRBD. 
> While everything seems to work very well we have some difficulty with 
> regard to the time it takes for the NFS service to become fully available.
> 
>   How long is a graceful fail-over with NFSv4 expected to take? 
> We tried reducing the grace/lease times, to no avail. We don't seem to be 
> able to get it below about a minute. Using TCP or UDP doesn't seem to make 
> any difference.
> 
>   I don't believe this is related to Heartbeat, but perhaps your collective 
> experience can offer an explanation or some mysterious parameter.
> 
> Best regards,
> 
> -- 
> ServiSMART  Ricardo Sousa
> servimos o seu negócio  tel: +351 96 298 0989
