Re: [Gluster-users] Problem With AFR

Keith Freedman Sat, 06 Mar 2010 14:48:42 -0800

At 01:14 PM 3/6/2010, Chad wrote:

I don't disagree that other network file systems may have issues.
It does not change the fact that while 5 seconds may be "remarkable", that does not make it any more acceptable when my services die and my clients complain. The point of high availability is that any single point of failure does not take down your services.

The point is that it doesn't take down your services. If you have good quality equipment you should only experience this 5 second delay on the rarest of occasions, and only if there is something terribly wrong.

Perhaps it's a perception problem. The goal of HA isn't necessarily to prevent any notice of any problems in the infrastructure, but instead to ensure that your services stay available. Often this means in degraded mode. If a 5 second delay is unacceptable, then there's unlikely to be anything that will suit your needs.

If they made the delay shorter, you'd be at risk of mirrors breaking when there is simply normal network latency, which would make the situation worse since resyncs would happen much more frequently.

Your users might not see a 5 second delay, but instead they'd have a system which feels much slower since it's constantly doing all this unnecessary extra work.




Keith Freedman wrote:
At 08:13 AM 3/6/2010, Chad wrote:
I second this question/request.
When the 1st server goes down, how do we eliminate the hang time? 5 seconds is a long time for a file system to be hung.
It is a long time, but if you think about other HA filesystems, this is one of the shortest I've seen. Hardware NAS devices, when there's a node failure, will often take 30 seconds to 2 minutes to fully recover. In light of the alternatives, 5 seconds is remarkable.
If you're getting these delays on a regular basis then something is wrong, but if it's something that just happens in the face of a failure, then it should be relatively rare. If it happens all the time, then you really need to figure out why your systems are failing and resolve that problem.
Just my .02



Richard de Vries wrote:
Hello Eduardo,
We had the same problem over here, with two nodes that are both server
and client.
You can try to lower the ping-timeout in the client volume file:
option ping-timeout 5
5 seconds is sadly the lowest possible ping-timeout, so our
applications on the main node can hang for about 5 seconds if the
standby node fails (although we have a dedicated interconnect).
Maybe the gluster developers have a better solution to this.
Regards,
Richard
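
For reference, a minimal sketch of where the option Richard mentions
would sit in a client volume file. The volume name (remote1), host
(server1) and remote subvolume (brick) are placeholders and do not come
from this thread; the layout assumes the hand-written volfile syntax of
GlusterFS 3.0.x, with the option set on each protocol/client volume:

```
# Hypothetical client-side volume; names and host are placeholders.
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1       # placeholder server address
  option remote-subvolume brick    # placeholder exported volume name
  option ping-timeout 5            # seconds without a reply before the server is treated as down
end-volume
```

The same option would need to be repeated on the protocol/client volume
for each AFR subvolume. The 40 to 50 second hang in the original report
below is roughly what one would expect if ping-timeout had been left at
its default, which was likely around 42 seconds in this release series;
treat that figure as an assumption and check the documentation for your
version.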
Hi, I'm using GlusterFS v3.0.2 on Fedora 12.

I have configured AFR with 3 nodes and mounted the volume on the client. It
works fine, but when a node fails, the file system on the client locks and I
can't execute any operations for about 40 to 50 seconds.
After 40 to 50 seconds the file system on the client starts to work again.
How can I resolve this problem? The file system can't be
inaccessible for such a long time.
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
