Hello all,

I just gave a talk at SCaLE 14x today in which I mentioned our new lock
revocation feature, which has had a significant impact on the reliability of
our GlusterFS clusters.  I wanted to share the patch with the community, so
here's the Bugzilla report:

https://bugzilla.redhat.com/show_bug.cgi?id=1301401

=====
Summary:
Misbehaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster instability
and eventual complete unavailability by failing to release entry/inode locks
in a timely manner.

Classic symptoms of this are increased brick (and/or gNFSd) memory usage due
to the high number of (lock request) frames piling up in the processes.  In
this failure mode, bricks eventually slow down to a crawl due to swapping, or
OOM due to complete memory exhaustion; during this period the entire cluster
can begin to fail.  End users will experience this as hangs on the filesystem,
first in a specific region of the filesystem and ultimately across the entire
filesystem, as the offending brick turns into a zombie (i.e. not quite dead,
but not quite alive either).

Currently, these situations must be handled by an administrator detecting & 
intervening via the "clear-locks" CLI command.  Unfortunately this doesn't 
scale for large numbers of clusters, and it depends on the correct (external) 
detection of the locks piling up (for which there is little signal other than 
state dumps).

This patch introduces two features to remedy this situation:

1. Monkey-unlocking - This is a feature targeted at developers (only!) to help
track down crashes due to stale locks, and to prove the utility of the lock
revocation feature.  It does this by silently dropping 1% of unlock requests,
simulating bugs or misbehaving clients.

The feature is activated via:
features.locks-monkey-unlocking <on/off>

You'll see the message
"[<timestamp>] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY LOCKING 
(forcing stuck lock)!" ... in the logs indicating a request has been dropped.
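
To make the effect concrete, here is a toy sketch of the idea in Python.  It is
not the xlator code: the class and function names are hypothetical, and the
only detail taken from the description above is the 1% drop rate.

```python
import random

DROP_PROBABILITY = 0.01  # the 1% figure from the description above


class MonkeyLockTable:
    """Toy lock table; NOT the real posix-locks xlator logic."""

    def __init__(self, monkey_unlocking=False, rng=None):
        self.monkey_unlocking = monkey_unlocking
        self.rng = rng if rng is not None else random.Random()
        self.held = set()   # lock ids currently held
        self.dropped = 0    # unlock requests silently ignored

    def lock(self, lock_id):
        self.held.add(lock_id)

    def unlock(self, lock_id):
        # With monkey-unlocking on, act as if the unlock never arrived and
        # leave the lock in place, imitating a buggy or misbehaving client.
        if self.monkey_unlocking and self.rng.random() < DROP_PROBABILITY:
            self.dropped += 1
            return
        self.held.discard(lock_id)


def simulate(n_requests, monkey_unlocking=True, seed=0):
    """Take and release n_requests locks; return the resulting table."""
    table = MonkeyLockTable(monkey_unlocking, random.Random(seed))
    for lock_id in range(n_requests):
        table.lock(lock_id)
        table.unlock(lock_id)
    return table
```

Every dropped unlock leaves one stale lock behind, so after a run
len(table.held) equals table.dropped.  Those leftovers are the stuck locks the
revocation feature is meant to clean up.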

2. Lock revocation - Once enabled, this feature will revoke a *contended* lock
(i.e. if nobody else asks for the lock, we will not revoke it) based on the
amount of time the lock has been held, how many other lock requests are waiting
on the lock to be freed, or some combination of both.  Clients which lose
their locks are notified by receiving EAGAIN (sent back to their callback
function).

The feature is activated via these options:
features.locks-revocation-secs <integer; 0 to disable>
features.locks-revocation-clear-all <on/off>
features.locks-revocation-max-blocked <integer>
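
The revocation rule described above can be sketched as a small decision
function.  This is a minimal illustration, not the patch's implementation: the
names are hypothetical, the use of >= for both thresholds is an assumption, and
treating a max-blocked value of 0 as "disabled" is also an assumption (the
description only says so for the seconds option).  The clear-all option is
left out because its behaviour isn't described here.

```python
from dataclasses import dataclass


@dataclass
class RevocationPolicy:
    revocation_secs: int = 0  # 0 disables the time-based check
    max_blocked: int = 0      # ASSUMPTION: 0 disables the count-based check


def should_revoke(held_for_secs, blocked_requests, policy):
    """Return True if a held lock should be revoked under this policy."""
    # An uncontended lock is never revoked, however long it has been held.
    if blocked_requests == 0:
        return False
    too_old = (policy.revocation_secs > 0
               and held_for_secs >= policy.revocation_secs)
    too_contended = (policy.max_blocked > 0
                     and blocked_requests >= policy.max_blocked)
    return too_old or too_contended
```

For example, with RevocationPolicy(1800, 500), a lock held for an hour with no
waiters is kept, while a lock held for 1800 seconds with a single waiter is
revoked.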

Recommended settings: 1800 seconds for the time-based timeout (give clients
the benefit of the doubt).  Choosing a max-blocked value requires some
experimentation depending on your workload, but values in the hundreds to low
thousands are generally reasonable (it's normal for many tens of locks to be
taken out when files are being written at high throughput).
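
As a rough sketch of applying such settings, the helper below builds the
corresponding "gluster volume set <VOLNAME> <KEY> <VALUE>" command lines
without running them.  The helper name, the volume name "myvol" and the
max-blocked value of 500 are illustrative assumptions, not recommendations from
the patch.

```python
def volume_set_commands(volname, settings):
    """Return one `gluster volume set` argument list per option."""
    return [
        ["gluster", "volume", "set", volname, key, str(value)]
        for key, value in settings.items()
    ]


settings = {
    "features.locks-revocation-secs": 1800,
    # ASSUMPTION: 500 is only a placeholder in the "hundreds" range.
    "features.locks-revocation-max-blocked": 500,
}

for argv in volume_set_commands("myvol", settings):
    print(" ".join(argv))
```

Each argument list can be handed to subprocess.run() on a node with the
gluster CLI installed once the values have been tuned for the workload.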

=====

The supplied patch applies cleanly to the v3.7.6 release tag, and probably to
any 3.7.x release & master (the posix locks xlator is rarely touched).

Richard



_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
