I'm exploring the possibility of using CTDB to synchronize mandatory 
file lock states across a 9-server Samba cluster that re-exports a set 
of about 20 (large-ish) NFS v3 mount points.  Yes, I know the Samba team 
strongly discourages re-exporting anything, including NFS mounts, but 
they also keep republishing old NFS information from old Samba guides in 
the current Samba 3.5.0 guides.  But I digress.

I ran the ping_pong.c test against NFS and got much lower file locking 
performance than I expected.  Specifically, I got almost 4000 locks/sec 
from a single client over gigabit Ethernet.  That seems very slow 
compared to 1.4M locks/sec on a local filesystem, and 400k locks/sec 
when the mount uses the client-side-only nolock option.

When I add a second ping_pong locking client, I get between 0 and 40 
locks/sec at best, and usually both clients hang indefinitely (or at 
least until one of them is killed).
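For anyone who hasn't run it, the core of what ping_pong measures is just a tight POSIX byte-range lock/unlock loop.  A rough Python equivalent is sketched below -- this is not the real ctdb ping_pong.c (which ping-pongs locks between processes across a byte ring), just a minimal single-client illustration; the file path and iteration count are arbitrary:

```python
"""Minimal sketch of a ping_pong-style fcntl lock-rate measurement.

Take and release a one-byte POSIX write lock in a loop and report the
achieved locks/sec.  Run it against a file on the NFS mount to exercise
lockd; run it on a local filesystem for a baseline.
"""
import fcntl
import os
import tempfile
import time


def lock_rate(path, iterations=10000):
    """Lock and unlock byte 0 of `path` `iterations` times;
    return the achieved lock operations per second."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        start = time.monotonic()
        for _ in range(iterations):
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0)  # write-lock byte 0
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0)  # release it
        elapsed = time.monotonic() - start
        return iterations / elapsed
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Demo against a local temp file; point this at an NFS path to
    # measure lockd instead of the local lock manager.
    with tempfile.NamedTemporaryFile() as f:
        print(f"{lock_rate(f.name):.0f} locks/sec")
```

On NFS v3 every one of those fcntl calls becomes an NLM round trip to lockd, which is why the numbers drop so far below the local-filesystem case.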

Before anyone suggests it, yes, I know I can get better Samba 
throughput, latency, and locking performance by putting Samba on the 
file server itself; but I don't, for the following reasons:

- I would rather reboot Samba separately when it decides to blow chunks 
in memory,
- I can handle a 30-second reconnect timeout for a few users much more 
easily than a full 15-minute fsck repair for a lot of users,
- Samba updates are easier to test and less disruptive this way,
- Samba can't handle nearly as many active, concurrent users as NFS can,
- Getting the resulting active-users-per-server ratio down is easily 
accomplished by adding more Samba servers to my Samba cluster (as 
opposed to splitting off yet another file server and disk space just 
because one server can't manage all the active, chatty SMB sessions alone),
- The existing throughput is 4 times faster than what most of the 
Windows clients have available, and about 40 times faster than any of 
them actually use at peak demand.  Increasing that to 5 times faster 
isn't likely to help them, since Samba appears to be a processor- and 
memory-bound application.
- This way, I don't have to run DFS to get a unified file hierarchy.  I 
still suffer from last-write-wins problems, as DFS does with NTFRS/DFS-R 
(and which CTDB might resolve anyway), but I can still scale out without 
replication delays and issues.

I think improving the back-end NFS locking performance will have the 
greatest overall benefit for the Windows clients.  So, has anyone had to 
improve NFS's lockd performance?  Is it possible, or is lockd a major 
driving reason behind clustered filesystems and/or parallel-NFS development?

And does anyone have any experience with CTDB, good or bad?

Grazie,
Daniel Fussell
--------------------
BYU Unix Users Group 
http://uug.byu.edu/ 

The opinions expressed in this message are the responsibility of their
author.  They are not endorsed by BYU, the BYU CS Department or BYU-UUG. 
___________________________________________________________________
List Info (unsubscribe here): http://uug.byu.edu/mailman/listinfo/uug-list
