Re: [Imap-uw] NFSv4, anyone?

Mark Crispin Wed, 08 Nov 2006 11:27:35 -0800

Hi Ken -

Some things in your message need clarification:

The reason why there aren't any locking issues with your traditional UNIX format mailboxes via NFS is that there is NO locking! The ONLY protection against mailbox corruption over NFS is the .lock file, and that's only if you set the spool directory 1777 or install the mlock tool.
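
To illustrate the kind of protection a .lock file gives, here is a minimal sketch of the well-known "create a unique file, then link() it to the lock name" technique. It is not taken from the c-client source; the function name, the retry parameters, and the naming of the temporary file are all assumptions made for the example. It also shows why the spool directory must be writable: the lock is a file created next to the mailbox.

```python
import os
import socket
import time


def acquire_dotlock(mailbox, retries=5, delay=1.0):
    """Try to create mailbox + ".lock"; return its path, or None on failure.

    Hypothetical helper for illustration only.
    """
    lock_path = mailbox + ".lock"
    # A name unique to this host and process, in the same directory.
    unique = "%s.%d.%s" % (lock_path, os.getpid(), socket.gethostname())
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    os.close(fd)
    try:
        for attempt in range(retries):
            try:
                os.link(unique, lock_path)
            except OSError:
                pass  # the link count below decides, not the return value
            # Two links means lock_path now names our unique file.
            if os.stat(unique).st_nlink == 2:
                return lock_path
            if attempt + 1 < retries:
                time.sleep(delay)
        return None
    finally:
        os.unlink(unique)


def release_dotlock(lock_path):
    """Drop the lock by removing the .lock file."""
    os.unlink(lock_path)
```

Checking the link count instead of trusting the result of link() is the usual precaution for NFS, where a reply can be lost even though the operation succeeded on the server.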


Yes.  There is no locking on NFS files even on Solaris.  Read on.

In the higher-performance multi-access formats such as mbx and mix, locking is required:

        NFS == no locking == can't be used with mbx or mix

On Solaris and other SVR4 systems, flock() does not exist as a system call, which forces the use of fcntl() locking. [Beware: the flock() routine on Solaris is a jacket into fcntl() lock; it is not true flock().] fcntl() locking, in turn, forces the software to go to considerable (and highly inefficient) work to get around fcntl() locking's close semantics.
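
The close semantics in question can be seen with a small experiment. The sketch below is an illustration, not imapd code; the function name is made up, and it assumes a system with fork() where Python's fcntl.flock maps to a true flock() and fcntl.lockf maps to fcntl() locking (as on Linux and BSD with a local filesystem). It takes a lock, opens and closes a second descriptor on the same file in the same process, and then asks a child process whether the lock is still held.

```python
import fcntl
import os
import tempfile


def lock_survives_unrelated_close(lock_fn):
    """Return True if a lock taken with lock_fn is still held after the
    same process opens and closes another descriptor on the same file."""
    fd, path = tempfile.mkstemp()
    try:
        lock_fn(fd, fcntl.LOCK_EX)            # take an exclusive lock on fd
        os.close(os.open(path, os.O_RDONLY))  # unrelated open + close
        pid = os.fork()
        if pid == 0:
            # Child: probe through a fresh descriptor, never blocking.
            status = 2
            try:
                probe = os.open(path, os.O_RDWR)
                try:
                    lock_fn(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    status = 1                # got it: parent's lock is gone
                except OSError:
                    status = 0                # refused: parent still holds it
            finally:
                os._exit(status)
        _, wait_status = os.waitpid(pid, 0)
        code = os.WEXITSTATUS(wait_status)
        if code not in (0, 1):
            raise RuntimeError("probe process failed")
        return code == 0
    finally:
        os.close(fd)
        os.unlink(path)
```

Under those assumptions, lock_survives_unrelated_close(fcntl.flock) should report that the lock survives, while lock_survives_unrelated_close(fcntl.lockf) should report that it was silently dropped by the unrelated close(). Software that uses fcntl() locking therefore has to track every descriptor it has open on a locked file.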

Furthermore, on Solaris and other SVR4 systems, fcntl() invokes the statd/lockd daemons to do locking over NFS. imapd checks for NFS, and DOES NOT lock if the file is NFS. The reason this check is there is that if it were absent, imapd's relatively modest lock/unlock operations on traditional UNIX format mailboxes would overload the statd/lockd mechanism and cause it to collapse, resulting in a cluster-wide deadlock.
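
As a rough picture of what "check for NFS and skip the lock" means, here is a sketch that looks a path up in a Linux-style mount table (the format of /proc/mounts). This is NOT the test imapd uses; the function names are hypothetical, and the approach is only one of several heuristics. It ignores symbolic links and escaped characters in mount points, which is one reason any test of this kind is fragile.

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing
    path, given mount-table text in /proc/mounts format, or None."""
    path = os.path.abspath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same point shadow an earlier one.
        if (path + "/").startswith(prefix) and len(mount_point) >= best_len:
            best_len, best_type = len(mount_point), fs_type
    return best_type


def is_nfs(path, mounts_text):
    """True if the path appears to live on an NFS mount (nfs, nfs4, ...)."""
    return (fs_type_for(path, mounts_text) or "").startswith("nfs")
```

A caller would read the mount table once, for example with open("/proc/mounts").read() on Linux, and skip fcntl() locking whenever is_nfs() returns True for the mailbox path.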

Even if the other NFS issues could be resolved for mbx and mix (and I doubt very much that they ever will be), the locking done by the mbx and mix formats is much more extensive than that in traditional UNIX format. Thus the cluster-wide deadlock is much more likely to happen.

When a cluster-wide deadlock happens, you typically end up having to erase the statd/lockd database on each machine in the cluster, and reboot all the machines in the cluster.

I must emphasize that the cluster-wide deadlock is a bug in statd/lockd, and not in imapd. Often, the imapd processes that caused the overload no longer exist. statd/lockd is a gaping denial-of-service security hole in SVR4.

Recently, Solaris and other SVR4 systems broke the test for NFS that imapd had used for many years. This is the second time that has happened, and I found out when a site had an imapd-caused cluster-wide deadlock. We can assume that this will be an ongoing problem with Solaris and other SVR4 systems.

My recommendation is that, along with a deployment of some non-NFS solution, you also deploy a Linux or BSD machine as your IMAP server in order to get a system that implements flock(). Use of a system that implements flock() will get MUCH better price/performance/reliability from your IMAP server:

 . it will not have to do the unreliable NFS test to avoid statd/lockd
 . it will not have to do the inefficient workaround of fcntl() close
    semantics.

[Once again, beware!! The flock() routine on Solaris is NOT an implementation of flock(). It is just a jacket into fcntl() that has flock() calling conventions. As such, it has fcntl() semantics and not flock() semantics.]

Even if your user community favors Solaris as a shell system, they aren't going to care what kernel a dedicated server runs.


That would also free a Solaris machine for use as another shell system.

-- Mark --

http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.
_______________________________________________
Imap-uw mailing list
[email protected]
https://mailman1.u.washington.edu/mailman/listinfo/imap-uw
