On Thu, 2 Nov 2006, Andrew Laurence wrote:
Has anyone evaluated the new "fixed" locking in NFSv4 for use with imapd? I attended a talk by Sun's Spencer Shepler at USENIX '05 in which he spent a good deal of time on how NFSv4 (supposedly) fixes the locking shortcomings and can be used as a native file system.
http://blogs.sun.com/shepler/date/20050407

I doubt it, very much.

SUN has made claims like this repeatedly for over 10 years. Every time, these claims have been proven wrong.

SUN has zero credibility in my book.

I just (October 25, 2006) fixed the test for NFS on SVR4 systems in UW imapd yet again. The current release version of Solaris broke it. I learned that it was broken when a site got the familiar cluster-wide NFS hangs that are caused by SUN's foolish attempt to lock over NFS -- a problem that I thought was resolved years ago.

I have little doubt that in another few years I will have to deal with this issue yet again.

To be blunt, the best thing that you can do with a SUN machine is to erase Solaris and install Linux or BSD on it -- preferably *NOT* Linux from SUN since it may have SUN "improvements". The same goes for other proprietary UNIX platforms (HP, IBM, etc.).

I guess that it's time again to explain "why NFS is a bad idea for IMAP" and "why SVR4 (Solaris, HP-UX, AIX, etc.) is a bad idea for UW imapd".

                WHY NFS IS BAD FOR IMAP

The issue is not just locking. There are other, equally important, issues that render NFS unsuitable:
 . several filesystem operations that are atomic on a real filesystem are
   non-atomic under NFS.
 . NFS does not synchronize updates across a cluster; that is, if you
   write() to a file, the buffer cache on other clients may still contain
   the previous data.
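
To illustrate the first point: exclusive creation with open(O_CREAT|O_EXCL) is one operation that has historically not been atomic over NFS. The following is a minimal sketch of the commonly documented link()-based workaround for lock files, not the code that UW imapd actually uses. The function names acquire_link_lock() and release_link_lock() are hypothetical, and the caller is assumed to supply a temporary name that is unique across the cluster (for example, built from hostname and PID) on the same filesystem as the lock file.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: try to take the lock file `lockpath` by linking a
 * unique temporary file `tmppath` to it.
 * Returns 0 if the lock was acquired, -1 otherwise. */
int acquire_link_lock(const char *tmppath, const char *lockpath)
{
    struct stat sb;
    int fd;
    int got;

    assert(tmppath != NULL && lockpath != NULL);

    /* tmppath is assumed unique, so it does not matter here whether
     * O_EXCL is atomic on this filesystem. */
    fd = open(tmppath, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    /* link() fails if lockpath already exists. Over NFS the reply can be
     * lost and a retransmitted request may report failure even though the
     * link was made, so a link count of 2 is also accepted as success. */
    got = (link(tmppath, lockpath) == 0);
    if (!got && stat(tmppath, &sb) == 0 && sb.st_nlink == 2)
        got = 1;

    unlink(tmppath);            /* the temporary name is no longer needed */
    return got ? 0 : -1;
}

/* Hypothetical helper: release a lock taken by acquire_link_lock(). */
void release_link_lock(const char *lockpath)
{
    unlink(lockpath);
}
```

Note that this only addresses lock-file creation. It does nothing about the second point, stale data in the buffer caches of other clients.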

The underlying cause is that NFS is stateless. This is fundamental to NFS' design. This has long been known to be a mistake, but Xerox's Woodstock File System was all the rage in the early 1980s and NFS inherited that.

NFS is fundamentally unsuitable for UW imapd (and, the last that I heard, for Cyrus). UW imapd supports traditional UNIX format under NFS, but with limitations (no protection against multiple access) and with very limited scalability. The more advanced formats, notably mbx and mix, are UNSAFE under NFS.

NFS is not the way to scale large IMAP servers. IMAP servers are generally I/O bound, not CPU bound; and thus it makes no sense to have a dozen IMAP servers getting data from a monolithic NFS server.

The other argument advanced in favor of NFS is reliability; if one IMAP server goes down users can access one of the others. The flaw in that argument is that you lose everything when the NFS monolith goes down.

UW itself has "been there, done that, and got the T-shirt" with IMAP over NFS. I understand that it may be difficult to accept my advice, because the admins at UW didn't accept my advice either. After some years of trying to make IMAP over NFS work, they agreed that it was a mistake. For the past decade+ we've done IMAP without NFS. Now, we are finally eliminating the last remnants of SVR4.

                WHY SVR4 IS BAD FOR UW IMAPD

Proprietary SVR4 UNIX systems are also not a good solution for UW imapd.

Even if you do not use NFS (and thus avoid the NFS locking, non-atomicity, and synchronization problems), you still have the problem that the only locking on proprietary SVR4 is fcntl() locking.

The problem with fcntl() locking is that it has the semantics of IEEE Std 1003.1-1988 (``POSIX.1'') that require that all locks associated with a file for a given process are removed when any file descriptor for that file is closed by that process.

This "feature" is devastating for the locking done by UW imapd (and many other applications), and UW imapd must undergo an extremely costly workaround to prevent files from being corrupted because of it. Even though I have thoroughly debugged the workaround, I suspect that there are still a few paths by which it fails.
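
The close() behavior described above can be observed with a short test. This is a sketch for illustration only, assuming a local filesystem and a system with POSIX fcntl() locking; it is not the workaround used in UW imapd. The function names lock_lost_on_close() and other_process_can_lock() are hypothetical. A child process is used as the observer because a process cannot conflict with its own fcntl() locks.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that tries to write-lock the whole file.
 * Returns 1 if the child got the lock, 0 if it was refused, -1 on error. */
static int other_process_can_lock(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* l_start = 0 and l_len = 0 mean "the whole file" */
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        _exit(fcntl(fd, F_SETLK, &fl) == 0 ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}

/* Returns 1 if closing a second descriptor dropped the lock held through
 * the first one, 0 if the lock survived, -1 on error. */
int lock_lost_on_close(const char *path)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    int fd1, fd2, lost;

    assert(path != NULL);

    fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    if (fcntl(fd1, F_SETLK, &fl) < 0) {
        close(fd1);
        return -1;
    }

    /* Sanity check: while the lock is held, another process is refused. */
    if (other_process_can_lock(path) != 0) {
        close(fd1);
        return -1;
    }

    /* Open and close the same file through an unrelated descriptor, as a
     * library routine might do without the caller knowing. */
    fd2 = open(path, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    close(fd2);

    /* fd1 is still open, yet the lock taken through it is now gone. */
    lost = other_process_can_lock(path);
    close(fd1);
    return lost;
}
```

On a system with the POSIX semantics, lock_lost_on_close() on a scratch file returns 1: the lock disappears even though the descriptor it was taken on was never closed.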

Beware!

Some SVR4 systems have a flock() routine in the C library, but that routine calls fcntl() locking internally. Solaris is one such system. Solaris may have a flock() routine, but it DOES NOT have flock() locking!
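
One way to tell the two apart at run time is sketched below. It relies on the fact that true flock() locks belong to the open file, so two independent open() calls in the same process conflict with each other, whereas fcntl() locks belong to the process, so the second request simply succeeds. This is an illustrative probe under those assumptions, not the test that UW imapd performs; the function name has_real_flock() is hypothetical, and the header that declares flock() may differ on some systems.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns 1 if flock() shows true flock() semantics on `path`,
 * 0 if it behaves like fcntl() locking underneath, -1 on error. */
int has_real_flock(const char *path)
{
    int fd1, fd2, result;

    assert(path != NULL);

    fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    fd2 = open(path, O_RDWR);       /* independent open of the same file */
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    if (flock(fd1, LOCK_EX) < 0) {
        result = -1;
    } else if (flock(fd2, LOCK_EX | LOCK_NB) == 0) {
        /* The same process was granted a second exclusive lock, so the
         * lock is owned per process: fcntl() semantics. */
        result = 0;
    } else if (errno == EWOULDBLOCK) {
        /* The second open conflicts with the first: real flock(). */
        result = 1;
    } else {
        result = -1;
    }

    close(fd2);
    close(fd1);                     /* closing also releases the locks */
    return result;
}
```

The result applies only to the filesystem that holds `path`. A system with real flock() on local disks may still fall back to fcntl()-style locking on a network filesystem.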

Oh yeah, almost forgot:

To this list you can add OSF/1, the system also known as DIGITAL UNIX and Tru64 UNIX. OSF/1 once had flock(), but in some version unknown to me the flock() system call was removed and replaced by a C library call that invokes fcntl() locking.

Most people have gotten rid of OSF/1, and HP seems to have cancelled the product. We got rid of it back when DIGITAL still existed, after a repeated series of filesystem corruptions caused by AdvFS.

But just in case you have an old Alpha machine lying around and you wonder if you can squeeze some more life out of it by running an IMAP server on it, don't! Although UW imapd now knows that OSF/1 is like SVR4, it's just not worth the hassle. Buy a cheap Linux box instead.

                CONCLUSION

Linux and BSD are suitable operating systems for UW imapd. BSD has a slight technical edge, but Linux has better market support.

Small, inexpensive IMAP servers, each serving a portion of the IMAP user space *without* NFS, are the way to go. There are multiple technical solutions for directing users to the appropriate IMAP server; that's a topic for another essay.

-- Mark --

http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.
_______________________________________________
Imap-uw mailing list
[email protected]
https://mailman1.u.washington.edu/mailman/listinfo/imap-uw
