On Thu, 2 Nov 2006, Andrew Laurence wrote:
> Has anyone evaluated the new "fixed" locking in NFSv4 for use with
> imapd? I attended a talk by Sun's Spencer Shepler at USENIX '05 in
> which he spent a good deal of time on how NFSv4 (supposedly) fixes the
> locking shortcomings and can be used as a native file system.
> http://blogs.sun.com/shepler/date/20050407
I doubt it, very much.
SUN has made claims like this repeatedly for over 10 years. Every time,
these claims have been proven wrong.
SUN has zero credibility in my book.
I just (October 25, 2006) fixed the test for NFS on SVR4 systems in UW
imapd yet again. The current release version of Solaris broke it. I
learned that it was broken when a site got the familiar cluster-wide NFS
hangs that are caused by SUN's foolish attempt to lock over NFS -- a
problem that I thought was resolved years ago.
I have little doubt that in another few years I will have to deal with
this issue yet again.
To be blunt, the best thing that you can do with a SUN machine is to erase
Solaris and install Linux or BSD on it -- preferably *NOT* Linux from SUN
since it may have SUN "improvements". The same goes for other proprietary
UNIX platforms (HP, IBM, etc.).
I guess that it's time again to explain "why NFS is a bad idea for IMAP"
and "why SVR4 (Solaris, HP-UX, AIX, etc.) is a bad idea for UW imapd".
WHY NFS IS BAD FOR IMAP
The issue is not just locking. There are other, equally
important, issues that render NFS unsuitable:
. several filesystem operations that are atomic on a real filesystem
are non-atomic under NFS.
. NFS does not synchronize updates across a cluster; that is, if you
write() to a file, the buffer cache on other clients may still contain
the previous data.
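To make the first point concrete, here is a minimal C sketch of one
operation that applications commonly rely on being atomic: exclusive
creation of a lock file with O_CREAT|O_EXCL. This example is my own
illustration, not something taken from UW imapd, and the function name
try_exclusive_create() is hypothetical. On a local filesystem, exactly
one caller can win the create. The Linux open(2) manual notes that
O_EXCL is only supported over NFS with NFSv3 or later, and that
programs relying on it for locking elsewhere contain a race.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to create `path` exclusively.
 * Returns 0 if this call created the file, 1 if the file already
 * existed, and -1 on any other error.
 *
 * On a local filesystem the existence check and the create happen as
 * one atomic step.  That guarantee is what may be missing over NFS.
 */
int try_exclusive_create(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
        close(fd);
        return 0;               /* we won the race */
    }
    return errno == EEXIST ? 1 : -1;
}
```

Calling it twice on the same path on a local disk should succeed the
first time and report "already exists" the second time. Code that
depends on that behavior is the kind of code that is at risk when the
file lives on NFS.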
The underlying cause is that NFS is stateless. This is fundamental to
NFS' design. This has long been known to be a mistake, but Xerox's
Woodstock File System was all the rage in the early 1980s and NFS
inherited that.
NFS is fundamentally unsuitable for UW imapd (and, the last that I heard,
for Cyrus). UW imapd supports traditional UNIX format under NFS, but with
limitations (no protection against multiple access) and with very limited
scalability. The more advanced formats, notably mbx and mix, are UNSAFE
under NFS.
NFS is not the way to scale large IMAP servers. IMAP servers are
generally I/O bound, not CPU bound; and thus it makes no sense to have a
dozen IMAP servers getting data from a monolithic NFS server.
The other argument advanced in favor of NFS is reliability; if one IMAP
server goes down users can access one of the others. The flaw in that
argument is that you lose everything when the NFS monolith goes down.
UW itself has "been there, done that, and got the T-shirt" with IMAP over
NFS. I understand that it may be difficult to accept my advice, because
the admins at UW didn't accept my advice either. After some years of
trying to make IMAP over NFS work, they agreed that it was a mistake.
For the past decade+ we've done IMAP without NFS. Now, we are finally
eliminating the last remnants of SVR4.
WHY SVR4 IS BAD FOR UW IMAPD
Proprietary SVR4 UNIX systems are also not a good solution for UW imapd.
Even if you do not use NFS (and thus avoid the NFS locking, non-atomic,
and synchronization problems), you still have the problem that the only
locking on proprietary SVR4 is fcntl() locking.
The problem with fcntl() locking is that it has the semantics of IEEE Std
1003.1-1988 (``POSIX.1'') that require that all locks associated with a
file for a given process are removed when any file descriptor for that
file is closed by that process.
This "feature" is devastating for the locking done by UW imapd (and many
other applications), and UW imapd must resort to an extremely costly
workaround to prevent files from being corrupted because of it. Even
though I have thoroughly debugged the workaround, I suspect that there are
still a few paths by which it fails.
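The fcntl() close semantics described above can be observed with a
short C sketch. This is an illustration only, not the code or the
workaround used in UW imapd; the function names are hypothetical. It
takes an exclusive fcntl() lock through one descriptor, opens and
closes a second, unrelated descriptor on the same file, and then asks
a child process whether it can take the lock.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking exclusive fcntl() lock over the whole file. */
static int lock_whole_file(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Fork a child that opens `path` itself and tries to lock it.
 * Returns 1 if the child got the lock, 0 if it was refused,
 * -1 on error.
 */
static int other_process_can_lock(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        _exit(lock_whole_file(fd) == 0 ? 1 : 0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 2 ? -1 : code;
}

/*
 * Hypothetical demo.  Returns 1 if our lock was lost after closing
 * an unrelated descriptor, 0 if it was still held, -1 on error.
 */
int fcntl_lock_lost_on_close(const char *path)
{
    int fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    if (lock_whole_file(fd1) != 0) {
        close(fd1);
        return -1;
    }

    /* While we hold the lock, nobody else should get it. */
    int before = other_process_can_lock(path);

    /* Open and close a second descriptor on the same file. */
    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    close(fd2);     /* POSIX: this drops our lock taken via fd1 */

    int after = other_process_can_lock(path);

    close(fd1);
    if (before != 0 || after < 0)
        return -1;
    return after;
}
```

On a POSIX-conforming system I would expect this to return 1: the
lock taken through fd1 is gone even though fd1 was never closed. Any
library routine that happens to open and close the same file inside
your process has the same effect.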
Beware!
Some SVR4 systems have a flock() routine in the C library, but that
routine calls fcntl() locking internally. Solaris is one such system.
Solaris may have a flock() routine, but it DOES NOT have flock() locking!
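One way to see the difference at run time is a small probe along the
following lines. This is a sketch under my own assumptions, not the
test that UW imapd uses, and flock_survives_close() is a hypothetical
name. It takes a flock() lock, opens and closes a second descriptor on
the same file, and then checks from a child process whether the lock
is still in force.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical probe.  Returns 1 if a flock() lock survives closing
 * another descriptor on the same file, 0 if the lock was lost
 * (which is what fcntl() semantics would produce), -1 on error.
 */
int flock_survives_close(const char *path)
{
    int fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    if (flock(fd1, LOCK_EX | LOCK_NB) != 0) {
        close(fd1);
        return -1;
    }

    /* Open and close an unrelated descriptor on the same file. */
    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    close(fd2);

    pid_t pid = fork();
    if (pid < 0) {
        close(fd1);
        return -1;
    }
    if (pid == 0) {
        /*
         * The child opens its own descriptor; the inherited fd1
         * would share the parent's lock and prove nothing.
         * Exit 1 = still locked, 0 = we could lock, 2 = error.
         */
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        _exit(flock(fd, LOCK_EX | LOCK_NB) == 0 ? 0 : 1);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) >= 0 && WIFEXITED(status)) {
        int code = WEXITSTATUS(status);
        if (code == 0 || code == 1)
            result = code;
    }
    close(fd1);
    return result;
}
```

On a system with genuine flock() locking, such as Linux or BSD, I
would expect this to return 1. Where flock() is only a C library
wrapper around fcntl(), the close of the second descriptor should
release the lock and the probe should return 0.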
Oh yeah, almost forgot:
To this list you can add OSF/1; the system also known as DIGITAL UNIX and
TRU64 UNIX. OSF/1 once had flock(), but in some version unknown to me
the flock() system call was removed and replaced by a C library call that
invoked fcntl() locking.
Most people have gotten rid of OSF/1, and HP seems to have cancelled the
product. We got rid of it back when DIGITAL still existed, after a
repeated series of filesystem corruptions caused by AdvFS.
But just in case you have an old Alpha machine lying around and you wonder
if you can squeeze some more life out of it by running an IMAP server on
it, don't! Although UW imapd now knows that OSF/1 is like SVR4, it's just
not worth the hassle. Buy a cheap Linux box instead.
CONCLUSION
Linux and BSD are suitable operating systems for UW imapd. BSD has a
slight technical edge, but Linux has better market support.
Small, inexpensive IMAP servers, each serving a portion of the IMAP user
space *without* NFS, are the way to go. There are multiple technical
solutions for directing users to the appropriate IMAP server; that's a
topic for another essay.
-- Mark --
http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.
_______________________________________________
Imap-uw mailing list
https://mailman1.u.washington.edu/mailman/listinfo/imap-uw