I have been doing some basic measurements of IMAP NFS usage with
courier-imap, and I thought I'd post the results in case they're of
interest.

The question I'm trying to answer is how courier-imap will scale to large
numbers of IMAP clients, in particular webmail IMAP frontend users, when the
storage is on NFS backends such as Netapps.

The testing was done with an old version of courier-imap (3.0.5) running
under FreeBSD 7.2, with backend on a Linux NFS server (Ubuntu Karmic).  The
NFS mount has the noatime option but is otherwise default (v3, UDP).

I am logging in to a test account with 103 seen messages in its INBOX.

Packets were captured on the FreeBSD box using:
   tcpdump -w filename.pcap -i em0 -n -s0 udp port 2049 or tcp port 143
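
The tallies in the tests below are counts of each NFS operation name grouped by timestamp. A minimal sketch of that counting step, assuming the capture has already been reduced to lines of the form "HH:MM:SS opname" (that intermediate format and the names tally_ops and format_tally are assumptions for illustration, not tcpdump output or options):

```python
from collections import Counter

def tally_ops(lines):
    """Count (time, op) pairs from lines like '09:06:45 access'.

    The 'HH:MM:SS op' input format is an assumption for illustration.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        counts[(parts[0], parts[1])] += 1
    return counts

def format_tally(counts):
    """Render counts as 'count time op' rows, sorted by time then op."""
    return ["%4d %s %s" % (n, t, op) for (t, op), n in sorted(counts.items())]
```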

Test 1: login, select INBOX, logout
-----------------------------------

The tallies below count the NFS packets captured by tcpdump, grouped by
second and by operation, in both directions: the 'ok' rows are the reply
packets.

-- login
   1 09:06:34 access
   1 09:06:34 lookup
   2 09:06:34 ok

-- select inbox
  11 09:06:45 access
   1 09:06:45 commit
   2 09:06:45 create
   3 09:06:45 fsstat
   1 09:06:45 link
  13 09:06:45 lookup
  37 09:06:45 ok
   1 09:06:45 readdir
   3 09:06:45 remove
   1 09:06:45 setattr
   1 09:06:45 write

courier-imap is actually very efficient when opening a mailbox.  It simply
does a readdir() to find the names of all the files in the maildir, opens
its uid cache file and reads out the names of the files it knows about, and
then updates the cache (adding any new files it hasn't seen before, and
removing any files which no longer exist in the maildir).
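
The reconciliation step described above amounts to a set comparison between the directory listing and the cached names. A minimal sketch, assuming the cache is a simple mapping from filename to UID (the function name, the dict layout, and the next_uid counter are hypothetical and do not reflect courier-imap's actual cache file format):

```python
def reconcile_uid_cache(dir_names, cache, next_uid):
    """Return (new_cache, next_uid) after syncing cache with dir_names.

    dir_names: filenames found by listing the maildir
    cache:     dict of filename -> uid from the previous session
    next_uid:  first unused uid
    """
    present = set(dir_names)
    # Keep entries whose files still exist; vanished files are dropped.
    new_cache = {name: uid for name, uid in cache.items() if name in present}
    # Assign fresh uids to files not seen before, in a stable order.
    for name in sorted(present - set(cache)):
        new_cache[name] = next_uid
        next_uid += 1
    return new_cache, next_uid
```

No message file is opened in this step; only the directory listing and the cache are touched, which fits the small number of NFS operations seen in Test 1.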

Test 2: real IMAP client
------------------------

The next test was using mutt as an IMAP client. This is representative of
"dumb" IMAP clients which don't perform any local storage of the mailbox
contents; webmail IMAP frontends are likely to behave in the same way.

Going into mutt I changed folder to
  imap://username:passw...@localhost/INBOX

tcpdump showed that mutt was sending the following commands:

a0000 LOGIN "username" "password"
a0001 CAPABILITY
a0002 LIST "" ""
a0003 MYRIGHTS "INBOX"
a0004 SELECT "INBOX"
a0005 FETCH 1:103 (UID FLAGS INTERNALDATE RFC822.SIZE
BODY.PEEK[HEADER.FIELDS (DATE FROM SUBJECT TO CC MESSAGE-ID REFERENCES
CONTENT-TYPE CONTENT-DESCRIPTION IN-REPLY-TO REPLY-TO LINES LIST-POST
X-LABEL)])

and this generated the following NFS ops:

 228 09:22:17 access
   1 09:22:17 commit
   2 09:22:17 create
   5 09:22:17 fsstat
   1 09:22:17 link
  23 09:22:17 lookup
 269 09:22:17 ok
   1 09:22:17 read
   2 09:22:17 readdir
   4 09:22:17 remove
   1 09:22:17 setattr
   1 09:22:17 write

On quitting mutt, it sent these two additional commands:

a0006 CLOSE
a0007 LOGOUT

   2 09:22:28 lookup
   2 09:22:28 ok

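Opening and scanning a mailbox in this way generates far more NFS traffic
than simply opening it: 269 requests here versus 37 in Test 1, with almost
all of the increase in 'access' calls (228 versus 11).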

As far as I can tell, the cache file caches only the uids, not any of the
message headers or bodies.  So the FETCH command which mutt sends is forcing
imapd to open and read each of the message files in the maildir.
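
To illustrate why this costs at least one file open per message: without a header cache, answering a FETCH of header fields means parsing the headers out of every message file. A rough Python sketch of that access pattern (this is not courier-imap's code; the function name and the choice of fields are assumptions for illustration):

```python
import os
from email.parser import BytesHeaderParser

def fetch_header_fields(maildir, fields=("Date", "From", "Subject")):
    """Yield (filename, {field: value}) for each message in cur/ and new/.

    Each message costs one open() plus a read of the file, so the number
    of file operations grows linearly with mailbox size.
    """
    parser = BytesHeaderParser()
    for sub in ("cur", "new"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            with open(os.path.join(folder, name), "rb") as fh:
                msg = parser.parse(fh)  # parses headers; body is left unparsed
            yield name, {f: msg.get(f) for f in fields}
```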

I believe I'm seeing 'access' rather than 'read' requests because the NFS
client on the IMAP box is caching the file contents, and is just checking
that its cache is up-to-date.  (See point A8 in the NFS FAQ at
http://nfs.sourceforge.net/ ).

To confirm this, if I unmount and remount as NFSv2 then I see a large
number of 'read' operations too:

   2 09:40:16 create
   5 09:40:16 fsstat
 116 09:40:16 getattr
   1 09:40:16 link
 136 09:40:16 lookup
 380 09:40:16 ok
 106 09:40:16 read
   9 09:40:16 readdir
   3 09:40:16 remove
   1 09:40:16 setattr
   1 09:40:16 write
   2 09:40:21 lookup
   2 09:40:21 ok

That's 103 reads for the maildir files, plus 3 extra (uidcache etc).
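
As a sanity check on tallies like these, the number of 'ok' replies should equal the number of requests. A small sketch that parses rows in the "count time op" layout used in this post and checks that (the helper names are made up for illustration):

```python
def parse_tally(text):
    """Parse rows like ' 106 09:40:16 read' into {op: total count}."""
    totals = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # ignore anything that is not a tally row
        count, _time, op = parts
        totals[op] = totals.get(op, 0) + int(count)
    return totals

def replies_match_requests(totals):
    """True when the 'ok' reply count equals the sum of all request counts."""
    requests = sum(n for op, n in totals.items() if op != "ok")
    return totals.get("ok", 0) == requests

nfsv2_open = """
   2 09:40:16 create
   5 09:40:16 fsstat
 116 09:40:16 getattr
   1 09:40:16 link
 136 09:40:16 lookup
 380 09:40:16 ok
 106 09:40:16 read
   9 09:40:16 readdir
   3 09:40:16 remove
   1 09:40:16 setattr
   1 09:40:16 write
"""
totals = parse_tally(nfsv2_open)
print(replies_match_requests(totals))   # True
print(totals["read"] - 103)             # 3
```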

Conclusions
-----------

* This version of courier-imap could generate significant NFS load when used
by a large base of dumb IMAP clients (as opposed to PC-based clients which
cache the messages locally anyway)

* NFSv3 caching appears to be effective at avoiding sending the file
contents repeatedly, so load-balancers should be configured in 'sticky' mode
so that the same client will be preferentially referred to the same IMAP
frontend box.  The NFS server and the IMAP frontends should have as much
cache RAM as possible.  All those 'access' packets could still generate
substantial CPU load at the NFS server though.

* A webmail client should be chosen which holds open the IMAP connection for
each user, and/or caches metadata locally, if possible.
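
The "caches metadata locally" option in the last point can be sketched as a frontend-side cache keyed by message UID, so that only UIDs not yet seen trigger a header FETCH. This is an illustration only: the class name, the fetch_headers callable, and the use of UIDVALIDITY as the invalidation key are assumptions here, not a description of any particular webmail product.

```python
class HeaderCache:
    """Per-mailbox cache of message headers, keyed by IMAP UID."""

    def __init__(self, fetch_headers):
        # fetch_headers(uids) -> {uid: headers}; hypothetical callable
        # that would issue the actual IMAP UID FETCH.
        self._fetch = fetch_headers
        self._uidvalidity = None
        self._entries = {}

    def get(self, uidvalidity, uids):
        """Return {uid: headers} for uids, fetching only cache misses.

        uids is expected to be the full current UID list of the mailbox.
        """
        if uidvalidity != self._uidvalidity:
            # UIDs were renumbered on the server; cached entries are stale.
            self._entries = {}
            self._uidvalidity = uidvalidity
        wanted = set(uids)
        # Forget messages that are no longer in the mailbox.
        self._entries = {u: m for u, m in self._entries.items() if u in wanted}
        missing = sorted(wanted - set(self._entries))
        if missing:
            self._entries.update(self._fetch(missing))
        return {u: self._entries[u] for u in uids if u in self._entries}
```

With a cache along these lines, reopening an unchanged mailbox would in principle need no header FETCH at all, and so none of the per-message file accesses seen in Test 2.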

Question: do newer versions of courier-imap perform any additional caching
of message headers or properties? A quick scan of the 4.7.0 code suggests
not, but I could have overlooked something.

Regards,

Brian.
