I have been doing some basic measurements of IMAP NFS usage with courier-imap, and I thought I'd post the results in case they're of interest.
The question I'm trying to answer is how courier-imap will scale to large numbers of IMAP clients, in particular webmail IMAP frontend users, when the storage is on NFS backends such as Netapps.

The testing was done with an old version of courier-imap (3.0.5) running under FreeBSD 7.2, with backend on a Linux NFS server (Ubuntu Karmic). The NFS mount has the noatime option but is otherwise default (v3, UDP). I am logging in to a test account with 103 seen messages in its INBOX.

Packets captured on the FreeBSD box using:

    tcpdump -w filename.pcap -i em0 -n -s0 udp port 2049 or tcp port 143

Test 1: login, select INBOX, logout
-----------------------------------

This shows the NFS packets captured by tcpdump (both in and out: 'ok' are the response packets)

-- login
   1 09:06:34 access
   1 09:06:34 lookup
   2 09:06:34 ok
-- select inbox
  11 09:06:45 access
   1 09:06:45 commit
   2 09:06:45 create
   3 09:06:45 fsstat
   1 09:06:45 link
  13 09:06:45 lookup
  37 09:06:45 ok
   1 09:06:45 readdir
   3 09:06:45 remove
   1 09:06:45 setattr
   1 09:06:45 write

courier-imap is actually very efficient when opening a mailbox. It simply does a readdir() to find the names of all the files in the maildir, opens its uid cache file and reads out the names of the files it knows about, and then updates the cache (adding any new files it hasn't seen before, and removing any files which no longer exist in the maildir).

Test 2: real IMAP client
------------------------

The next test was using mutt as an IMAP client. This is representative of "dumb" IMAP clients which don't perform any local storage of the mailbox contents; webmail IMAP frontends are likely to behave in the same way.
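Before looking at the mutt traffic, the mailbox-open step from Test 1 can be pictured roughly as follows. This is only an illustrative sketch of the behaviour described there, not courier-imap's actual code: the function name reconcile, the dict-based cache and the next_uid counter are all made up for the example, and filenames are treated as opaque strings. The point is that nothing in it needs to open a message file.

```python
def reconcile(listing, cache, next_uid):
    """Sync a filename -> UID cache with one directory listing.

    listing  -- filenames from a single readdir() pass over the maildir
    cache    -- dict mapping filename -> UID, as read from the cache file
    next_uid -- next unassigned UID (hypothetical bookkeeping for this sketch)
    Returns (new_cache, next_uid).
    """
    present = set(listing)
    # Drop cache entries whose files no longer exist in the maildir.
    new_cache = {name: uid for name, uid in cache.items() if name in present}
    # Give a fresh UID to each file not seen before, in a stable order.
    for name in sorted(present - set(cache)):
        new_cache[name] = next_uid
        next_uid += 1
    return new_cache, next_uid


if __name__ == "__main__":
    old = {"msg-a": 1, "msg-b": 2}
    print(reconcile(["msg-b", "msg-c"], old, 3))
```

With a cache like this, the cost of opening the mailbox depends on the directory listing and one cache file, not on the number of messages read.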
Going into mutt I changed folder to imap://username:passw...@localhost/INBOX

tcpdump showed that mutt was sending the following commands:

a0000 LOGIN "username" "password"
a0001 CAPABILITY
a0002 LIST "" ""
a0003 MYRIGHTS "INBOX"
a0004 SELECT "INBOX"
a0005 FETCH 1:103 (UID FLAGS INTERNALDATE RFC822.SIZE BODY.PEEK[HEADER.FIELDS (DATE FROM SUBJECT TO CC MESSAGE-ID REFERENCES CONTENT-TYPE CONTENT-DESCRIPTION IN-REPLY-TO REPLY-TO LINES LIST-POST X-LABEL)])

and this generated the following NFS ops:

 228 09:22:17 access
   1 09:22:17 commit
   2 09:22:17 create
   5 09:22:17 fsstat
   1 09:22:17 link
  23 09:22:17 lookup
 269 09:22:17 ok
   1 09:22:17 read
   2 09:22:17 readdir
   4 09:22:17 remove
   1 09:22:17 setattr
   1 09:22:17 write

On quitting mutt, it sent these two additional commands:

a0006 CLOSE
a0007 LOGOUT

which generated:

   2 09:22:28 lookup
   2 09:22:28 ok

Opening and scanning a mailbox in this way generates a lot more NFS traffic than simply opening it. As far as I can tell, the cache file caches only the uids, not any of the message headers or bodies. So the FETCH command which mutt sends is forcing imapd to open and read each of the message files in the maildir.

I believe I'm seeing 'access' rather than 'read' requests because the NFS client is caching the file contents, and is just checking its cache is up-to-date. (See point A8 in the NFS FAQ at http://nfs.sourceforge.net/ ). To confirm this, if I unmount and remount as nfsv2 then I see a load of read operations too:

   2 09:40:16 create
   5 09:40:16 fsstat
 116 09:40:16 getattr
   1 09:40:16 link
 136 09:40:16 lookup
 380 09:40:16 ok
 106 09:40:16 read
   9 09:40:16 readdir
   3 09:40:16 remove
   1 09:40:16 setattr
   1 09:40:16 write

   2 09:40:21 lookup
   2 09:40:21 ok

That's 103 reads for the maildir files, plus 3 extra (uidcache etc).
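As a rough cross-check of the figures in this post, the counts can be added up in a few lines of Python. The numbers are copied from the captures; the dict and function names are just for this sketch. One assumption to flag: the Test 2 capture falls in a single one-second bucket, so I am treating it as including the login and select traffic, and using Test 1's login plus select as the baseline.

```python
# Request counts copied from the captures in this post ('ok' replies kept apart).
TEST1_LOGIN = {"access": 1, "lookup": 1}
TEST1_SELECT = {"access": 11, "commit": 1, "create": 2, "fsstat": 3, "link": 1,
                "lookup": 13, "readdir": 1, "remove": 3, "setattr": 1, "write": 1}
TEST2_MUTT = {"access": 228, "commit": 1, "create": 2, "fsstat": 5, "link": 1,
              "lookup": 23, "read": 1, "readdir": 2, "remove": 4, "setattr": 1,
              "write": 1}
NFSV2_MUTT = {"create": 2, "fsstat": 5, "getattr": 116, "link": 1, "lookup": 136,
              "read": 106, "readdir": 9, "remove": 3, "setattr": 1, "write": 1}
MESSAGES = 103  # seen messages in the test INBOX


def total(ops):
    """Total number of NFS requests in one capture."""
    return sum(ops.values())


def extra_per_message(scan_ops, baseline, messages=MESSAGES):
    """Requests beyond the baseline, averaged over the messages scanned."""
    return (total(scan_ops) - baseline) / messages


# Assumption: a plain login + select (Test 1) is the baseline for the mutt scan.
baseline = total(TEST1_LOGIN) + total(TEST1_SELECT)

# Each request should have a matching 'ok' reply in the capture.
assert total(TEST1_SELECT) == 37
assert total(TEST2_MUTT) == 269
assert total(NFSV2_MUTT) == 380

print("baseline requests (login + select):", baseline)
print("NFSv3 extra requests per message: %.2f" % extra_per_message(TEST2_MUTT, baseline))
print("NFSv2 reads beyond one per message:", NFSV2_MUTT["read"] - MESSAGES)
```

On these figures the header scan costs a little over two extra NFS requests per message under NFSv3, nearly all of them 'access' checks.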

Conclusions
-----------

* this version of courier-imap could generate significant NFS load when used by a large base of dumb IMAP clients (as opposed to PC-based clients which cache the messages locally anyway)

* NFSv3 caching appears to be effective at avoiding sending the file contents repeatedly, so load-balancers should be configured in 'sticky' mode so that the same client will be preferentially referred to the same IMAP frontend box. The NFS server and the IMAP frontends should have as much cache RAM as possible. All those 'access' packets could still generate substantial CPU load at the NFS server though.

* A webmail client should be chosen which holds open the IMAP connection for each user, and/or caches metadata locally, if possible.

Question: do newer versions of courier-imap perform any additional caching of message headers or properties? A quick scan of the 4.7.0 code suggests not, but I could have overlooked something.

Regards,

Brian.

_______________________________________________
Courier-imap mailing list
Courier-imap@lists.sourceforge.net
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-imap