I'm in the final stages of testing a Courier-IMAP installation, and
am troubleshooting a problem that I ran into while doing some load
testing.

  Courier-IMAP is running on FreeBSD 4.11, built from ports, with
Postfix as the MTA and MySQL as the user database, using
courier-authdaemond + MySQL to authenticate.  I'm stress-testing POP3
connections with Rabid from the Postal test suite, running on a
different FreeBSD machine.  Rabid has worked fine as a way of
stress-testing for me in the past, against qpopper.  (However I
recognize that might simply mean that Rabid is behaving in some
non-compliant way that qpopper tolerates better.)

  I initially increased settings to MAXDAEMONS=100, MAXPERIP=50, so in
principle I should not have hit either limit; I have tried setting
Rabid to anywhere from 20 to 40 threads, always fewer than the 50
connections per IP that Courier should in theory permit.

  If I use Rabid from a different server to start opening up a bunch of
POP3 connections and reading them, and I just let it run, then after "a
while" (see below) all fresh connections to the Courier POP server
start failing.

  When I was using 20 threads, 20 POP transactions/minute and
MAXPERIP=50 it would go for 2-3 minutes, then the Rabid log would start
spraying "Server error:" and "Can't establish connection" messages.
With 30 threads, 30 POP transactions/minute, it gets through one minute
and then starts erroring out.  With 20 threads, and 100 POP
transactions/minute, it doesn't make it to one minute.

  I am pretty sure now that the problem is in the interaction between
Rabid and Courier POP - Rabid seems to think it is closing connections
with "quit\r\n", disconnecting, and opening a fresh connection, but
Courier thinks a large fraction of the earlier "closed" connections are
still connected.
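
  For comparison, a well-behaved client session would look roughly like
the sketch below, using Python's standard poplib.  The function name and
its arguments are placeholders of mine, and I am not claiming this is
what Rabid does internally; the point is only that the client waits for
the server's reply to QUIT before it closes the socket.

```python
import poplib


def pop3_session(host, port, user, password, timeout=30):
    """Log in, ask for mailbox status, and QUIT cleanly.

    Returns (message_count, mailbox_bytes, quit_response).
    """
    # The constructor connects and reads the server greeting.
    conn = poplib.POP3(host, port, timeout=timeout)
    try:
        conn.user(user)
        conn.pass_(password)
        count, size = conn.stat()
    except Exception:
        conn.close()  # don't leak the socket if login or STAT fails
        raise
    # quit() sends QUIT, reads the server's reply, then closes the socket.
    quit_response = conn.quit()
    return count, size, quit_response
```

  Running something like this in a loop from the test machine would be
one way to see whether Courier still logs TIMEOUT disconnects when the
client is known to finish the QUIT exchange.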

  I got to this conclusion by kicking up MAXPERIP to 200, to rule out
the connections per IP as a factor.  When I did that, I saw that Rabid
would see ongoing success at the 20 POP transactions/minute rate; but
at the 5 minute mark, Courier started logging a number of
disconnections with TIMEOUT, when Rabid apparently believed all those
connections were long since closed.
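
  As a rough sanity check on that theory, here is the arithmetic,
sketched in Python.  It assumes (my assumption, not something I have
measured) that a fixed fraction of sessions stays counted against
MAXPERIP until the 5 minute timeout; the function name and the
stuck_fraction parameter are purely illustrative.

```python
def minutes_until_limit(max_per_ip, sessions_per_minute, stuck_fraction=1.0):
    """Minutes until lingering sessions alone reach max_per_ip.

    Assumes each stuck session stays counted until the server timeout,
    and that the timeout is longer than the answer, so none expire first.
    """
    stuck_per_minute = sessions_per_minute * stuck_fraction
    return max_per_ip / stuck_per_minute


# MAXPERIP=50 at the three transaction rates tried above,
# worst case where every session lingers.
for rate in (20, 30, 100):
    print(rate, round(minutes_until_limit(50, rate), 2))
# prints:
# 20 2.5
# 30 1.67
# 100 0.5
```

  Those numbers are at least in the same range as the failure times I
saw with MAXPERIP=50.  With MAXPERIP=200 at 20 transactions/minute, the
5 minute timeout would start reclaiming sessions at roughly 100
lingering, well short of the limit, which would fit Rabid continuing to
succeed in that run.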

  I haven't got as far as tcpdumping the connections yet; I thought I'd
first check if this is known, and if it's a Rabid bug, then what others
have done in terms of performance and stress-testing?  I would also be
happy to take suggestions on a better way to stress-test the POP and
IMAP servers.  Rabid is a pretty clumsy tool but it's what I'm aware
of.
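
  Short of a full tcpdump, one cheap check would be to count TCP states
on the POP3 port on the server while the test runs.  The sketch below
parses text in the layout I would expect from "netstat -an -p tcp" on
FreeBSD (address and port joined with a dot); the function name, sample
lines and addresses are made up for illustration.

```python
from collections import Counter


def count_states(netstat_text, local_port=110):
    """Count TCP connection states for sockets whose local port matches."""
    states = Counter()
    suffix = "." + str(local_port)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expect: proto recv-q send-q local-addr foreign-addr state
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        if fields[3].endswith(suffix):
            states[fields[5]] += 1
    return states


# Made-up sample in the assumed netstat layout.
sample = """\
tcp4       0      0  192.0.2.10.110         192.0.2.20.51001       ESTABLISHED
tcp4       0      0  192.0.2.10.110         192.0.2.20.51002       CLOSE_WAIT
tcp4       0      0  192.0.2.10.110         192.0.2.20.51003       ESTABLISHED
tcp4       0      0  192.0.2.10.25          192.0.2.20.51004       ESTABLISHED
tcp4       0      0  *.110                  *.*                    LISTEN
"""
print(count_states(sample))
```

  A growing pile of ESTABLISHED sockets would suggest the server never
saw the QUIT or the close; CLOSE_WAIT would suggest it saw the client's
FIN but has not closed its own side yet.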

  -- Clifton

-- 
          Clifton Royston  --  [EMAIL PROTECTED] 
         Tiki Technologies Lead Programmer/Software Architect
"I'm gonna tell my son to grow up pretty as the grass is green
And whip-smart as the English Channel's wide..."
                                            -- 'Whip-Smart', Liz Phair


_______________________________________________
courier-users mailing list
[email protected]
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-users
