Re: queue processing problem
David Gartner <[EMAIL PROTECTED]> wrote:
>
> 4 machines (3 nodes, running qmail, mounting /home from NFS server. 1 NFS
> server--running IDE RAID 5) All four machines: have 64M of ram and a 633
> Mhz proc, have qmail installed, accept smtp and pop3 connections and all
> have the same /home (again, mounted over NFS). Vpopmail is installed on the
> NFS server, in the home directory (local mail is put in
> /home/vpopmail/domains/whatever.com/). /var on each machine is separate, so
> they each have a separate queue. No special concurrency settings. tcpserver
> is accepting 150 connections on pop/smtp at a time. Load balancers in the
> front of these four machines send traffic to the least congested (the nfs
> server gets less traffic than the other three).
>
> Now, My question is do you think this can support a small ISP (10,000)
> efficiently or should we go with special settings and/or think about
> faster/better hardware? Do you think this leaves room for expansion?

If I were setting it up, I'd probably make the NFS server a separate box
(not accepting any SMTP or POP3 connections), probably running on SCSI RAID
instead of IDE. The faster the SCSI setup, the better, of course.
Additional memory in the NFS server would also be a benefit, and at a cost
of USD$50 for 256MB of ECC PC133 SDRAM, it's hard to make a business case
for _not_ purchasing one or two extra sticks.

My only other concern is that if one of your SMTP/POP toasters dies, you
lose the contents of the queue on that machine, since each of them is
running a single IDE disk for the queue. If this concerns you, perhaps
upgrade each of those machines to IDE RAID.

Can three toasters and one NFS server handle 10,000 users? Probably, but it
depends a lot on what those users are doing. If they're mailing 20MB
attachments to the net at large on a regular basis (or even worse, to each
other), and they're each connected 24/7 and POP-checking their mail every
minute, your systems might fall over rather quickly. If they're mostly
dialup users connected an hour or two a day, sending a few 5k messages
each, and only POP-ing their mail every 15 minutes, maybe your current
setup is already overkill.

You said you were worried -- I wouldn't be. Is the current setup working
for you? Are the toasters frequently hitting their concurrency limits? Do
you have the headroom to raise those limits? Is the NFS server coping with
the current load?

Remember, with a modular architecture like you're using, you can always add
additional toasters in the future, feeding off the same central NFS server.
If you grow to the point that you can't handle it with a single PC-based
NFS server, a NetApp or similar might be within your reach at that point.

Charles
-- 
---
Charles Cazabon <[EMAIL PROTECTED]>
GPL'ed software available at: http://www.qcc.sk.ca/~charlesc/software/
Any opinions expressed are just that -- my opinions.
---
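
To put rough numbers on the "it depends on what those users are doing" point, here is a minimal Python sketch comparing the two usage patterns described above. The 10,000 users and the polling intervals come from the thread; the function name, the 1.5 hours online per day (a midpoint of "an hour or two"), and the assumption that users are spread evenly across the day are illustrative only. Real traffic has peak hours, so treat the light figure as an average, not a worst case.

```python
def pop_logins_per_second(users, hours_online_per_day, poll_interval_minutes):
    """Average POP3 logins per second across the whole user base.

    Assumes users are online uniformly through the day and poll at a
    fixed interval while online (both are simplifications).
    """
    fraction_online = hours_online_per_day / 24.0
    users_online = users * fraction_online
    return users_online / (poll_interval_minutes * 60.0)


USERS = 10_000  # figure from the thread

# Heavy case from the thread: connected 24/7, polling every minute.
heavy = pop_logins_per_second(USERS, 24, 1)

# Light case: dialup users online about 1.5 hours a day (assumed),
# polling every 15 minutes.
light = pop_logins_per_second(USERS, 1.5, 15)

print(f"heavy: {heavy:.1f} POP logins/s")
print(f"light: {light:.2f} POP logins/s")
print(f"ratio: {heavy / light:.0f}x")
```

Under these assumptions the two patterns differ by more than two orders of magnitude, which is why the same hardware can be either overkill or badly undersized.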
Re: queue processing problem
David Gartner writes:
> 4 machines (3 nodes, running qmail, mounting /home from NFS server. 1 NFS
> server--running IDE RAID 5)

Switch to SCSI drives and you can do it with one machine.

-- 
-russ nelson <[EMAIL PROTECTED]>  http://russnelson.com
Crynwr sells support for free software  | PGPok |
521 Pleasant Valley Rd. | +1 315 268 1925 voice | #exclude
Potsdam, NY 13676-3213  | +1 315 268 9201 FAX   |
Re: queue processing problem
Charles (and everyone else),

I saw this thread and it made me kinda worried about a cluster we're fixing
to send out. Here's a brief description of how it works:

4 machines (3 nodes, running qmail, mounting /home from NFS server. 1 NFS
server--running IDE RAID 5) All four machines: have 64M of ram and a 633
Mhz proc, have qmail installed, accept smtp and pop3 connections and all
have the same /home (again, mounted over NFS). Vpopmail is installed on the
NFS server, in the home directory (local mail is put in
/home/vpopmail/domains/whatever.com/). /var on each machine is separate, so
they each have a separate queue. No special concurrency settings. tcpserver
is accepting 150 connections on pop/smtp at a time. Load balancers in the
front of these four machines send traffic to the least congested (the nfs
server gets less traffic than the other three).

Now, My question is do you think this can support a small ISP (10,000)
efficiently or should we go with special settings and/or think about
faster/better hardware? Do you think this leaves room for expansion?

Kinda distressed,
David
Re: queue processing problem
Shawn Estes <[EMAIL PROTECTED]> wrote:

Dave Sill had some good debugging/pinpointing advice for you in a separate
message. I'll add a few things here.

> First off, I'm using concurrency patch and big-todo patch (from qmail.org)
> with qmail-1.03. I've configured the conf-spawn to 400. We are an ISP so we
> are not doing any kind of mailing lists, all messages coming through our
> system are separate messages sent by different customers. We process about
> 15,000 different messages an hour. We have a server running FreeBSD 4.3,
> with 256MB RAM, 9GB Seagate Barracuda 7200 (this is the disk holding the
> queue), Quantum Fireball is holding the homedirs of the users.

You're running a significant load -- disk I/O bandwidth and latency are
probably at least part of the problem you're experiencing. Switching to a
15kRPM disk on a U160 controller would almost certainly help -- it will at
least double available queue disk I/O bandwidth, while halving rotational
latency. qmail does fsyncs at critical times to ensure reliability, and
those each involve a disk seek; halving the rotational latency will reduce
the access time significantly.

> 1) qmail-qstat is showing that the "not yet preprocessed" messages are
> growing, and very seldom is that number decreasing.

qmail-send is having a hard time keeping up with the rate at which you are
injecting messages.

> 3) Ran the qmail-send run file by itself and the messages in the queue went
> through very quickly. (5000 messages in about 15 minutes or so) A lot better
> than they are with everything running.

So when no messages are being injected, your system can deliver at
reasonable speed, but as soon as you turn on qmail-smtpd, it can no longer
keep up.

> I appreciate any help that anyone can give me. I'm hoping that this is an
> easy problem that I am just overlooking. If any more information is needed,
> please let me know.

At a few hundred dollars for a 9GB 15kRPM disk, I'd say it's certainly a
simple way to improve your system performance.

> subdirectory split: 23.

This is something else you might want to change. With 8000 messages in the
queue and a subdir split of 23, you're averaging around 350 files per
directory -- I've not had good luck with FFS-based systems when my
directories have more than about 200 files each. Perhaps try something
higher (remember, it should be prime).

If you can't just vaporize the current contents of the queue or take the
system down for a few hours, the way to switch over will be to temporarily
run two instances of qmail in parallel.

Charles
-- 
---
Charles Cazabon <[EMAIL PROTECTED]>
GPL'ed software available at: http://www.qcc.sk.ca/~charlesc/software/
Any opinions expressed are just that -- my opinions.
---
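
The two figures in this message, rotational latency and files per queue directory, can be checked with quick arithmetic. The Python sketch below is illustrative only: the function names are made up, average rotational latency is taken as the time for half a revolution, and files per directory is approximated as queued messages divided by the split. The 200-file ceiling is the rule of thumb quoted above, not a qmail limit.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: time for half a revolution, in ms."""
    return 0.5 * 60_000.0 / rpm


def is_prime(n):
    """Trial-division primality test; fine for small split values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def smallest_prime_split(queued_messages, max_files_per_dir):
    """Smallest prime split that keeps the average number of files per
    directory at or below max_files_per_dir."""
    split = 2
    while not is_prime(split) or queued_messages / split > max_files_per_dir:
        split += 1
    return split


# Disk figures from the thread: 7200 RPM today, 15k RPM proposed.
print(f"7200 RPM:  {avg_rotational_latency_ms(7200):.2f} ms")
print(f"15000 RPM: {avg_rotational_latency_ms(15000):.2f} ms")

# Queue figures from the thread: 8000 messages, split of 23.
print(f"split 23: {8000 / 23:.0f} files per directory")
suggested = smallest_prime_split(8000, 200)
print(f"split {suggested}: {8000 / suggested:.0f} files per directory")
```

With these inputs the smallest prime that satisfies the 200-file rule of thumb is 41. A larger prime leaves more room for the queue to grow.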
Re: queue processing problem
Shawn Estes <[EMAIL PROTECTED]> wrote:

>First off, I'm using concurrency patch and big-todo patch (from
>qmail.org) with qmail-1.03. I've configured the conf-spawn to 400. We
>are an ISP so we are not doing any kind of mailing lists, all
>messages coming through our system are separate messages sent by
>different customers. We process about 15,000 different messages an
>hour. We have a server running FreeBSD 4.3, with 256MB RAM, 9GB
>Seagate Barracuda 7200 (this is the disk holding the queue), Quantum
>Fireball is holding the homedirs of the users.
>
>This is kind of broken up into a few different problems.
>
>1) qmail-qstat is showing that the "not yet preprocessed" messages
>   are growing, and very seldom is that number decreasing.
>
>
>2) qmail-remote is being spawned way under the current remote
>   concurrency limit (175) I have very seldom seen this number reach
>   above 30.

Both suggest that qmail-send is having trouble keeping up. qmail-send is
responsible for processing messages placed in the queue and for scheduling
remote deliveries through qmail-rspawn.

The question to answer is why qmail-send isn't keeping up. Perhaps disk I/O
is the bottleneck. Or maybe the CPU is maxed out--though that's unlikely.
What else is the system doing? Is there any idle CPU?

Another possibility is that it's just too busy. You could split the load
somewhat by installing another instance of qmail, e.g. in /var/qmail2, and
let one instance handle locally injected messages while the other handles
SMTP injected messages. Since qmail-send is single-threaded, it might not
be able to keep qmail-rspawn busy if it keeps seeing new messages that need
processing. Splitting the load like this would mean fewer interruptions for
the qmail-send handling locally injected messages.

>su-2.05# ps -ax | grep qmail-remote | wc -l
>      30
>su-2.05# ps -ax | grep qmail-smtpd | wc -l
>     111

That's a fairly high number of incoming SMTP connections.

>Excerpt from /var/log/qmail/current:

Too small to be useful, and lacking timestamps.

>3) Messages are staying in the queue and are not being delivered the
>   way they should be. Note: Messages are going out, just very
>   slowly. The logs are showing deliveries local and remote. There
>   are no error messages in the log. (A test message sent to a local
>   user takes approximately 30-45 minutes, roughly the same amount of
>   time for a remote user)

Same problem as 1 and 2.

>Here's what I've done so far:
>
>1) Checked the Trigger file to make sure it has the correct permissions:

Good.

>2) Checked ulimit and kern max files.

OK.

>3) Ran the qmail-send run file by itself and the messages in the
>   queue went through very quickly. (5000 messages in about 15
>   minutes or so) A lot better than they are with everything
>   running.

Confirms my "qmail-send is being interrupted" hypothesis, I think.

>4) Verified my run scripts with LWQ. The run scripts have softlimits
>   that are increased from LWQ, could this be my problem?

No, but I wonder why you want such high limits. They're for your own
protection.

-Dave
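
As a rough check on the "can't keep up" diagnosis, the numbers quoted in this thread can be turned into rates. This Python sketch uses only the two figures given above (about 15,000 messages an hour arriving, and 5000 messages drained in about 15 minutes with only qmail-send running). The helper name is made up, and both inputs are approximate, so the result is an order-of-magnitude guide rather than a measurement.

```python
def per_hour(messages, minutes):
    """Convert 'messages in so many minutes' to messages per hour."""
    return messages * 60.0 / minutes


arrival_per_hour = 15_000.0           # quoted injection rate
drain_per_hour = per_hour(5_000, 15)  # quoted rate with injection stopped

# How much faster the unloaded system delivers than mail arrives.
headroom = drain_per_hour / arrival_per_hour

print(f"arrival: {arrival_per_hour / 3600:.2f} msgs/s")
print(f"drain:   {drain_per_hour / 3600:.2f} msgs/s (no injection running)")
print(f"headroom: {headroom:.2f}x")
```

Even with nothing being injected, the drain rate is only about a third above the arrival rate. That leaves little margin once qmail-send also has to preprocess incoming messages, which fits the growing "not yet preprocessed" count.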