RE: questions about performance and setup
I did some benchmarking using a standard 7200 RPM disk and a 128MB ramdisk. The machine was not using any swap, so there was no chance of the ramdisk accidentally making it to disk. In short, performance on it sucked. The throughput was about 10% less than IDE, but seeks/sec were 5-10 times more. However, the CPU was maxed at 100% during tests to the ramdisk.

Jay

-----Original Message-----
From: Oliver White [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, July 18, 2000 12:27 AM
To: [EMAIL PROTECTED]
Subject: Re: questions about performance and setup

Steve Wolfe wrote:

With all of the emails I received, I get the impression that I'm going to be I/O bound instead of processor or memory bound. How much disk will be sufficient for the queue? 1GB? More?

It's not so much a matter of disk size (I don't think you'll have a 1 gig queue!),

You could quite easily get a 1 gig queue, even if you don't run into the obvious problem of temporary loss of network connectivity. Say you've got 200,000 subscribers and you generate your messages twice as fast as qmail can send them, then when you've finished generating the messages you've still got 100,000 in the queue. If the messages are 10KB each, that's 1GB.

(I can put 2GB of ram in the box)? Linux has support for making a disk in memory, putting a filesystem on it and mounting it. Wouldn't this take care of I/O problems?

That's about as good of I/O as you can get, I would imagine. ; ) As another author stated, the largest gain would be in writes, but that's where the largest expenditure is anyway. Just make dang, dang sure that your machine is NOT going to have any hiccups or lose power while the queue is full, or you'll instantly lose it all.

What if you put the 2GB of RAM in the box, but let Linux use it as a disk cache? I'm not sure how the disk caching under Linux works, but if you create a file and then delete it before it actually gets written to disk, is there any disk activity required? Sure, the disks will be thrashing away, trying to keep up, but would the I/O actually block if there was still room in the disk cache?

- Oliver.
Re: questions about performance and setup
What if you put the 2GB of RAM in the box, but let Linux use it as a disk cache? I'm not sure how the disk caching under Linux works, but if you create a file and then delete it before it actually gets written to disk, is there any disk activity required? Sure, the disks will be thrashing away, trying to keep up, but would the I/O actually block if there was still room in the disk cache?

Yes, it will block. That's the whole point of the fsync() calls embedded within qmail. The code wants to be sure that data is on disk before proceeding. The only caveat is that some file systems may *lie* about the results of their fsync() and tell the process that the data has been placed on disk when it still sits in memory. In that sort of scenario you may well gain, especially if the I/O queue is subsequently sorted by cylinder prior to sending to the disk.

As others have said, it's the cost of seeking that matters - the amount of data is often trivial. Thus the concept of zeroseek, which is pretty similar to what a journalling file system is trying to do on a more general level.

Regards.
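To make the fsync() point concrete, here is a minimal Python sketch of the write-then-sync pattern described above. It is not qmail's code (qmail is written in C); the function name `durable_write` and the file name are made up for illustration. The only claim is the general one from the message: the call does not return until the OS reports the data as written, however much free cache there is.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync() has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        # Blocks until the kernel reports the data as flushed to the device.
        # Spare page cache does not let this call return early.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "message")  # hypothetical queue file name
        durable_write(target, b"queued mail body\n")
        with open(target, "rb") as f:
            print(f.read())
```

As the message notes, whether the data is truly on the platter after fsync() returns still depends on the file system and the drive being honest about it.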
Re: questions about performance and setup
"Hubbard, David" wrote:

know what you're getting into on the Dell boxes if you choose to run linux. I've got a Dell PE2400 dual that runs linux and you're going to be at the mercy of Dell and Adaptec on when you upgrade your kernel because they have some sorry proprietary drivers for their RAID controllers that are tailored to a specific kernel version and redhat sub-revision. If you can put up with that, then Redhat Linux/Qmail on a Dell runs very fast, I'm happy with mine. But at the same

Just for the record, it depends on the RAID controller that you purchase from Dell. The PERC 2/DC and 2/SC (Dual Channel and Single Channel) are just AMI MegaRAID controllers with open source drivers included in the standard kernels. The PERC 3/Si (and maybe the PERC 2/Si?) are the Adaptec RAIDPort controllers with closed-source drivers. You have to wait for Adaptec/Dell to release new precompiled modules that can only be used with specific kernels that Redhat releases.

But, the PERC 3/Si is much cheaper than the 2/DC if you are on a budget, need RAID, and don't care about having the latest kernel. You can probably get a better DPT card for around the same price, though. (Note: Adaptec now owns DPT)

I have a Dell 2450 with PERC 2/DC controller and 18GB mirrored disks running linux with Qmail. Compiled latest standard kernel with no problem, and the machine runs like a champ.

Later ;)

--
S. Clint Bullock
Network Administrator
University of Georgia
Office of the Vice President for Research
626 Boyd GSRC
Athens, GA 30602-7411
(706) 542-5936
(706) 542-5638 FAX
Re: questions about performance and setup
Nothing wrong with 100% CPU usage. It just means that the kernel was able to soak the CPU with work ... which is good. If performance on a RAM disk topped out at, say, 75% CPU usage, that would mean your system has a problem somewhere.

As for performance though, I'd be interested in seeing the actual numbers from the ramdisk test to check against my 10k RPM disk stats.

"Austad, Jay" wrote:

I did some benchmarking using a standard 7200 RPM disk and a 128MB ramdisk. The machine was not using any swap, so there was no chance of the ramdisk accidentally making it to disk. In short, performance on it sucked. The throughput was about 10% less than IDE, but seeks/sec were 5-10 times more. However, the CPU was maxed at 100% during tests to the ramdisk.
RE: questions about performance and setup
As for performance though, I'd be interested in seeing the actual numbers from the ramdisk test to check against my 10k RPM disk stats.

I used bonnie++ to test it. I'll post the results sometime today, when I get some time.

Jay

-----Original Message-----
From: Michael T. Babcock [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, July 18, 2000 10:41 AM
To: Austad, Jay; Qmail Mailing List
Subject: Re: questions about performance and setup

Nothing wrong with 100% CPU usage. It just means that the kernel was able to soak the CPU with work ... which is good. Maxing out your performance on a RAM disk at 75% CPU usage means your system has a problem somewhere.

As for performance though, I'd be interested in seeing the actual numbers from the ramdisk test to check against my 10k RPM disk stats.

"Austad, Jay" wrote:

I did some benchmarking using a standard 7200 RPM disk and a 128MB ramdisk. The machine was not using any swap, so there was no chance of the ramdisk accidentally making it to disk. In short, performance on it sucked. The throughput was about 10% less than IDE, but seeks/sec were 5-10 times more. However, the CPU was maxed at 100% during tests to the ramdisk.
Re: questions about performance and setup
Is UTIME necessary in a mail queue? If a logging filesystem were mounted on a separate disk (or network array, etc.) specifically for the mail queue, shouldn't it be mounted without UTIME?

Bruce Guenter wrote:

The only way to get truly zero seek performance is to use a log-structured file system on a clean disk. Otherwise, you will seek occasionally to write out some dirty metadata. Even if you pre-allocate your log file on a regular filesystem, you will seek occasionally (once a second, AFAICT) to update the utime in the inode.
Re: questions about performance and setup
On Tue, Jul 18, 2000 at 01:25:36PM -0400, Michael T. Babcock wrote:

Is UTIME necessary in a mail queue? If a logging filesystem were mounted on a separate disk (or network array, etc.) specifically for the mail queue, shouldn't it be mounted without UTIME?

You cannot mount without mtime (I misspelt it -- utime is the syscall) AFAIK. You can mount without atime (access time).

mtime is changed every time the file is modified.
ctime is changed every time the inode is modified (file size change, permissions, etc.)
atime is changed every time the file is accessed.

--
Bruce Guenter [EMAIL PROTECTED] http://em.ca/~bruceg/
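For readers who want to watch the three timestamps move, here is a small Python sketch, assuming a POSIX system (on Windows `st_ctime` means something else). The function name `demo_timestamps` and the file name are invented for the example. It backdates atime and mtime with `os.utime`, then appends to the file and compares the `os.stat` results.

```python
import os
import tempfile


def demo_timestamps():
    """Return (before, after) os.stat results around an append to a backdated file."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "queue-file")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"first\n")
        # Backdate atime and mtime to a fixed value. Changing them modifies
        # the inode, so ctime is set to the current time by this call.
        os.utime(path, (1000, 1000))
        before = os.stat(path)
        with open(path, "ab") as f:
            f.write(b"second\n")  # modifying file data updates mtime
        after = os.stat(path)
    return before, after


if __name__ == "__main__":
    before, after = demo_timestamps()
    for label, st in (("before append", before), ("after append", after)):
        print(label, "atime:", st.st_atime, "mtime:", st.st_mtime, "ctime:", st.st_ctime)
```

Running it should show mtime jumping from the backdated value to the current time after the append, which is the inode update the thread is worried about.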
Re: questions about performance and setup
On Tue, Jul 18, 2000 at 01:25:36PM -0400, Michael T. Babcock wrote:

Is UTIME necessary in a mail queue? If a logging filesystem were mounted on a separate disk (or network array, etc.) specifically for the mail queue, shouldn't it be mounted without UTIME?

Do you mean atime or mtime? In either case, not all Unixen allow such mount options. Specifically, Solaris only has noatime.

I'd be surprised though if the OS wants to update the directory once a second to get an atime/mtime on disk for an opened file. Maybe once a minute, which is not an unreasonable cost for zeroseek.

This is probably something that's more appropriately discussed on the zeroseek list. The bottom line though is that when qmail-queue exits(0), the email must be physically on disk, which means there must be at least one fsync() - no choice whatsoever. The zeroseek question is all about how you minimize the number of fsyncs and how you structure the queue so that the fsync() incurs a minimal seek on disk. Oh, and combine that with appropriate security access to that queue structure and you're done!

Regards.

Bruce Guenter wrote:

The only way to get truly zero seek performance is to use a log-structured file system on a clean disk. Otherwise, you will seek occasionally to write out some dirty metadata. Even if you pre-allocate your log file on a regular filesystem, you will seek occasionally (once a second, AFAICT) to update the utime in the inode.
Re: questions about performance and setup
To be honest, I'm not aware of being able to disable UTIME either, although NOATIME is an option on Linux as well. I asked because it occurred to me that this metadata is not terribly useful to mail servers (as the times necessary are stored in the data files themselves). Being able to shut these off may or may not reduce the performance penalties of fsync()'s. Might be an issue for the ReiserFS or EXT3 people to think about.

[EMAIL PROTECTED] wrote:

On Tue, Jul 18, 2000 at 01:25:36PM -0400, Michael T. Babcock wrote:

Is UTIME necessary in a mail queue? If a logging filesystem were mounted on a separate disk (or network array, etc.) specifically for the mail queue, shouldn't it be mounted without UTIME?

Do you mean atime or mtime? In either case, not all Unixen allow such mount options. Specifically, Solaris only has noatime.

I'd be surprised though if the OS wants to update the directory once a second to get an atime/mtime on disk for an opened file. Maybe once a minute, which is not an unreasonable cost for zeroseek.
Re: questions about performance and setup
Yes, sorry ... utime. But as I said in the other message ... it would be nice.

Bruce Guenter wrote:

You cannot mount without mtime (I misspelt it -- utime is the syscall) AFAIK. You can mount without atime (access time).

mtime is changed every time the file is modified.
ctime is changed every time the inode is modified (file size change, permissions, etc.)
atime is changed every time the file is accessed.
Re: questions about performance and setup
On Mon, Jul 17, 2000 at 03:33:54PM +1000, Oliver White wrote:

We're in a similar situation at the moment. However, we want to send out 100,000 UNIQUE emails per day, expanding to 500,000 or more in the near future. Also, our send window is only actually a couple of hours.

Is that for your TV stuff? 500K queue insertions and delivery (reliably) within 2-3 hours is a lot. I would not rely on one server, nor one point of internet connection.

One thing that you may want to think about is the amount of bandwidth you will need. Let's see now, assuming a 10Kbyte message size (which is pretty close to the current average, especially if it's HTML)...

500,000 x 10,000 x 8 = 40000000000 bits.
In two hours, that makes 20000000000 bits per hour,
that makes 333333333 bits per minute,
that makes 5555555 bits per second.

Let's put some commas in to make this obvious: 5,555,555

In other words you'll need to pump out 5+ megabits per second, which means a connection of around double that, say 10 Mbits per second. Is that what you have available?

Looking at the disk I/O, 500K queue insertions and deletions implies 1 million fsynced I/Os (one for insertion, one for delivery) in 2 hours, which means:

500000 fsynced I/Os per hour,
that makes 8333 fsynced I/Os per minute,
that makes 138 fsynced I/Os per second,
that means that a 7ms access disk will be flat out.

Again doubling it to make a safety margin means that you're looking at a disk subsystem that will give you an fsynced I/O time of about 3ms.

Regards.

I'm trying to work out the best settings for the concurrencyremote and conf-split parameters. Our system is a HP Netserver 2000r PIII-667 RAID5 running Linux. Are there any problems in setting conf-split to a very large value? Is it necessary on a Linux system, assuming a queue size of, say 100,000? Any information appreciated.

- Oliver.

"Austad, Jay" wrote:

Non-unique emails will most likely be generated by other machines and sent to the box running mini-qmail via smtp.
Non-unique emails will be a small percentage of what gets sent out, for now. Jay -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Saturday, July 15, 2000 12:10 AM To: '[EMAIL PROTECTED]' Subject: Re: questions about performance and setup On Fri, Jul 14, 2000 at 07:01:46PM -0500, Austad, Jay wrote: Then have the script that does the mailing call randomly on of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching. Won't this way be a performance hit though? I admit, it is an easy solution No. My experience is that the cost of running a script to inject the mail in a way similar to that mentioned above, is pretty small compared to the queue injection cost and the delivery cost. sh or perl will be fine. and would work excellent, but I have to think about efficiency also. C code is much faster than shell or perl, and I'd like to set it up once and not have to ever worry about again, or at least for a long, long time. As I said, we're doing 50 million emails a month right now, but this is increasing substantially each month, and as we rollout new subscription services, we'll have even more load. Sending 10 times this amount by the same time next year is a good possibility, possibly sooner as we seem to underestimate the rate at which we're growing much of the time... You may also need to look at the scalability of the generation of the emails. One system I recently looked at claimed to be able to generate nicely unique emails at a targetted database, but it burned CPU like it was free - just in generating the content. Mark. Jay -Original Message- From: JuanE [mailto:[EMAIL PROTECTED]] Sent: Friday, July 14, 2000 5:55 PM To: '[EMAIL PROTECTED]' Subject: Re: questions about performance and setup Jay, That's the beauty of having multiple instances, not having to patch qmail. All you need to do is install qmail once per machine (ie, /var/qmail1, /var/qmail2,...). 
Then have the script that does the mailing call randomly on of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching. JES Austad, Jay writes: Where would I start in the code to modify the QMQP servers list so that it would load balance between all of the servers in the list instead of just using the first one it can contact? This would be very useful to me. I assume qmail-qmqpc.c is one of them, are there others I would need to play around with? Jay -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Friday, July 14, 2000 3:55 PM To: '[EMAIL PROTECTED]' Subject: Re: questions about performance and setup On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote: I already have Mandrake
Re: questions about performance and setup
We're in a similar situation at the moment. However, we want to send out 100,000 UNIQUE emails per day, expanding to 500,000 or more in the near future. Also, our send window is only actually a couple of hours.

That shouldn't be too hard. With a Pentium 233 (not a P-II, a regular Pentium) attached to a 512k DSL line, using an IDE hard drive, I sent out 1,000 unique emails from a Perl script. The script took about 30 seconds to run, and all (deliverable) remote messages were delivered in about 45 seconds. That was with a concurrencyremote of 60.

So, with equal hardware, 500,000 would take about 6 hours to run. Considering that you have about 10 times the CPU of the machine I used, and a much better disk, if you have a large enough pipe, you can turn the concurrencyremote up to 200 (or even more), and it should work out in a couple of hours.

steve
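The extrapolation above can be rechecked with a few lines of Python. This is only a sketch: the function name `hours_to_send` is made up, and the `speedup` factor assumes delivery rate scales linearly with concurrencyremote, which is an optimistic simplification and not something the thread measured.

```python
def hours_to_send(total_messages: int,
                  sample_messages: int,
                  sample_seconds: float,
                  speedup: float = 1.0) -> float:
    """Estimate hours needed, scaling up from a measured sample run.

    speedup is a hypothetical multiplier (for example, the ratio of a new
    concurrencyremote value to the one used in the sample run).
    """
    messages_per_second = sample_messages / sample_seconds * speedup
    return total_messages / messages_per_second / 3600


if __name__ == "__main__":
    # Sample run from the message: 1,000 messages delivered in about 45 s
    # with concurrencyremote set to 60.
    print(hours_to_send(500_000, 1_000, 45))
    # Assumed linear scaling when raising concurrencyremote from 60 to 200.
    print(hours_to_send(500_000, 1_000, 45, speedup=200 / 60))
```

On the sample figures this gives a little over six hours on the same hardware and a bit under two hours with the assumed speedup, in line with the estimates in the message.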
RE: questions about performance and setup
With all of the emails I received, I get the impression that I'm going to be I/O bound instead of processor or memory bound. How much disk will be sufficient for the queue? 1GB? More?

I'm just grasping here to figure out the best solution, so bear with me... What if I only needed a 1GB queue, and what if that queue was a 1GB ramdisk (I can put 2GB of ram in the box)? Linux has support for making a disk in memory, putting a filesystem on it and mounting it. Wouldn't this take care of I/O problems?

Jay

-----Original Message-----
From: Oliver White [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 17, 2000 12:34 AM
To: [EMAIL PROTECTED]
Subject: Re: questions about performance and setup

We're in a similar situation at the moment. However, we want to send out 100,000 UNIQUE emails per day, expanding to 500,000 or more in the near future. Also, our send window is only actually a couple of hours.

I'm trying to work out the best settings for the concurrencyremote and conf-split parameters. Our system is a HP Netserver 2000r PIII-667 RAID5 running Linux. Are there any problems in setting conf-split to a very large value? Is it necessary on a Linux system, assuming a queue size of, say 100,000? Any information appreciated.

- Oliver.

"Austad, Jay" wrote:

Non-unique emails will most likely be generated by other machines and sent to the box running mini-qmail via smtp. Non-unique emails will be a small percentage of what gets sent out, for now.

Jay

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Saturday, July 15, 2000 12:10 AM
To: '[EMAIL PROTECTED]'
Subject: Re: questions about performance and setup

On Fri, Jul 14, 2000 at 07:01:46PM -0500, Austad, Jay wrote:

Then have the script that does the mailing call randomly one of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching.

Won't this way be a performance hit though? I admit, it is an easy solution

No.
My experience is that the cost of running a script to inject the mail in a way similar to that mentioned above, is pretty small compared to the queue injection cost and the delivery cost. sh or perl will be fine. and would work excellent, but I have to think about efficiency also. C code is much faster than shell or perl, and I'd like to set it up once and not have to ever worry about again, or at least for a long, long time. As I said, we're doing 50 million emails a month right now, but this is increasing substantially each month, and as we rollout new subscription services, we'll have even more load. Sending 10 times this amount by the same time next year is a good possibility, possibly sooner as we seem to underestimate the rate at which we're growing much of the time... You may also need to look at the scalability of the generation of the emails. One system I recently looked at claimed to be able to generate nicely unique emails at a targetted database, but it burned CPU like it was free - just in generating the content. Mark. Jay -Original Message- From: JuanE [mailto:[EMAIL PROTECTED]] Sent: Friday, July 14, 2000 5:55 PM To: '[EMAIL PROTECTED]' Subject: Re: questions about performance and setup Jay, That's the beauty of having multiple instances, not having to patch qmail. All you need to do is install qmail once per machine (ie, /var/qmail1, /var/qmail2,...). Then have the script that does the mailing call randomly on of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching. JES Austad, Jay writes: Where would I start in the code to modify the QMQP servers list so that it would load balance between all of the servers in the list instead of just using the first one it can contact? This would be very useful to me. I assume qmail-qmqpc.c is one of them, are there others I would need to play around with? 
Jay -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Friday, July 14, 2000 3:55 PM To: '[EMAIL PROTECTED]' Subject: Re: questions about performance and setup On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote: I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes with no trouble, some of them took work to get going, but it runs well. I have a few Crystal PC's here also that I may use instead, dual PIII 550's with 512MB ram and 9 or 18GB 1rpm drives. I'll probably use these for testing. I agree with the earlier poster that more spindles for your queue (c/- raid) is a good thing in general. The bulk of the messages will be the same content to many rcpt's. However, once in awhile we'll have 100,000 different messages go out to 100,000 different people. Since the QMQP support under mini-qmail doesn't load balance, can I feed it a hostname with mu
Re: questions about performance and setup
In other words you'll need to pump out 5+ megabits per second, which means a connection of around double that, say 10 Mbits per second. Is that what you have available?

In theory, yes. In practice... remains to be seen. I did some similar calculations and came up with a similar result, which shocked me at first, but it must be possible, because there are companies out there that do it!

Right. One I was helping a little while ago had 20+ systems dedicated to the task and they were co-lo'd with plenty of connectivity. Once you start getting into large scale you need to consider multiple systems to at least ensure that you have some sort of redundancy strategy.

Regards.
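The back-of-the-envelope figures in this exchange are easy to recompute. The Python sketch below uses the inputs given earlier in the thread (500,000 messages of about 10,000 bytes, a two-hour window, and two fsynced I/Os per message); the function names are invented for illustration, and protocol overhead is ignored, which is presumably why the thread doubles the result.

```python
def required_bits_per_second(messages: int,
                             bytes_per_message: int,
                             window_seconds: int) -> float:
    """Raw payload bandwidth needed to push every message out in the window."""
    return messages * bytes_per_message * 8 / window_seconds


def fsyncs_per_second(messages: int,
                      fsyncs_per_message: int,
                      window_seconds: int) -> float:
    """Sustained rate of fsynced I/Os the queue disk has to absorb."""
    return messages * fsyncs_per_message / window_seconds


if __name__ == "__main__":
    window = 2 * 60 * 60  # two hours, in seconds
    bps = required_bits_per_second(500_000, 10_000, window)
    rate = fsyncs_per_second(500_000, 2, window)
    print(f"bandwidth: {bps / 1e6:.2f} Mbit/s")
    print(f"fsynced I/Os per second: {rate:.1f}")
    print(f"time budget per fsynced I/O: {1000 / rate:.1f} ms")
```

Halving the per-I/O time budget for a safety margin gives the roughly 3ms target mentioned in the thread.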
Re: questions about performance and setup
On Mon, Jul 17, 2000 at 10:29:03AM -0500, Austad, Jay wrote:

With all of the emails I received, I get the impression that I'm going to be I/O bound instead of processor or memory bound. How much disk will be sufficient for the queue? 1GB? More?

I'm just grasping here to figure out the best solution, so bear with me... What if I only needed a 1GB queue, and what if that queue was a 1GB ramdisk (I can put 2GB of ram in the box)? Linux has support for making a disk in memory, putting a filesystem on it and mounting it. Wouldn't this take care of I/O problems?

The I/O cost is simply there to protect against machine failure, reboots, power loss, OS bugs, that sort of thing. Your memory file system will be ok if it's battery-backed and running on a system as reliable as a hard disk. Otherwise you increase the chances of losing part of your queue at some point.

Having said that, you may find the trade-off acceptable. That is, putting the queue in a memory file system and accepting a total queue loss every now and again. If, e.g., it's advertising email and occasional losses are tolerable, then this may be a perfectly acceptable cost/reliability trade-off for you.

Regards.
Re: questions about performance and setup
On Mon, Jul 17, 2000 at 10:29:03AM -0500, Austad, Jay wrote:

With all of the emails I received, I get the impression that I'm going to be I/O bound instead of processor or memory bound.

Ahhh... someone who gets it.

How much disk will be sufficient for the queue? 1GB? More?

What if you did your entire queue injection with your network connection down? I'd budget for a significant portion of that plus growth and safety.

I'm just grasping here to figure out the best solution, so bear with me... What if I only needed a 1GB queue, and what if that queue was a 1GB ramdisk (I can put 2GB of ram in the box)? Linux has support for making a disk in memory, putting a filesystem on it and mounting it. Wouldn't this take care of I/O problems?

I'd read up on ramdisks first. They aren't instant i/o. Alternatively, Quantum has a line of solid state disks which might do the trick for you. Pretty pricey, though.

http://www.zdnet.com/etestinglabs/stories/main/0,8829,2352381,00.html

John
RE: questions about performance and setup
I overlooked that when I posted this message; I totally forgot about the write penalty. Sorry about that.

-----Original Message-----
From: John White [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 14, 2000 7:08 PM
To: qmail mailing list
Subject: Re: questions about performance and setup

On Fri, Jul 14, 2000 at 12:21:57PM -0700, Jason Murphy wrote:

The machine I built contains a DPT SmartRAID V SCSI RAID 0/1/5 controller with 5 10,000 RPM 9.1 gig drives. The thing I notice about RAID 5 in the right configuration is that you can throw tons of IO at it and you will see little decrease in performance. Our database server (ya, I know, it's not a MAIL SERVER) gets tons of IO and it's nothing to it; it just eats it up and continues on its way.

A massive mail injection, especially if the content is unique to the user, can overwhelm a disk subsystem. This is recommending the exact -wrong- kind of disk system. RAID 5 has a write penalty, as it has to calculate parity for each write, and write to multiple spindles.

The best type of RAID for small block writes is RAID 10 or RAID 1+0 (not to be confused with RAID 0+1). Even better is to use a disk system with write-back cache. Ideally, you need at least seven spindles. I've seen great things with the Infortrend controller. A great setup would be 1U PCs connected to an external RAID.

John
Re: questions about performance and setup
On Mon, Jul 17, 2000 at 09:51:22AM -0700, John White wrote:

(I can put 2GB of ram in the box)? Linux has support for making a disk in memory, putting a filesystem on it and mounting it. Wouldn't this take care of I/O problems?

I'd read up on ramdisks first. They aren't instant i/o.

Indeed. I haven't played around with ramdisks for a couple of years now, but last time I benchmarked them, they didn't appear to run much faster than a hard disk FOR READS, as buffer caches on hard disks made them act very similarly.

Writes would be a different prospect of course...

--
Cheers

Jason Haar
Unix/Network Specialist, Trimble NZ
Phone: +64 3 9635 377 Fax: +64 3 9635 417
Re: questions about performance and setup
With all of the emails I received, I get the impression that I'm going to be I/O bound instead of processor or memory bound. How much disk will be sufficient for the queue? 1GB? More?

It's not so much a matter of disk size (I don't think you'll have a 1 gig queue!), but of throughput. For example, a single IDE drive will get you a couple of megabytes of throughput per second, at a very high CPU cost. SCSI will yield more, with a lower CPU utilization, and with RAID arrays, you can move up to hundreds of megabytes per second if you want to.

I'm just grasping here to figure out the best solution, so bear with me... What if I only needed a 1GB queue, and what if that queue was a 1GB ramdisk (I can put 2GB of ram in the box)? Linux has support for making a disk in memory, putting a filesystem on it and mounting it. Wouldn't this take care of I/O problems?

That's about as good of I/O as you can get, I would imagine. ; ) As another author stated, the largest gain would be in writes, but that's where the largest expenditure is anyway. Just make dang, dang sure that your machine is NOT going to have any hiccups or lose power while the queue is full, or you'll instantly lose it all.

steve
Re: questions about performance and setup
On Mon, Jul 17, 2000 at 10:24:53PM -0600, Steve Wolfe wrote:

With all of the emails I received, I get the impression that I'm going to be I/O bound instead of processor or memory bound. How much disk will be sufficient for the queue? 1GB? More?

It's not so much a matter of disk size (I don't think you'll have a 1 gig queue!), but of throughput. For example, a single IDE drive will get you a couple of megabytes of throughput per second, at a very high CPU cost. SCSI will yield more, with a lower CPU utilization, and with RAID arrays, you can move up to hundreds of megabytes per second if you want to.

Not entirely true. With UDMA mode, modern IDE drives get high throughput with low CPU utilization. On my Celeron PC, I could get well over 10MB/sec at well under 20% CPU, and it's hardly performance hardware (5400RPM spindle). With a 10K RPM spindle and a faster chipset (mine's a VIA) this will rival or beat fast SCSI disks in raw streaming bandwidth.

However, the majority of mail queues are not even bandwidth bound -- they're seek bound, which is where SCSI disks still beat IDE. The faster the seek time, the better (which is the motivation behind DJB's ingenious zeroseek proposal).

Also, RAID5 arrays (the most common one for large capacities) suffer a significant write penalty due to recalculation and rewriting of the parity, and the mail queue is mostly written (and subsequently cached). A RAID1+0 array works better, but uses more disks.

--
Bruce Guenter [EMAIL PROTECTED] http://em.ca/~bruceg/
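The RAID5 write penalty mentioned above comes from the parity update on small writes. The Python sketch below shows the textbook read-modify-write form of that update; it is a simplified model, not any particular controller's implementation, and the names `xor_blocks` and `raid5_small_write` are made up. Controllers with write-back cache, or writes that cover a full stripe, can avoid some of this cost.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Read-modify-write parity update for a single-block write.

    Returns (new_parity, disk_ios). Four I/Os in this model: read old data,
    read old parity, write new data, write new parity. A mirrored write in
    RAID 1+0 needs only the two writes.
    """
    new_parity = xor_blocks(xor_blocks(old_parity, old_data), new_data)
    return new_parity, 4


if __name__ == "__main__":
    stripe = [b"AAAA", b"BBBB", b"CCCC"]  # three data blocks on three disks
    parity = reduce(xor_blocks, stripe)   # parity block on a fourth disk
    new_block = b"ZZZZ"
    parity, ios = raid5_small_write(stripe[1], parity, new_block)
    stripe[1] = new_block
    print("parity still consistent:", parity == reduce(xor_blocks, stripe))
    print("disk I/Os for one small write:", ios)
```

For a queue workload made of many small fsynced writes, those extra reads and writes per block are what the posters are warning about.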
Re: questions about performance and setup
Steve Wolfe wrote:

With all of the emails I received, I get the impression that I'm going to be I/O bound instead of processor or memory bound. How much disk will be sufficient for the queue? 1GB? More?

It's not so much a matter of disk size (I don't think you'll have a 1 gig queue!),

You could quite easily get a 1 gig queue, even if you don't run into the obvious problem of temporary loss of network connectivity. Say you've got 200,000 subscribers and you generate your messages twice as fast as qmail can send them, then when you've finished generating the messages you've still got 100,000 in the queue. If the messages are 10KB each, that's 1GB.

(I can put 2GB of ram in the box)? Linux has support for making a disk in memory, putting a filesystem on it and mounting it. Wouldn't this take care of I/O problems?

That's about as good of I/O as you can get, I would imagine. ; ) As another author stated, the largest gain would be in writes, but that's where the largest expenditure is anyway. Just make dang, dang sure that your machine is NOT going to have any hiccups or lose power while the queue is full, or you'll instantly lose it all.

What if you put the 2GB of RAM in the box, but let Linux use it as a disk cache? I'm not sure how the disk caching under Linux works, but if you create a file and then delete it before it actually gets written to disk, is there any disk activity required? Sure, the disks will be thrashing away, trying to keep up, but would the I/O actually block if there was still room in the disk cache?

- Oliver.
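The queue-size estimate in this message can be written out as a short Python sketch. The function name `backlog_when_generation_ends` is hypothetical, the rates are in arbitrary messages-per-second units, and the model assumes both rates stay constant, which a real system will not do.

```python
def backlog_when_generation_ends(total_messages: int,
                                 generate_rate: float,
                                 send_rate: float) -> float:
    """Messages still queued at the moment the last message is generated."""
    if generate_rate <= send_rate:
        return 0.0  # the sender keeps up, so nothing accumulates in this model
    generation_time = total_messages / generate_rate
    already_sent = send_rate * generation_time
    return total_messages - already_sent


if __name__ == "__main__":
    # 200,000 subscribers, messages generated twice as fast as they are sent.
    queued = backlog_when_generation_ends(200_000, generate_rate=2.0, send_rate=1.0)
    queue_bytes = queued * 10_000  # assuming about 10,000 bytes per message
    print(f"messages left in queue: {queued:.0f}")
    print(f"approximate queue size: {queue_bytes / 1e9:.1f} GB")
```

A network outage during injection is the worse case: with the send rate at zero, the whole run ends up in the queue.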
Re: questions about performance and setup
We're in a similar situation at the moment. However, we want to send out 100,000 UNIQUE emails per day, expanding to 500,000 or more in the near future. Also, our send window is only actually a couple of hours. I'm trying to work out the best settings for the concurrencyremote and conf-split parameters. Our system is an HP Netserver 2000r PIII-667 RAID5 running Linux. Are there any problems in setting conf-split to a very large value? Is it necessary on a Linux system, assuming a queue size of, say, 100,000? Any information appreciated. - Oliver.
"Austad, Jay" wrote:
> Non-unique emails will most likely be generated by other machines and sent to the box running mini-qmail via smtp. Non-unique emails will be a small percentage of what gets sent out, for now. Jay
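For the send-window and conf-split question, a rough back-of-the-envelope sketch in Python may help frame it. The helper names are hypothetical, "a couple of hours" is assumed to mean 2 hours, the split values are just sample primes (23 is, as far as I know, the value qmail ships with), and an even spread of queue files across subdirectories is assumed.

```python
import math

def required_rate(messages, window_hours):
    """Average messages per second needed to clear `messages` in the window."""
    return messages / (window_hours * 3600)

def files_per_split_dir(queue_size, conf_split):
    """Approximate entries per queue subdirectory, assuming an even spread."""
    return math.ceil(queue_size / conf_split)

# Delivery rate needed for a 2-hour send window
for daily in (100_000, 500_000):
    print(daily, "messages:", round(required_rate(daily, 2), 1), "msgs/sec")

# Directory sizes for a 100,000-message queue at a few sample split values
for split in (23, 101, 1009):
    print("conf-split", split, "->", files_per_split_dir(100_000, split), "files per directory")
```

The point of the second loop is only that a larger split keeps each directory small; whether that matters depends on how the filesystem handles large directories.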
RE: questions about performance and setup
Hey Jay, I don't know much about setting that type of thing up in qmail, but I would like to give you some ideas on the hardware. I'm not sure how much load qmail would generate in a scenario like that, but you may want to consider Solaris x86 for the superior SMP performance. Also, you should know what you're getting into on the Dell boxes if you choose to run linux. I've got a Dell PE2400 dual that runs linux and you're going to be at the mercy of Dell and Adaptec on when you upgrade your kernel because they have some sorry proprietary drivers for their RAID controllers that are tailored to a specific kernel version and redhat sub-revision. If you can put up with that, then Redhat Linux/Qmail on a Dell runs very fast, I'm happy with mine. But at the same time, I'm sitting on a kernel with a known suid exploit hoping Dell will release newer drivers soon... It is much nicer running Linux on an older Dell server of mine that has an AMI MegaRaid card with drivers built into the kernel. Dave -Original Message- From: Austad, Jay To: '[EMAIL PROTECTED]' Sent: 7/14/00 2:18 PM Subject: questions about performance and setup I've been given the task of setting up our own "blaster" for sending out emails of our financial news and charts to our subscribers. We outsource this right now, and it's abysmally expensive. Basically, we want 3 boxes (or so) that run in parallel and blast out the emails, about 50 million per month, but the subscription rate is growing rapidly each month. It needs to handle bounced mail by dumping the addresses into a file for later retrieval so they can be removed from the database, or by running an external script for each bounced address. I'm looking at getting 3 dell dual PIII 750's, with a 18 or 36GB 1rpm disk, and 512M or 1G of mem each. Each will run Linux or BSD. Here's what I need to know: 1. How well does qmail take advantage of multiple processors? How much memory and disk will I need? 
(we're at 50 million messages per month now, and we only send out monday-friday, so that's over 2 million messages per day, and it's only going up) 2. How many messages per day would one estimate that each of these servers could do? 3. I read about mini-qmail and how it's about 100 times faster blasting out email to QMQP servers. Since you can specify multiple QMQP servers, if I have a fourth machine running mini-qmail and managing the actual mailing list, can I add the other 3 as QMQP servers and have it load balance between all 3 for sending out mail? (this way I could add more servers easily if I needed to) 4. Can I easily make qmail run an external script for each bounced mail? 5. Anything else I should know? Thanks. -- Jay Austad Network Administrator CBS Marketwatch 612.817.1271 [EMAIL PROTECTED] http://cbs.marketwatch.com http://www.bigcharts.com
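The "over 2 million messages per day" figure can be reproduced with a short calculation. This Python sketch assumes 22 sending days (weekdays) in a month and an even split across 3 servers running around the clock; those numbers are assumptions for illustration, not measurements.

```python
def daily_volume(monthly_messages, sending_days):
    """Average messages per sending day."""
    return monthly_messages / sending_days

per_day = daily_volume(50_000_000, sending_days=22)  # Monday-Friday only
per_server = per_day / 3                             # three outbound boxes
per_second = per_server / 86_400                     # spread over 24 hours

print(round(per_day), "per day")
print(round(per_server), "per server per day")
print(round(per_second, 1), "per server per second")
```

A shorter send window raises the per-second figure proportionally, so the window length is worth pinning down before sizing hardware.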
RE: questions about performance and setup
I might as well jump into this since I just built a RAID 5 system for a database. The machine I built contains a DPT SmartRAID V SCSI RAID 0/1/5 controller with 5 1RPM 9.1 gig drives. The thing I notice about RAID 5 in the right configuration is that you can throw tons of IO at it and you will see little decrease in performance. Our database server (ya, I know, it's not a MAIL SERVER) gets tons of IO and it's nothing to it; it just eats it up and continues on its way. I gotta say that you can't go wrong with this controller. It's an I2O controller and thus supported in FreeBSD and Linux. As Dave stated, you will get stuck with Dell and their proprietary drivers; this I would avoid like the plague.
-Original Message- From: Hubbard, David [mailto:[EMAIL PROTECTED]] Sent: Friday, July 14, 2000 11:48 AM To: '[EMAIL PROTECTED]' Subject: RE: questions about performance and setup
Hey Jay, I don't know much about setting that type of thing up in qmail, but I would like to give you some ideas on the hardware. I'm not sure how much load qmail would generate in a scenario like that, but you may want to consider Solaris x86 for the superior SMP performance. Also, you should know what you're getting into on the Dell boxes if you choose to run Linux. I've got a Dell PE2400 dual that runs Linux and you're going to be at the mercy of Dell and Adaptec on when you upgrade your kernel because they have some sorry proprietary drivers for their RAID controllers that are tailored to a specific kernel version and Redhat sub-revision. If you can put up with that, then Redhat Linux/Qmail on a Dell runs very fast, I'm happy with mine. But at the same time, I'm sitting on a kernel with a known suid exploit hoping Dell will release newer drivers soon... It is much nicer running Linux on an older Dell server of mine that has an AMI MegaRaid card with drivers built into the kernel. Dave
RE: questions about performance and setup
I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes with no trouble; some of them took work to get going, but it runs well. I have a few Crystal PC's here also that I may use instead, dual PIII 550's with 512MB ram and 9 or 18GB 1rpm drives. I'll probably use these for testing. The bulk of the messages will be the same content to many rcpt's. However, once in a while we'll have 100,000 different messages go out to 100,000 different people. Since the QMQP support under mini-qmail doesn't load balance, can I feed it a hostname with multiple dns entries (round-robin dns)? Or better yet, how easy would it be to modify the qmail code to just load balance between them? Jay
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Friday, July 14, 2000 2:09 PM To: '[EMAIL PROTECTED]' Subject: Re: questions about performance and setup
> Here's what I need to know: 1. How well does qmail take advantage of multiple processors? How much
Indirectly, quite well, as it forks many processes; thus if the OS takes good advantage of your CPUs, then qmail inherits that advantage.
> memory and disk will I need? (we're at 50 million messages per month now,
Are these messages unique per target address or the same? If unique, your requirements are vastly different and very queue/disk intensive. If they are the same and you take advantage of VERP support in qmail, then your load will mainly be sending-related, which will benefit from more memory, multiple instances, etc.
> and we only send out monday-friday, so that's over 2 million messages per day, and it's only going up) 2. How many messages per day would one estimate that each of these servers could do? 3. I read about mini-qmail and how it's about 100 times faster blasting out email to QMQP servers. Since you can specify multiple QMQP servers, if I have a fourth machine running mini-qmail and managing the actual mailing list, can I add the other 3 as QMQP servers and have it load balance between all 3 for sending out mail? (this way I could add more servers easily if I needed to)
The qmqp support doesn't load balance. It simply takes the first one it can connect to.
> 4. Can I easily make qmail run an external script for each bounced mail?
Absolutely.
> 5. Anything else I should know?
That all hinges on whether your emails are unique for each recipient or not. Or more importantly, the average number of recipients per unique email. Regards.
Re: questions about performance and setup
On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote:
> I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes with no trouble, some of them took work to get going, but it runs well. I have a few Crystal PC's here also that I may use instead, dual PIII 550's with 512MB ram and 9 or 18GB 1rpm drives. I'll probably use these for testing.
I agree with the earlier poster that more spindles for your queue (c/- raid) is a good thing in general.
> The bulk of the messages will be the same content to many rcpt's. However, once in awhile we'll have 100,000 different messages go out to 100,000 different people. Since the QMQP support under mini-qmail doesn't load balance, can I feed it a hostname with multiple dns entries (round-robin dns)? Or better yet, how easy would it be to modify the qmail code to just load balance between them?
The manpage for qmail-qmqpc tells us that they have to be IP addresses in qmqpservers, so a RR DNS won't help. If all of the messages are generated on one machine, then I'd be inclined to go for a much simpler solution than modifying qmail. I'd have an instance of qmail for each outbound server with the appropriate qmqpservers entry, then have your queue insertion script do a round-robin itself by simply cycling thru the qmail-inject command associated with each instance.
    for instance in 1 2 3 4 5
    do
        getnext_message_details
        /var/qmail${instance}/bin/qmail-inject currentmessage details
    done
Or some such. Alternatively, if you have money to burn, maybe a layer four switch with load-balancing skills. Mark.
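The "cycle through each instance's qmail-inject" idea can also be sketched in Python. This is a minimal illustration, not a tested mailer: the /var/qmailN layout is taken from the thread, the function names are hypothetical, and it assumes qmail-inject reads the message on standard input.

```python
import itertools
import subprocess

def inject_commands(instances):
    """Return an endless iterator over each instance's qmail-inject path."""
    paths = [f"/var/qmail{n}/bin/qmail-inject" for n in instances]
    return itertools.cycle(paths)

def inject(command, message_bytes):
    """Pipe one message to the given qmail-inject binary (not called below)."""
    subprocess.run([command], input=message_bytes, check=True)

# Show the order in which instances would be used for the first four messages
commands = inject_commands([1, 2, 3])
first_four = [next(commands) for _ in range(4)]
print(first_four)
```

In a real injection loop each generated message would be passed to `inject(next(commands), message)`, so consecutive messages land on consecutive instances.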
RE: questions about performance and setup
Where would I start in the code to modify the QMQP servers list so that it would load balance between all of the servers in the list instead of just using the first one it can contact? This would be very useful to me. I assume qmail-qmqpc.c is one of them; are there others I would need to play around with? Jay
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Friday, July 14, 2000 3:55 PM To: '[EMAIL PROTECTED]' Subject: Re: questions about performance and setup
On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote:
> I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes with no trouble, some of them took work to get going, but it runs well. I have a few Crystal PC's here also that I may use instead, dual PIII 550's with 512MB ram and 9 or 18GB 1rpm drives. I'll probably use these for testing.
I agree with the earlier poster that more spindles for your queue (c/- raid) is a good thing in general.
> The bulk of the messages will be the same content to many rcpt's. However, once in awhile we'll have 100,000 different messages go out to 100,000 different people. Since the QMQP support under mini-qmail doesn't load balance, can I feed it a hostname with multiple dns entries (round-robin dns)? Or better yet, how easy would it be to modify the qmail code to just load balance between them?
The manpage for qmail-qmqpc tells us that they have to be IP addresses in qmqpservers, so a RR DNS won't help. If all of the messages are generated on one machine, then I'd be inclined to go for a much simpler solution than modifying qmail. I'd have an instance of qmail for each outbound server with the appropriate qmqpservers entry, then have your queue insertion script do a round-robin itself by simply cycling thru the qmail-inject command associated with each instance.
    for instance in 1 2 3 4 5
    do
        getnext_message_details
        /var/qmail${instance}/bin/qmail-inject currentmessage details
    done
Or some such. Alternatively, if you have money to burn, maybe a layer four switch with load-balancing skills. Mark.
Re: questions about performance and setup
Line 153 of qmail-qmqpc.c is a good place to start. It's a trivial loop that would benefit from something like adjusting the starting point by some random value. Eg:
    randj = rand() % servers.len;
    i = 0;
    for (j = randj;j < servers.len;++j)
      if (!servers.s[j]) {
        doit(servers.s + i);
        i = j + 1;
      }
Then repeat the loop from zero to randj - 1
    i = 0;
    for (j = 0;j < randj;++j)
      ...
Mark.
On Fri, Jul 14, 2000 at 05:38:44PM -0500, Austad, Jay wrote:
> Where would I start in the code to modify the QMQP servers list so that it would load balance between all of the servers in the list instead of just using the first one it can contact? This would be very useful to me. I assume qmail-qmqpc.c is one of them, are there others I would need to play around with? Jay
Re: questions about performance and setup
Jay, That's the beauty of having multiple instances: not having to patch qmail. All you need to do is install qmail once per machine (i.e., /var/qmail1, /var/qmail2,...). Then have the script that does the mailing randomly call one of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching. JES
Austad, Jay writes:
> Where would I start in the code to modify the QMQP servers list so that it would load balance between all of the servers in the list instead of just using the first one it can contact? This would be very useful to me. I assume qmail-qmqpc.c is one of them, are there others I would need to play around with? Jay
Re: questions about performance and setup
Of course there is at least one bug in here, but you get the idea. Mark.
On Fri, Jul 14, 2000 at 04:00:43PM -0700, [EMAIL PROTECTED] wrote:
> Line 153 of qmail-qmqpc.c is a good place to start. It's a trivial loop that would benefit from something like adjusting the starting point by some random value. Eg:
>     randj = rand() % servers.len;
>     i = 0;
>     for (j = randj;j < servers.len;++j)
>       if (!servers.s[j]) {
>         doit(servers.s + i);
>         i = j + 1;
>       }
> Then repeat the loop from zero to randj - 1
>     i = 0;
>     for (j = 0;j < randj;++j)
>       ...
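The random-starting-point idea is easier to see on a plain list of addresses. The Python sketch below is not the qmail code: it assumes the server list is already split into separate entries, picks the start by entry rather than by byte offset (an assumption about what was intended), and uses made-up names and addresses throughout.

```python
import random

def rotated_servers(servers, start=None):
    """Return the servers in order, beginning at index `start` and wrapping.

    If `start` is None a random starting index is chosen.
    """
    if not servers:
        return []
    if start is None:
        start = random.randrange(len(servers))
    return servers[start:] + servers[:start]

def first_reachable(servers, try_connect, start=None):
    """Return the first server for which try_connect(server) is true, else None."""
    for server in rotated_servers(servers, start):
        if try_connect(server):
            return server
    return None

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical addresses
order = rotated_servers(servers, start=1)
# Pretend 10.0.0.2 is down, so the next server in rotation is picked
chosen = first_reachable(servers, lambda s: s != "10.0.0.2", start=1)
print(order, chosen)
```

Keeping the "first one that answers" behaviour but varying where the scan starts spreads load on average while still falling through to the next server when one is down.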
RE: questions about performance and setup
> Then have the script that does the mailing randomly call one of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching.
Won't this way be a performance hit though? I admit, it is an easy solution and would work excellently, but I have to think about efficiency also. C code is much faster than shell or perl, and I'd like to set it up once and not have to ever worry about it again, or at least for a long, long time. As I said, we're doing 50 million emails a month right now, but this is increasing substantially each month, and as we roll out new subscription services, we'll have even more load. Sending 10 times this amount by the same time next year is a good possibility, possibly sooner as we seem to underestimate the rate at which we're growing much of the time... Jay
-Original Message- From: JuanE [mailto:[EMAIL PROTECTED]] Sent: Friday, July 14, 2000 5:55 PM To: '[EMAIL PROTECTED]' Subject: Re: questions about performance and setup
Jay, That's the beauty of having multiple instances: not having to patch qmail. All you need to do is install qmail once per machine (i.e., /var/qmail1, /var/qmail2,...). Then have the script that does the mailing randomly call one of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching. JES
Re: questions about performance and setup
On Fri, Jul 14, 2000 at 12:21:57PM -0700, Jason Murphy wrote:

The machine I built contains a DPT SmartRAID V SCSI RAID 0/1/5 controller with 5 1RPM 9.1 gig drives. The thing I notice about RAID 5 in the right configuration is that you can throw tons of IO at it and you will see little decrease in performance. Our database server (ya, I know, it's not a mail server) gets tons of IO and it's nothing to it; it just eats it up and continues on its way.

A massive mail injection, especially if the content is unique to the user, can overwhelm a disk subsystem. This is recommending the exact -wrong- kind of disk system. RAID 5 has a write penalty, as it has to calculate parity for each write and write to multiple spindles. The best type of RAID for small block writes is RAID 10, or RAID 1+0 (not to be confused with RAID 0+1). Even better is to use a disk system with write-back cache. Ideally, you need at least seven spindles. I've seen great things with the Infortrend controller. A great setup would be 1U PCs connected to an external RAID.

John
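To put rough numbers on the write penalty described above, here is a minimal sketch using the usual rule of thumb: a small random write costs four back-end I/Os on RAID 5 (read data, read parity, write data, write parity) and two on RAID 10 (one per mirror side). The function names and the 100 IOPS per spindle figure in the usage note are assumptions for illustration, not measurements from this thread, and the model ignores write-back cache and full-stripe writes.

```shell
#!/bin/sh
# Rough small-write throughput model for a disk array.
# Arguments: number of spindles, random IOPS per spindle (assumed figure),
# back-end I/Os needed per logical write.
array_write_iops() {
    echo $(( $1 * $2 / $3 ))
}

# RAID 5: read data, read parity, write data, write parity = 4 I/Os.
raid5_write_iops() {
    array_write_iops "$1" "$2" 4
}

# RAID 10: one write to each side of the mirror = 2 I/Os.
raid10_write_iops() {
    array_write_iops "$1" "$2" 2
}
```

For example, `raid5_write_iops 5 100` gives 125 while `raid10_write_iops 6 100` gives 300, which is the gap the RAID 10 recommendation is pointing at.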
RE: questions about performance and setup
Non-unique emails will most likely be generated by other machines and sent to the box running mini-qmail via SMTP. Non-unique emails will be a small percentage of what gets sent out, for now.

Jay

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Saturday, July 15, 2000 12:10 AM
To: '[EMAIL PROTECTED]'
Subject: Re: questions about performance and setup

On Fri, Jul 14, 2000 at 07:01:46PM -0500, Austad, Jay wrote:

Then have the script that does the mailing call randomly one of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching.

Won't this way be a performance hit though? I admit, it is an easy solution

No. My experience is that the cost of running a script to inject the mail in a way similar to that mentioned above is pretty small compared to the queue injection cost and the delivery cost. sh or perl will be fine.

and would work excellent, but I have to think about efficiency also. C code is much faster than shell or perl, and I'd like to set it up once and not have to ever worry about it again, or at least for a long, long time. As I said, we're doing 50 million emails a month right now, but this is increasing substantially each month, and as we roll out new subscription services, we'll have even more load. Sending 10 times this amount by the same time next year is a good possibility, possibly sooner as we seem to underestimate the rate at which we're growing much of the time...

You may also need to look at the scalability of the generation of the emails. One system I recently looked at claimed to be able to generate nicely unique emails at a targeted database, but it burned CPU like it was free - just in generating the content.

Mark.

Jay

-Original Message-
From: JuanE [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 14, 2000 5:55 PM
To: '[EMAIL PROTECTED]'
Subject: Re: questions about performance and setup

Jay,

That's the beauty of having multiple instances, not having to patch qmail.
All you need to do is install qmail once per machine (ie, /var/qmail1, /var/qmail2,...). Then have the script that does the mailing call randomly one of the /var/qmail#/bin/qmail-inject. This will emulate round robin without any patching.

JES

Austad, Jay writes:

Where would I start in the code to modify the QMQP servers list so that it would load balance between all of the servers in the list instead of just using the first one it can contact? This would be very useful to me. I assume qmail-qmqpc.c is one of them, are there others I would need to play around with?

Jay

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 14, 2000 3:55 PM
To: '[EMAIL PROTECTED]'
Subject: Re: questions about performance and setup

On Fri, Jul 14, 2000 at 02:29:06PM -0500, Austad, Jay wrote:

I already have Mandrake Linux 7.0 and 7.1 running on multiple Dell boxes with no trouble, some of them took work to get going, but it runs well. I have a few Crystal PC's here also that I may use instead, dual PIII 550's with 512MB ram and 9 or 18GB 1rpm drives. I'll probably use these for testing.

I agree with the earlier poster that more spindles for your queue (c/- raid) is a good thing in general.

The bulk of the messages will be the same content to many rcpt's. However, once in a while we'll have 100,000 different messages go out to 100,000 different people. Since the QMQP support under mini-qmail doesn't load balance, can I feed it a hostname with multiple dns entries (round-robin dns)? Or better yet, how easy would it be to modify the qmail code to just load balance between them?

The manpage for qmail-qmqpc tells us that they have to be IP addresses in qmqpservers so a RR DNS won't help. If all of the messages are generated on one machine, then I'd be inclined to go for a much simpler solution than modifying qmail.
I'd have an instance of qmail for each outbound server with the appropriate qmqpservers entry, then have your queue insertion script do a round-robin itself by simply cycling thru the qmail-inject command associated with each instance.

    for instance in 1 2 3 4 5
    do
        getnext_message_details()
        /var/qmail${instance}/bin/qmail-inject currentmessage details
    done

Or some such. Alternatively, if you have money to burn, maybe a layer four switch with load-balancing skills.

Mark.

Jay

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Friday, July 14, 2000 2:09 PM
To: '[EMAIL PROTECTED]'
Subject: Re: questions about performance and setup

Here's what I need to know:

1. How well does qmail take advantage of multiple processors? How much

Indirectly, quite well as it forks many processes, thus if the OS takes good
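The round-robin injection idea in this message can be sketched a little more concretely. This is only an illustration under stated assumptions: the /var/qmailN layout is the one suggested in the thread, while QMAIL_PREFIX, INSTANCES, inject_path and inject_all are hypothetical names, and each message file is assumed to be a complete message whose recipients are in its headers (qmail-inject reads the message from standard input).

```shell
#!/bin/sh
# Assumed layout: one qmail install per outbound server,
# /var/qmail1 ... /var/qmailN, each with its own control/qmqpservers.
QMAIL_PREFIX="${QMAIL_PREFIX:-/var/qmail}"
INSTANCES="${INSTANCES:-3}"

# Print the qmail-inject path for message number $1 (0-based),
# cycling through instances 1..INSTANCES.
inject_path() {
    printf '%s%d/bin/qmail-inject\n' "$QMAIL_PREFIX" $(( $1 % INSTANCES + 1 ))
}

# Hand each message file given as an argument to the next instance in turn.
inject_all() {
    n=0
    for msg in "$@"; do
        "$(inject_path "$n")" < "$msg" || echo "inject failed: $msg" >&2
        n=$(( n + 1 ))
    done
}
```

With the default of three instances, `inject_all msg1 msg2 msg3 msg4` would send the first and fourth files through /var/qmail1 and the others through /var/qmail2 and /var/qmail3. A strict cycle is used here rather than a random pick; either spreads the load without patching qmail.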
Re: questions about performance and setup
Here's what I need to know:

1. How well does qmail take advantage of multiple processors? How much

Indirectly, quite well, as it forks many processes; thus if the OS takes good advantage of your CPUs, then qmail inherits that advantage.

memory and disk will I need? (we're at 50 million messages per month now,

Are these messages unique per target address or the same? If unique, your requirements are vastly different and very queue/disk intensive. If they are the same and you take advantage of VERP support on qmail, then your load will mainly be sending-related, which will benefit from more memory, multiple instances, etc.

and we only send out monday-friday, so that's over 2 million messages per day, and it's only going up)

2. How many messages per day would one estimate that each of these servers could do?

3. I read about mini-qmail and how it's about 100 times faster blasting out email to QMQP servers. Since you can specify multiple QMQP servers, if I have a fourth machine running mini-qmail and managing the actual mailing list, can I add the other 3 as QMQP servers and have it load balance between all 3 for sending out mail? (this way I could add more servers easily if I needed to)

The qmqp support doesn't load balance. It simply takes the first one it can connect to.

4. Can I easily make qmail run an external script for each bounced mail?

Absolutely.

5. Anything else I should know?

That all hinges on whether your emails are unique for each recipient or not. Or more importantly, the average number of recipients per unique email.

Regards.
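The volume figures quoted above can be turned into a sustained rate with a little integer arithmetic. This is a rough sketch only: the 22 sending days per month and the sending window in hours are assumptions, and per_day and per_second are hypothetical helper names.

```shell
#!/bin/sh
# Messages per sending day.
# Arguments: messages per month, sending days per month (assumed).
per_day() {
    echo $(( $1 / $2 ))
}

# Sustained messages per second.
# Arguments: messages per month, sending days per month,
# hours per day actually spent sending (assumed window).
per_second() {
    echo $(( $1 / $2 / ($3 * 3600) ))
}
```

`per_day 50000000 22` gives 2272727, consistent with the "over 2 million messages per day" estimate. Spread over a full 24 hours that is about 26 messages per second (`per_second 50000000 22 24`), or about 78 per second if everything has to go out in an 8-hour window.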