Re: High volume mail handling architecture
On Fri, Oct 08, 2004 at 10:07:15AM +0100, John Hedges wrote:
> On Fri, Sep 10, 2004 at 05:34:07PM +, Gerrit Pape wrote:
> > On Fri, Sep 10, 2004 at 09:49:27AM +0200, Adrian 'Dagurashibanipal' von Bidder wrote:
> > > Here is what happens: A spammer frequently uses my email address as the sender address in spam.
> > >
> > > So, that's my plea to everybody with big mail installations: make your frontend machines aware of what mail they are supposed to accept, so that you never need to bounce. (Ok, some cases will still bounce: disk full, procmail script errors etc., but these are a very small proportion.) And the other plea is, of course, get rid of qmail and other products which accept all mail by default.
> >
> > As far as my experience goes, pleas or complaints against other people don't help much if you want to see something changed. Better help yourself.
> >
> > I suggest instructing your mail user agent to make use of the (apparently almost forgotten) fact that the sender's addresses in the envelope and in the header can be different. Most of today's mail transfer agents should support address extensions. If your address is used as envelope sender in unsolicited mail, it's your public one. Use a non-public address as envelope sender of mail you send, and simply change it in case it gets abused; only bouncers should send mail to this address, and they usually do within two weeks. Now you can configure your MTA to outright reject delivery notifications solely based on the information in the envelope.
>
> Sorry to reopen such an old thread. I'd saved this mail for reference as I too get a lot of bounces for spam with forged mail headers. A fresh run of spam that some wanker has initiated in my name has made my inbox unbearable these last few days, so I need to do something about it.

It's not because of forged headers but forged envelopes: your address is used as envelope sender in SMTP (MAIL FROM:[EMAIL PROTECTED]). The envelope sender address is where delivery notifications, such as bounces, are sent, and those delivery notifications usually have an empty envelope sender (MAIL FROM:<>). Replies and followups are sent to an address specified in the headers of the mail (From: John [EMAIL PROTECTED], or Reply-To:), and have a non-empty envelope sender.

If John starts to send all his mails with the envelope sender address [EMAIL PROTECTED] and still uses the same headers, communication with the recipients will not change, but delivery notifications will go to [EMAIL PROTECTED]. His public, well-known, unchangeable mail address [EMAIL PROTECTED] now no longer receives delivery notifications for mail John himself has sent; he can now safely reject or disregard mails sent with an empty envelope sender to the envelope recipient [EMAIL PROTECTED], solely based on the envelope information, without looking at the headers or the body of the mail.

> Would it be possible for you, Gerrit, to expand a little on your setup?

My personal setup is done with the qconfirm package. Specifically, I send mails through the qconfirm-inject program, which adjusts the envelope sender and so requests bounces to go to a different address than my public one. This is qmail specific, and you need shell access to the mail server.

> I currently use fetchmail to get my mail from a catchall mailbox at my ISP. Can I use the envelope sender trick in this case as I can't see an easy way to differentiate between bounces and normal email once the messages reach my box?

Most of the bounces are sent with an empty envelope sender (<>). I'm not sure whether fetchmail preserves the envelope information; it might get lost. Look for Return-Path:. Although it might work with your setup, I think sorting out the bounces is better done on the mail exchanger.

Regards, Gerrit.
-- 
Open projects at http://smarden.org/pape/.

-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
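The envelope-only rule described in this message could be sketched roughly as follows. This is an illustration, not qconfirm and not any MTA's real configuration: the two addresses and the function name are made up, and it assumes the MTA passes the raw MAIL FROM and RCPT TO values to the policy.

```python
# Hypothetical addresses, for illustration only.
PUBLIC_ADDRESS = "john@example.org"               # used in From:/Reply-To: headers
BOUNCE_ADDRESS = "john-bounce-2004a@example.org"  # used only as envelope sender


def rcpt_decision(mail_from, rcpt_to):
    """Return an SMTP reply (code, text) for one RCPT TO command.

    Only envelope data is inspected; headers and body are never read.
    """
    # An empty envelope sender marks a delivery notification (bounce).
    is_bounce = mail_from.strip() in ("", "<>")
    rcpt = rcpt_to.strip().strip("<>").lower()
    if is_bounce and rcpt == PUBLIC_ADDRESS:
        # Mail sent by the owner never carries the public address as
        # envelope sender, so a bounce to it answers a forged envelope.
        return 553, "This address cannot get bounces"
    return 250, "Recipient accepted"


if __name__ == "__main__":
    print(rcpt_decision("<>", "<john@example.org>"))
    print(rcpt_decision("<>", "<" + BOUNCE_ADDRESS + ">"))
    print(rcpt_decision("<alice@example.net>", "<john@example.org>"))
```

Ordinary mail to the public address and bounces to the private envelope address are both still accepted; only the combination of empty sender and public recipient is refused.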
Re: High volume mail handling architecture
On Fri, Sep 10, 2004 at 05:34:07PM +, Gerrit Pape wrote:
> On Fri, Sep 10, 2004 at 09:49:27AM +0200, Adrian 'Dagurashibanipal' von Bidder wrote:
> > Here is what happens: A spammer frequently uses my email address as the sender address in spam.
> >
> > So, I sometimes suddenly have 2000 new mails in my inbox :-(
> >
> > So, that's my plea to everybody with big mail installations: make your frontend machines aware of what mail they are supposed to accept, so that you never need to bounce. (Ok, some cases will still bounce: disk full, procmail script errors etc., but these are a very small proportion.) And the other plea is, of course, get rid of qmail and other products which accept all mail by default.
>
> As far as my experience goes, pleas or complaints against other people don't help much if you want to see something changed. Better help yourself.
>
> I suggest instructing your mail user agent to make use of the (apparently almost forgotten) fact that the sender's addresses in the envelope and in the header can be different. Most of today's mail transfer agents should support address extensions. If your address is used as envelope sender in unsolicited mail, it's your public one. Use a non-public address as envelope sender of mail you send, and simply change it in case it gets abused; only bouncers should send mail to this address, and they usually do within two weeks. Now you can configure your MTA to outright reject delivery notifications solely based on the information in the envelope.
>
>   $ mconnect a.mx.smarden.org
>   220 smarden.org ESMTP
>   mail from:<>
>   250 Sender accepted.
>   rcpt to:[EMAIL PROTECTED]
>   553 This address cannot get bounces, either you are not bouncing to the envelope sender, or the envelope of the mail you bounce is forged.
>   quit
>   221 Good bye.
>   $
>
> I have been doing this for about ten months now, and don't see most of the unwanted delivery notifications, including delivery confirmation requests for unsolicited mail with forged envelope.

Sorry to reopen such an old thread. I'd saved this mail for reference as I too get a lot of bounces for spam with forged mail headers. A fresh run of spam that some wanker has initiated in my name has made my inbox unbearable these last few days, so I need to do something about it.

Would it be possible for you, Gerrit, to expand a little on your setup? I currently use fetchmail to get my mail from a catchall mailbox at my ISP. Can I use the envelope sender trick in this case as I can't see an easy way to differentiate between bounces and normal email once the messages reach my box?

Cheers
John
Re: High volume mail handling architecture
On Thursday 09 September 2004 01.33, Ruth A. Kramer wrote:
> Adrian 'Dagurashibanipal' von Bidder wrote:
> > On behalf of all joe-job victims: Whatever you do, *please* do it in a way that allows you to know whether mail is going to be delivered at the front-end incoming SMTP server. (should be trivial if your user database is in LDAP or some SQL db or whatever.)
>
> On behalf of the lurkers here who are not experienced admins (am I the only one?), could someone elaborate a little more on the above?

Your guess is mostly correct.

Here is what happens: A spammer frequently uses my email address as the sender address in spam. Now this would be a minor annoyance alone because my name is connected with spamming.

Now, much of the spam the spammer sends out is for invalid email addresses (like [EMAIL PROTECTED] and the like, and addresses that don't exist anymore, or addresses that are really message-IDs etc. etc). If the domain part of the address does not exist, that's no problem - the mail sending software of the spammer won't find a mail server to send the mail to. But if the spammer can get the message to a mail server, two things can happen:

(i) the recipient mail server behaves properly and rejects the mail right in the SMTP transaction (with 550 User unknown or whatever). Because the spammer's software is no proper mailserver, it doesn't handle this like a mailserver and instead just discards the message.

(ii) if the recipient mailserver is configured to accept all mail (because it's qmail, or MS Exchange, or because it's a front-end mailserver which doesn't know which users exist, for example a backup MX), I'm in trouble, because that mailserver will see that the mail cannot be delivered, and so it generates a bounce to whatever address is in the envelope sender of the spam.

So, I sometimes suddenly have 2000 new mails in my inbox :-( (Actually, in my _bounces folder, and so it doesn't bother me that much, and since I've disabled spamassassin for bounces, the server load doesn't go through the roof anymore, either. But still, there's the chance that I miss a real bounce in the flood.)

So, that's my plea to everybody with big mail installations: make your frontend machines aware of what mail they are supposed to accept, so that you never need to bounce. (Ok, some cases will still bounce: disk full, procmail script errors etc., but these are a very small proportion.) And the other plea is, of course, get rid of qmail and other products which accept all mail by default.

(And, lately, a noticeable proportion of such spam 'bounces' have been by systems like TMDA and cousins. I take a certain sadistic pleasure in confirming these mails whenever I have the time. Sorry, folks.)

So long
-- vbi

-- 
Protect your privacy - encrypt your email: http://fortytwo.ch/gpg/intro
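The difference between cases (i) and (ii) can be sketched as below. This is a toy model under stated assumptions, not the behaviour of any particular MTA: the user set stands in for the LDAP or SQL lookup mentioned earlier, and all names and addresses are hypothetical.

```python
# Stand-in for an LDAP/SQL user lookup; addresses are hypothetical.
KNOWN_USERS = {"alice@example.org", "bob@example.org"}


def front_end_rcpt(rcpt_to, known_users=KNOWN_USERS):
    """Reply to RCPT TO on a front-end server that knows its users."""
    address = rcpt_to.strip().strip("<>").lower()
    if address in known_users:
        return 250, "OK"
    # Case (i): reject inside the SMTP transaction. The sending client
    # stays responsible for the message, so we never create a bounce.
    return 550, "User unknown"


def bounces_generated(envelope, known_users=KNOWN_USERS, validate_at_rcpt=True):
    """Count the bounces our side would send for one envelope.

    envelope is (mail_from, [rcpt_to, ...]).
    """
    mail_from, recipients = envelope
    bounces = 0
    for rcpt in recipients:
        code, _ = front_end_rcpt(rcpt, known_users)
        if code == 250:
            continue
        # Case (ii): an accept-all server only finds out later that the
        # recipient does not exist and bounces to the envelope sender,
        # which in a joe job is the forged victim address.
        if not validate_at_rcpt and mail_from not in ("", "<>"):
            bounces += 1
    return bounces


if __name__ == "__main__":
    spam = ("<victim@example.net>",
            ["<nobody@example.org>", "<alice@example.org>", "<gone@example.org>"])
    print(bounces_generated(spam, validate_at_rcpt=True))
    print(bounces_generated(spam, validate_at_rcpt=False))
```

With validation at RCPT time the victim receives nothing from this server; with accept-then-bounce the victim receives one bounce per invalid recipient.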
Re: High volume mail handling architecture
On Fri, Sep 10, 2004 at 09:49:27AM +0200, Adrian 'Dagurashibanipal' von Bidder wrote:
> Here is what happens: A spammer frequently uses my email address as the sender address in spam.
>
> So, I sometimes suddenly have 2000 new mails in my inbox :-(
>
> So, that's my plea to everybody with big mail installations: make your frontend machines aware of what mail they are supposed to accept, so that you never need to bounce. (Ok, some cases will still bounce: disk full, procmail script errors etc., but these are a very small proportion.) And the other plea is, of course, get rid of qmail and other products which accept all mail by default.

As far as my experience goes, pleas or complaints against other people don't help much if you want to see something changed. Better help yourself.

I suggest instructing your mail user agent to make use of the (apparently almost forgotten) fact that the sender's addresses in the envelope and in the header can be different. Most of today's mail transfer agents should support address extensions. If your address is used as envelope sender in unsolicited mail, it's your public one. Use a non-public address as envelope sender of mail you send, and simply change it in case it gets abused; only bouncers should send mail to this address, and they usually do within two weeks. Now you can configure your MTA to outright reject delivery notifications solely based on the information in the envelope.

  $ mconnect a.mx.smarden.org
  220 smarden.org ESMTP
  mail from:<>
  250 Sender accepted.
  rcpt to:[EMAIL PROTECTED]
  553 This address cannot get bounces, either you are not bouncing to the envelope sender, or the envelope of the mail you bounce is forged.
  quit
  221 Good bye.
  $

I have been doing this for about ten months now, and don't see most of the unwanted delivery notifications, including delivery confirmation requests for unsolicited mail with forged envelope.

Regards, Gerrit.
-- 
Open projects at http://smarden.org/pape/.
Re: High volume mail handling architecture
Marcin Owsiany wrote:
> Well, adding more disks to the setup is what I planned to do next. I just want to make sure that the performance I get from the _current_ setup is normal.

Oh okay, sorry. Thought you were looking for a performance increase.

Nate
Re: High volume mail handling architecture
On Fri, Sep 10, 2004 at 09:07:37PM +0200, Jonathan G - Mailing Lists wrote:
> Sorry, what's your MTA?

Mine? On that particular machine it is qmail that does the deliveries (or rather, what is left of qmail after all the patching I've done).

Marcin
-- 
Marcin Owsiany [EMAIL PROTECTED] http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216 FE67 DA2D 0ACA FC5E 3F75 D6F6 3A0D 8AA0 60F4 1216
Re: High volume mail handling architecture
:) hehehe, OK, I was just asking because some MTAs fit better in some environments than others. I like the features of QMail, Postfix and Exim, but I hate others for an ISP environment.

jonathan

Marcin Owsiany wrote:
> On Fri, Sep 10, 2004 at 09:07:37PM +0200, Jonathan G - Mailing Lists wrote:
> > Sorry, what's your MTA?
>
> Mine? On that particular machine it is qmail that does the deliveries (or rather, what is left of qmail after all the patching I've done).
>
> Marcin

-- 
Jonathan Gonzalez Fernandez
 (o   mail  : [EMAIL PROTECTED]
 //\  jabber: [EMAIL PROTECTED]
 V_/  site  : www.surestorm.com
::: Registered Linux User #86 :::
Re: High volume mail handling architecture
RAM is not always the answer with 32-bit machines. Too much RAM can force the kernel to use bounce buffers. The sweet spot for Linux on a 32-bit platform seems to be 4GB of RAM. I had 10GB of RAM in a Courier IMAP server and the server had problems releasing swap after a week. The kernel was compiled for 64GB of RAM. When I reduced the RAM to 4GB and recompiled for a 4GB machine these problems disappeared.

On 10/09/04 03:40 +1000, Russell Coker wrote:
> On Thu, 9 Sep 2004 18:44, Marcin Owsiany [EMAIL PROTECTED] wrote:
> > On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
> > > You have to either be doing something very intensive or very wrong to need more than one server for 20K users. Last time I did this I got 250K users per server, and I believe that I could have easily doubled that if I was allowed to choose the hardware.
> >
> > We have a little over 10K users, and the disk subsystem seems to be the bottleneck. When we reach about 600 read transactions + 150 write transactions per second (as reported by sar -b), the load average starts to grow exponentially instead of proportionally. There are about 20K sectors read, and 3K written per second. (That was before I turned noatime on. After that we had about 2K sector writes and 70 write transactions less, and load average dropped to a more sane value - about 3, instead of 20.)
>
> Last time I was doing this I had some Dell 2U servers (2650 from memory) with 4 * 10K U160 disks in a RAID-5 (5th disk was hot-spare) and something like 4G of RAM. The machines had almost no read access to the drives, something less than 10% of disk access was for read because the cache worked really well (the accounts that receive the most mail are the ones that have clients checking them most often - in some cases people leave their email client on 24*7 checking every 5 mins). The write bottleneck was just under 3M/s, I don't recall how many transactions that was.
>
> To give better performance you may want to look at getting more RAM. RAM is cheap and you can eliminate most read bottlenecks by caching lots of stuff. 3K sectors written per second isn't too good, but I guess that's because of the 20K sectors read. Get some more cache and things should improve a lot.
>
> Also if using a typical Unix mail server (Postfix, Sendmail, etc) then the data is written synchronously somewhere under /var before being read from there and written to the destination. If you use a NVRAM card from UMEM http://www.umem.com/ for /var/spool then you could possibly double mail delivery performance. If you use data=journal and put the journal for the mail store file system on the umem device you could probably double performance again.
>
> > Also, did you implement virus/spam scanning on that box?
>
> No! Virus/spam scanning was on the front-end machines. It was believed that the mail store machines were busy enough with doing the most basic work without virus scanning (also the number of licences for the anti-virus program didn't match the number of store machines that were planned).
>
> You want to do as much work away from the mail store as possible. Mail store machines can not be replaced without major inconvenience to everyone (customers, staff, management). Front-end anti-virus machines are disposable: if you have a traffic balancing device (such as a Cisco LocalDirector or IPVS) in front of a cluster of anti-virus machines then an anti-virus machine can go down for a few days without anyone bothering. If (hypothetically) anti-virus was to take 10% of the performance from a mail store then it could require another mail store machine (if you have 5-10 machines) and therefore that's one more machine which can break and cause massive pain to everyone.
>
> Another thing, a mail store machine should require almost no CPU power. Give it a single CPU that's not the fastest available. It sucks when you have two almost unused CPUs which are both fast and hot and then one breaks down killing the machine.
>
> -- 
> http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
> http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
> http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
> http://www.coker.com.au/~russell/  My home page

-- 
Ted Knab
Chester, Maryland 21619 USA

[In War] Conquest is easy. Control is not. -- Kirk, Mirror, Mirror, stardate unknown
Re: High volume mail handling architecture
On Tuesday 07 September 2004 14.38, Maykel Moya wrote:
> I'm looking for documentation which helps me to design a failover, redundant and scalable mail system to handle 20K users with plans to scale soon to about 50K.

20k or 50k users are not unheard of on a single server (obviously you'll need more than one if you speak about redundancy). But it depends heavily on the kind of users - how many messages per day (or per hour in peak times) do you expect? Size distribution of those messages? Are you serving IMAP or POP boxes, quota per user? 20k office workers with big IMAP boxes need a different box from 20k ISP home users with 10M POP accounts.

If you intend to run things like spamassassin, you'll be looking at a very high CPU load (but you can easily off-load this.)

On behalf of all joe-job victims: Whatever you do, *please* do it in a way that allows you to know whether mail is going to be delivered at the front-end incoming SMTP server. (should be trivial if your user database is in LDAP or some SQL db or whatever.)

greetings
-- vbi

-- 
In marriage there are no greater faults than the recurring ones. -- Jean Paul (real name Johann Paul Friedrich Richter)
Re: High volume mail handling architecture
On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
> You have to either be doing something very intensive or very wrong to need more than one server for 20K users. Last time I did this I got 250K users per server, and I believe that I could have easily doubled that if I was allowed to choose the hardware.

We have a little over 10K users, and the disk subsystem seems to be the bottleneck. When we reach about 600 read transactions + 150 write transactions per second (as reported by sar -b), the load average starts to grow exponentially instead of proportionally. There are about 20K sectors read, and 3K written per second. (That was before I turned noatime on. After that we had about 2K sector writes and 70 write transactions less, and load average dropped to a more sane value - about 3, instead of 20.)

More than 90% of the disk transactions are on the (logical) disk where mail is stored. The only processes which touch that disk are qmail delivery processes (qmail handed mail by another SMTP-IN box: 0.8 local deliveries per second) and courierpop3d processes (7.2 logins per second).

We are using an Intel SRCU42X SCSI RAID controller, and the logical disk which carries mail is made of 3 Fujitsu 36GB 15K RPM disks.

Please tell me what problem we are facing. Is the hardware so weak? Is it underperforming? Or maybe our load is exceptionally high? I can provide more statistics if they are needed.

Also, did you implement virus/spam scanning on that box?

kind regards,
Marcin
-- 
Marcin Owsiany [EMAIL PROTECTED] http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216 FE67 DA2D 0ACA FC5E 3F75 D6F6 3A0D 8AA0 60F4 1216
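To put the figures quoted in this message in proportion, here is a small back-of-the-envelope calculation. It only rearranges the numbers given above; the 512-byte sector size is an assumption, and the whole-system transaction counts are used even though only about 90% of them hit the mail disk.

```python
SECTOR_BYTES = 512  # assumption: sar -b counts 512-byte sectors


def per_transaction(sectors_per_s, transactions_per_s, sector_bytes=SECTOR_BYTES):
    """Average size of one disk transaction, as (sectors, KiB)."""
    sectors = sectors_per_s / transactions_per_s
    return sectors, sectors * sector_bytes / 1024


# Figures from the message, before noatime was turned on.
read_sectors, read_kib = per_transaction(20_000, 600)
write_sectors, write_kib = per_transaction(3_000, 150)

# Mail operations per second: 0.8 local deliveries + 7.2 POP3 logins.
ops_per_s = 0.8 + 7.2
transactions_per_op = (600 + 150) / ops_per_s

if __name__ == "__main__":
    print(f"read:  {read_sectors:.1f} sectors, {read_kib:.1f} KiB per transaction")
    print(f"write: {write_sectors:.1f} sectors, {write_kib:.1f} KiB per transaction")
    print(f"disk transactions per delivery or login: {transactions_per_op:.1f}")
```

Under these assumptions a read averages about 33 sectors (roughly 17 KiB), a write 20 sectors (10 KiB), and each delivery or POP3 login costs on the order of 94 disk transactions, which is the kind of ratio the replies about caching and directory layout are probing.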
Re: High volume mail handling architecture
Hi Marcin,

How many files do you have in a single directory? 100? Which filesystem are you using? You may want to try playing with reiserfs...

Regards
Andrew

Marcin Owsiany wrote:
> On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
> > You have to either be doing something very intensive or very wrong to need more than one server for 20K users. Last time I did this I got 250K users per server, and I believe that I could have easily doubled that if I was allowed to choose the hardware.
>
> We have a little over 10K users, and the disk subsystem seems to be the bottleneck. When we reach about 600 read transactions + 150 write transactions per second (as reported by sar -b), the load average starts to grow exponentially instead of proportionally. There are about 20K sectors read, and 3K written per second. (That was before I turned noatime on. After that we had about 2K sector writes and 70 write transactions less, and load average dropped to a more sane value - about 3, instead of 20.)
>
> More than 90% of the disk transactions are on the (logical) disk where mail is stored. The only processes which touch that disk are qmail delivery processes (qmail handed mail by another SMTP-IN box: 0.8 local deliveries per second) and courierpop3d processes (7.2 logins per second).
>
> We are using an Intel SRCU42X SCSI RAID controller, and the logical disk which carries mail is made of 3 Fujitsu 36GB 15K RPM disks.
>
> Please tell me what problem we are facing. Is the hardware so weak? Is it underperforming? Or maybe our load is exceptionally high? I can provide more statistics if they are needed.
>
> Also, did you implement virus/spam scanning on that box?
>
> kind regards,
> Marcin
Re: High volume mail handling architecture
On Sep 9, 2004, at 2:44 AM, Marcin Owsiany wrote:
> More than 90% of the disk transactions are on the (logical) disk where mail is stored. The only processes which touch that disk are qmail delivery processes (qmail handed mail by another SMTP-IN box: 0.8 local deliveries per second) and courierpop3d processes (7.2 logins per second).

Start splitting the user directories across logical disks that are on different platters, for goodness sake. Mount points overlaid below the primary mount point by directory can easily do this for you.

-- 
Nate Duehr, [EMAIL PROTECTED]
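One way to picture the split suggested here is a stable mapping from user name to one of several mount points. The sketch below is only an illustration of that idea: the mount point paths, the Maildir layout and the function name are hypothetical, and hashing is just one possible assignment scheme (the message itself does not say how users should be divided).

```python
import hashlib

# Hypothetical mount points, each assumed to sit on a different set of disks.
MOUNT_POINTS = ["/var/mail/d0", "/var/mail/d1", "/var/mail/d2", "/var/mail/d3"]


def maildir_for(user, mount_points=MOUNT_POINTS):
    """Pick a stable mount point for a user and return the mailbox path."""
    name = user.lower()
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # The first four bytes of the digest select the bucket, so the same
    # user always lands on the same disk.
    index = int.from_bytes(digest[:4], "big") % len(mount_points)
    return f"{mount_points[index]}/{name}/Maildir"


if __name__ == "__main__":
    for user in ("marcin", "nate", "russell"):
        print(user, "->", maildir_for(user))
```

A plain modulo mapping like this moves most users when a mount point is added, so an explicit per-user table in the user database may be the more practical choice on a live system.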
Re: High volume mail handling architecture
On Thu, Sep 09, 2004 at 06:43:21AM -0600, Nate Duehr wrote:
> On Sep 9, 2004, at 2:44 AM, Marcin Owsiany wrote:
> > More than 90% of the disk transactions are on the (logical) disk where mail is stored. The only processes which touch that disk are qmail delivery processes (qmail handed mail by another SMTP-IN box: 0.8 local deliveries per second) and courierpop3d processes (7.2 logins per second).
>
> Start splitting the user directories across logical disks that are on different platters, for goodness sake.

Well, adding more disks to the setup is what I planned to do next. I just want to make sure that the performance I get from the _current_ setup is normal.

Marcin
-- 
Marcin Owsiany [EMAIL PROTECTED] http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216 FE67 DA2D 0ACA FC5E 3F75 D6F6 3A0D 8AA0 60F4 1216
Re: High volume mail handling architecture
Adrian 'Dagurashibanipal' von Bidder wrote:
> On behalf of all joe-job victims: Whatever you do, *please* do it in a way that allows you to know whether mail is going to be delivered at the front-end incoming SMTP server. (should be trivial if your user database is in LDAP or some SQL db or whatever.)

On behalf of the lurkers here who are not experienced admins (am I the only one?), could someone elaborate a little more on the above?

What I think I know is that a joe job is when somebody gets mail that looks like it came from, for example, me, but it really didn't.

Is the point of the statement above that all mail must be delivered via the SMTP server, and then features built into it (disabling of anonymous relaying??) will prevent joe-jobs?

thanks,
Randy Kramer
Re: High volume mail handling architecture
Quoting Ruth A. Kramer [EMAIL PROTECTED]:
> Adrian 'Dagurashibanipal' von Bidder wrote:
> > On behalf of all joe-job victims: Whatever you do, *please* do it in a way that allows you to know whether mail is going to be delivered at the front-end incoming SMTP server. (should be trivial if your user database is in LDAP or some SQL db or whatever.)
>
> Is the point of the statement above that all mail must be delivered via the SMTP server, and then features built into it (disabling of anonymous relaying??) will prevent joe-jobs?

I think the point is to reject most of these emails as soon as possible. For this to work, the front-end SMTP server has to know your users. If it doesn't, you accept these mails for further processing - spam and virus filtering, which are CPU consuming - and only after that does your server realize that there is no recipient for them.

For the same reason I have some regexp patterns built into postfix body_checks for the most common viruses. Postfix rejects these mails immediately. This usually catches about 90% of viruses, so I save a lot of CPU in virus checking of incoming mail...

-- 
bYE, Marki

This message was sent using IMP, the Internet Messaging Program.
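The idea of rejecting on a cheap pattern match before any expensive scanning can be sketched generically as below. This is not Postfix body_checks syntax and these are not the patterns used in the message above (those were not posted); both patterns are made-up examples, chosen only to show the control flow.

```python
import re

# Hypothetical patterns for illustration; not real virus signatures.
BODY_PATTERNS = [
    # Base64 text that commonly begins an encoded Windows executable.
    re.compile(r"^TVqQAAMAAAAEAAAA"),
    # A made-up marker line standing in for a known worm's body text.
    re.compile(r"^Please see the attached file for details\.$", re.IGNORECASE),
]


def check_body(lines, patterns=BODY_PATTERNS):
    """Return (550, reason) for the first matching line, else (250, "OK")."""
    for number, line in enumerate(lines, start=1):
        for pattern in patterns:
            if pattern.search(line):
                # Reject during the SMTP dialogue: no virus scanner is
                # started and no bounce has to be generated later.
                return 550, f"Rejected: body line {number} matches a blocked pattern"
    return 250, "OK"


if __name__ == "__main__":
    print(check_body(["Hello,", "see you tomorrow"]))
    print(check_body(["--boundary", "TVqQAAMAAAAEAAAA//8AALgAAAA"]))
```

Line-by-line matching like this is cheap compared with unpacking and scanning attachments, which is where the CPU saving mentioned above would come from; the trade-off is that broad patterns can also reject legitimate mail.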
Re: High volume mail handling architecture
> For the same reason I have some regexp patterns built into postfix body_checks for the most common viruses. Postfix rejects these mails immediately. This usually catches about 90% of viruses, so I save a lot of CPU in virus checking of incoming mail...

Could you send your regexes? 90% of viruses stopped by regexes sounds interesting.

Regards
mike
Re: High volume mail handling architecture
On Thu, 9 Sep 2004 15:32:04 +0200, Marek wrote in message [EMAIL PROTECTED]:
> Quoting Ruth A. Kramer [EMAIL PROTECTED]:
> > Adrian 'Dagurashibanipal' von Bidder wrote:
> > > On behalf of all joe-job victims: Whatever you do, *please* do it in a way that allows you to know whether mail is going to be delivered at the front-end incoming SMTP server. (should be trivial if your user database is in LDAP or some SQL db or whatever.)
> >
> > Is the point of the statement above that all mail must be delivered via the SMTP server, and then features built into it (disabling of anonymous relaying??) will prevent joe-jobs?
>
> I think the point is to reject most of these emails as soon as possible. For this to work, the front-end SMTP server has to know your users. If it doesn't, you accept these mails for further processing - spam and virus filtering, which are CPU consuming - and only after that does your server realize that there is no recipient for them.
>
> For the same reason I have some regexp patterns built into postfix body_checks for the most common viruses. Postfix rejects these mails immediately. This usually catches about 90% of viruses, so I save a lot of CPU in virus checking of incoming mail...

..my understanding is the point is to spot spam bots asap, and either deliver to /dev/null or fry them off the net. Usually, these boxes are paid for by some unsuspecting Joe Sixpack, who should have his box rebooting exactly as far as to show:

  Your box has been used for criminal purposes, and has been shut down to secure evidence for law enforcement. Please note that tampering with this crime scene evidence, by trying to repair or reinstall the OS etc, is a crime in itself and you will not want to try to do that, as we have reported to your local law enforcement every box we shut down, to facilitate said law enforcement. If you need your box for any lawful purpose, please feel free to contact your local law enforcement to expedite securing the crime scene evidence and make your box legally available to yourself.

..and _not_ _any_ _bit_ further. A simple replacement of the boot loader and possibly disabling the BIOS. I think it is legal too.

..disclaimer: I don't do mail servers (yet), I got dragged into doing wifi bandwidth throttling, my expertise is in thermochemical gasification.

-- 
..med vennlig hilsen = with Kind Regards from Arnt... ;-)
...with a number of polar bear hunters in his ancestry...
Scenarios always come in sets of three: best case, worst case, and just in case.
Re: High volume mail handling architecture
On Thu, 9 Sep 2004 18:44, Marcin Owsiany [EMAIL PROTECTED] wrote:
> On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
> > You have to either be doing something very intensive or very wrong to need more than one server for 20K users. Last time I did this I got 250K users per server, and I believe that I could have easily doubled that if I was allowed to choose the hardware.
>
> We have a little over 10K users, and the disk subsystem seems to be the bottleneck. When we reach about 600 read transactions + 150 write transactions per second (as reported by sar -b), the load average starts to grow exponentially instead of proportionally. There are about 20K sectors read, and 3K written per second. (That was before I turned noatime on. After that we had about 2K sector writes and 70 write transactions less, and load average dropped to a more sane value - about 3, instead of 20.)

Last time I was doing this I had some Dell 2U servers (2650 from memory) with 4 * 10K U160 disks in a RAID-5 (5th disk was hot-spare) and something like 4G of RAM. The machines had almost no read access to the drives, something less than 10% of disk access was for read because the cache worked really well (the accounts that receive the most mail are the ones that have clients checking them most often - in some cases people leave their email client on 24*7 checking every 5 mins). The write bottleneck was just under 3M/s, I don't recall how many transactions that was.

To give better performance you may want to look at getting more RAM. RAM is cheap and you can eliminate most read bottlenecks by caching lots of stuff. 3K sectors written per second isn't too good, but I guess that's because of the 20K sectors read. Get some more cache and things should improve a lot.

Also if using a typical Unix mail server (Postfix, Sendmail, etc) then the data is written synchronously somewhere under /var before being read from there and written to the destination. If you use a NVRAM card from UMEM http://www.umem.com/ for /var/spool then you could possibly double mail delivery performance. If you use data=journal and put the journal for the mail store file system on the umem device you could probably double performance again.

> Also, did you implement virus/spam scanning on that box?

No! Virus/spam scanning was on the front-end machines. It was believed that the mail store machines were busy enough with doing the most basic work without virus scanning (also the number of licences for the anti-virus program didn't match the number of store machines that were planned).

You want to do as much work away from the mail store as possible. Mail store machines can not be replaced without major inconvenience to everyone (customers, staff, management). Front-end anti-virus machines are disposable: if you have a traffic balancing device (such as a Cisco LocalDirector or IPVS) in front of a cluster of anti-virus machines then an anti-virus machine can go down for a few days without anyone bothering. If (hypothetically) anti-virus was to take 10% of the performance from a mail store then it could require another mail store machine (if you have 5-10 machines) and therefore that's one more machine which can break and cause massive pain to everyone.

Another thing, a mail store machine should require almost no CPU power. Give it a single CPU that's not the fastest available. It sucks when you have two almost unused CPUs which are both fast and hot and then one breaks down killing the machine.

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page
Re: High volume mail handling architecture
Maykel Moya [EMAIL PROTECTED] wrote:
> We have already chosen the software components: postfix, ldap, and we are discussing dovecot vs. cyrus as imap server.

http://www.vergenet.net/linux/mail_farm/ describes some ideas for setting up a large mail cluster.

regards
Johannes
Re: High volume mail handling architecture
On Tue, 7 Sep 2004 23:48, Theo Hoogerheide [EMAIL PROTECTED] wrote:
> Try looking for a netapp or something else for central datastorage and a loadbalancer..

If you have a Netapp then you have to deal with Linux NFS issues, which aren't fun.

If you have a cluster of storage machines, front-end SMTP servers to direct delivery to the correct back-end machine, and Perdition to proxy POP and IMAP to the correct back-end machine, then you can scale easily without dealing with NFS.

> This setup is proven to be very scalable, when you want to add another 20k users, just add some servers :)

You have to either be doing something very intensive or very wrong to need more than one server for 20K users. Last time I did this I got 250K users per server, and I believe that I could have easily doubled that if I was allowed to choose the hardware.

I used Qmail (not my choice), Courier, Perdition, IMP, MySQL (for IMP), and was moving it to OpenLDAP at the time I left that project.

--
http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/ Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/ My home page
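The partitioned layout described above (front-end SMTP servers and a POP/IMAP proxy that each send a user to the one back-end holding that user's mailbox) comes down to a single lookup shared by both front ends. A minimal sketch, assuming a plain in-memory map; the host names, users and function names are hypothetical, and in practice the map would presumably live in a directory or database such as the LDAP setup mentioned in this thread.

```python
# Hypothetical user-to-store assignments, for illustration only.
USER_TO_STORE = {
    "alice": "store1.example.com",
    "bob": "store2.example.com",
    "carol": "store1.example.com",
}


def store_for(user):
    """Return the back-end mail store holding this user's mailbox."""
    try:
        return USER_TO_STORE[user]
    except KeyError:
        raise LookupError(f"no mail store assigned to {user!r}") from None


def smtp_next_hop(recipient):
    """Front-end SMTP: pick the back-end to deliver a message to."""
    local_part = recipient.split("@", 1)[0]
    return store_for(local_part)


def proxy_target(login):
    """POP/IMAP proxy: pick the back-end to connect this login to."""
    return store_for(login)


print(smtp_next_hop("alice@example.com"))
print(proxy_target("bob"))
```

Because each mailbox lives on exactly one store, no file system is shared between machines, which is how this layout avoids NFS. Adding capacity means adding a store and assigning new users to it.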
Re: High volume mail handling architecture
On Tue, Sep 07, 2004 at 08:38:56AM -0400, Maykel Moya wrote:
> I'm looking for documentation which helps me design a failover, redundant and scalable mail system to handle 20K users, with plans to scale soon to about 50K.
>
> We have already chosen the software components: postfix, ldap, and we are discussing dovecot vs. cyrus as imap server.

Courier-imap works fine on a similar setup ... (my 2 cents, not very useful ;-))

--
Emmanuel Lacour
Easter-eggs
44-46 rue de l'Ouest - 75014 Paris - France - Métro Gaité
Phone: +33 (0) 1 43 35 00 37 - Fax: +33 (0) 1 41 35 00 76
mailto:[EMAIL PROTECTED] - http://www.easter-eggs.com
Re: High volume mail handling architecture
On Tue, 2004-09-07 at 14:38, Maykel Moya wrote:
> I'm looking for documentation which helps me design a failover, redundant and scalable mail system to handle 20K users, with plans to scale soon to about 50K.
>
> We have already chosen the software components: postfix, ldap, and we are discussing dovecot vs. cyrus as imap server.

Try looking for a netapp or something else for central datastorage and a loadbalancer..

We have an Alteon AD3 with a couple of smtp-servers, a couple of imap/pop3-servers and some ldap-servers. All mail is stored on the netapp, and all smtp-servers and imap/pop3-servers are identical. For LDAP we have one master. The slaves get their data through replication, not via the netapp.

This setup is proven to be very scalable: when you want to add another 20k users, just add some servers :)

Unfortunately I don't have serious documentation about the setup, but you can mail me off-list if you have any questions.

--
Kind regards,
Theo Hoogerheide
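With all mail on shared storage and identical front-end servers, as described above, any server can handle any user, so the balancer only has to hand out servers in turn. A minimal sketch of that idea, assuming simple round-robin; the server names are hypothetical, and the policy a real balancer such as the Alteon applies may well differ.

```python
from itertools import cycle

# Hypothetical pool of identical imap/pop3 servers behind the balancer.
IMAP_SERVERS = [
    "imap1.example.com",
    "imap2.example.com",
    "imap3.example.com",
]


def make_balancer(servers):
    """Return a function that hands out servers in round-robin order."""
    pool = cycle(servers)
    return lambda: next(pool)


pick = make_balancer(IMAP_SERVERS)

# Each new connection goes to the next server, wrapping around at the end.
first_four = [pick() for _ in range(4)]
print(first_four)
```

No per-user lookup is needed here, because the shared storage makes every server equivalent. Adding capacity means appending a server to the pool.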