Re: High volume mail handling architecture

2004-10-13 Thread Gerrit Pape
On Fri, Oct 08, 2004 at 10:07:15AM +0100, John Hedges wrote:
 On Fri, Sep 10, 2004 at 05:34:07PM +, Gerrit Pape wrote:
  On Fri, Sep 10, 2004 at 09:49:27AM +0200, Adrian 'Dagurashibanipal'
  von Bidder wrote:
   Here is what happens: A spammer uses my email address as the
   sender address in spam frequently.

   So, that's my plea to everybody with big mail installations: make your 
   frontend machines aware of what mail they are supposed to accept, so that 
   you never need to bounce. (Ok, some cases will still bounce: disk full, 
   procmail script errors etc., but these are a very small proportion.) And 
   the other plea is, of course, get rid of qmail and other products which 
   accept all mail by default.
  
  As far as my experience goes, pleas or complaints against other people
  doesn't help much if you want to see something changed.  Better help
  yourself.
  
  I suggest to instruct your mail user agent to make use of the
  (apparently almost forgotten) fact that the sender's addresses in the
  envelope and in the header can be different.  Most today's mail transfer
  agents should support address extensions.
  
  If your address is used as envelope sender in unsolicited mail, it's
  your public one.  Use a non-public address as envelope sender of mail
  you send, and simply change it in case it gets abused; only bouncers
  should send mail to this address, and they usually do within two weeks.
  Now you can configure your MTA to outright reject delivery notifications
  solely based on the information in the envelope.

 Sorry to reopen such an old thread. I'd saved this mail for reference as
 I too get a lot of bounces for spam with forged mail headers. A fresh
 run of spam that some wanker has initiated in my name has made my inbox
 unbearable this last few days so I need to do something about it.

It's not because of forged headers but forged envelopes: your address is
used as envelope sender in SMTP (MAIL FROM:[EMAIL PROTECTED]).  The
envelope sender address is where delivery notifications, such as
bounces, are sent, and those delivery notifications usually have an
empty envelope sender (MAIL FROM:<>).  Replies and followups are sent to
an address specified in the headers of the mail (From: John
[EMAIL PROTECTED], or Reply-To:), and have a non-empty envelope sender.

If John starts to send all his mail with the envelope sender address
[EMAIL PROTECTED] and still uses the same headers, communication
with the recipients will not change, but delivery notifications will go
to [EMAIL PROTECTED].  His public, well-known, unchangeable mail
address [EMAIL PROTECTED] now no longer receives delivery notifications
for mail John himself has sent; he can now safely reject or disregard
mail sent with an empty envelope sender to the envelope recipient
[EMAIL PROTECTED], solely based on the envelope information, without
looking at the headers or the body of the mail.
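
The rule described above can be sketched in a few lines.  This is only
an illustration of the decision, not the qconfirm or qmail
implementation; the two addresses are hypothetical placeholders, since
the real ones are redacted in this archive.

```python
from typing import Tuple

# Hypothetical addresses for illustration only.
PUBLIC_ADDRESS = "john@example.org"               # appears in From: headers
BOUNCE_ADDRESS = "john-bounces-2004@example.org"  # used as envelope sender


def rcpt_decision(envelope_sender: str, envelope_recipient: str) -> Tuple[int, str]:
    """Decide on an RCPT TO using only the SMTP envelope."""
    sender = envelope_sender.strip().strip("<>")
    recipient = envelope_recipient.strip().strip("<>").lower()

    # Delivery notifications are normally sent with MAIL FROM:<>.
    is_notification = sender == ""

    # Mail we send never carries the public address as envelope sender,
    # so a genuine bounce can never be addressed to it.
    if is_notification and recipient == PUBLIC_ADDRESS:
        return 553, "This address cannot get bounces"
    return 250, "Recipient accepted"


if __name__ == "__main__":
    print(rcpt_decision("<>", "<john@example.org>"))
    print(rcpt_decision("<>", "<john-bounces-2004@example.org>"))
```

Ordinary mail to the public address and notifications to the bounce
address are still accepted; only the combination of empty sender and
public recipient is refused.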

 Would it be possible for you, Gerrit, to expand a little on your setup?

My personal setup is done with the qconfirm package, specifically, I
send mails through the qconfirm-inject program which adjusts the
envelope sender, and so requests bounces to go to a different address
than my public one.  This is qmail specific, and you need shell access
to the mail server.

 I currently use fetchmail to get my mail from a catchall mailbox at my
 ISP. Can I use the envelope sender trick in this case as I can't see an
 easy way to differentiate between bounces and normal email once the
 messages reach my box?

Most of the bounces are sent with an empty envelope sender (<>).  I'm not
sure whether fetchmail preserves the envelope information; it might get
lost.  Look for Return-Path:.  Although it might work with your setup,
I think sorting out the bounces is better done on the mail exchanger.
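
If the Return-Path: header does survive the trip through the ISP
mailbox, a check on the client side could look roughly like this.  It
is a sketch that assumes the delivering server recorded the envelope
sender in Return-Path:, and it cannot tell anything when that header is
missing.

```python
from email import message_from_string


def looks_like_bounce(raw_message: str) -> bool:
    """Return True if the recorded envelope sender is empty (<>)."""
    msg = message_from_string(raw_message)
    return_path = msg.get("Return-Path")
    if return_path is None:
        # Envelope information was not preserved; we cannot decide.
        return False
    return return_path.strip() == "<>"


if __name__ == "__main__":
    bounce = (
        "Return-Path: <>\n"
        "From: MAILER-DAEMON@example.net\n"
        "Subject: failure notice\n"
        "\n"
        "Sorry, no such user.\n"
    )
    print(looks_like_bounce(bounce))
```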

Regards, Gerrit.
-- 
Open projects at http://smarden.org/pape/.


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: High volume mail handling architecture

2004-10-08 Thread John Hedges
On Fri, Sep 10, 2004 at 05:34:07PM +, Gerrit Pape wrote:
 On Fri, Sep 10, 2004 at 09:49:27AM +0200, Adrian 'Dagurashibanipal' von Bidder wrote:
  Here is what happens: A spammer uses my email address as the sender address 
  in spam frequently.
 
  So, I sometimes suddenly have 2000 new mails in my inbox :-(
 
  So, that's my plea to everybody with big mail installations: make your 
  frontend machines aware of what mail they are supposed to accept, so that 
  you never need to bounce. (Ok, some cases will still bounce: disk full, 
  procmail script errors etc., but these are a very small proportion.) And 
  the other plea is, of course, get rid of qmail and other products which 
  accept all mail by default.
 
 As far as my experience goes, pleas or complaints against other people
 doesn't help much if you want to see something changed.  Better help
 yourself.
 
 I suggest to instruct your mail user agent to make use of the
 (apparently almost forgotten) fact that the sender's addresses in the
 envelope and in the header can be different.  Most today's mail transfer
 agents should support address extensions.
 
 If your address is used as envelope sender in unsolicited mail, it's
 your public one.  Use a non-public address as envelope sender of mail
 you send, and simply change it in case it gets abused; only bouncers
 should send mail to this address, and they usually do within two weeks.
 Now you can configure your MTA to outright reject delivery notifications
 solely based on the information in the envelope.
 
 $ mconnect a.mx.smarden.org
 220 smarden.org ESMTP
 mail from:<>
 250 Sender accepted.
 rcpt to:[EMAIL PROTECTED]
 553 This address cannot get bounces, either you are not bouncing to the envelope 
 sender, or the envelope of the mail you bounce is forged.
 quit
 221 Good bye.
 $ 
 
 I'm doing this for about ten months now, and don't see most of the
 unwanted delivery notifications, including delivery confirmation
 requests for unsolicited mail with forged envelope.

Sorry to reopen such an old thread. I'd saved this mail for reference as
I too get a lot of bounces for spam with forged mail headers. A fresh
run of spam that some wanker has initiated in my name has made my inbox
unbearable this last few days so I need to do something about it.

Would it be possible for you, Gerrit, to expand a little on your setup?
I currently use fetchmail to get my mail from a catchall mailbox at my
ISP. Can I use the envelope sender trick in this case as I can't see an
easy way to differentiate between bounces and normal email once the
messages reach my box?

Cheers

John





Re: High volume mail handling architecture

2004-09-10 Thread Adrian 'Dagurashibanipal' von Bidder
On Thursday 09 September 2004 01.33, Ruth A. Kramer wrote:
 Adrian 'Dagurashibanipal' von Bidder wrote:
  On behalf of all joe-job victims: Whatever you do, *please* do it in a
  way that allows you to know whether mail is going to be delivered at
  the front-end incoming SMTP server. (should be trivial if your user
  database is in LDAP or some SQL db or whatever.)

 On behalf of the lurkers here who are not experienced admins (am I the
 only one?), could someone elaborate a little more on the above?

Your guess is mostly correct.

Here is what happens: A spammer frequently uses my email address as the
sender address in spam.

On its own this would be a minor annoyance, because my name gets connected
with spamming. But much of the spam the spammer sends out goes to invalid
email addresses (like [EMAIL PROTECTED] and the like, addresses that don't
exist anymore, addresses that are really message-IDs, etc.). If the domain
part of the address does not exist, that's no problem: the spammer's mail
sending software won't find a mail server to send the mail to. But if the
spammer can get the message to a mail server, two things can happen:
(i) The recipient mail server behaves properly and rejects the mail right
in the SMTP transaction (with 550 User unknown or whatever). Because the
spammer's software is no proper mail server, it doesn't handle this like a
mail server would and instead just discards the message. (ii) If the
recipient mail server is configured to accept all mail (because it's qmail,
or MS Exchange, or because it's a front-end mail server which doesn't know
which users exist, for example a backup MX), I'm in trouble: that mail
server will see that the mail cannot be delivered, and so it generates a
bounce to whatever address is in the envelope sender of the spam.

So, I sometimes suddenly have 2000 new mails in my inbox :-(

(Actually, in my _bounces folder, and so it doesn't bother me that much, and 
since I've disabled spamassassin for bounces, the server load doesn't go 
through the roof anymore, either. But still, there's the chance that I 
miss a real bounce in the flood.)

So, that's my plea to everybody with big mail installations: make your 
frontend machines aware of what mail they are supposed to accept, so that 
you never need to bounce. (Ok, some cases will still bounce: disk full, 
procmail script errors etc., but these are a very small proportion.) And 
the other plea is, of course, get rid of qmail and other products which 
accept all mail by default.
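
As an illustration of what "aware of what mail they are supposed to
accept" means in practice, here is a minimal sketch of a recipient check
made during the SMTP transaction.  The user set is a hypothetical
stand-in for a lookup in LDAP or an SQL database, and the function is
not taken from any particular MTA.

```python
from typing import Set, Tuple

# Hypothetical user list; a real front end would query LDAP, SQL, etc.
KNOWN_RECIPIENTS: Set[str] = {"alice@example.org", "bob@example.org"}


def check_recipient(address: str, known: Set[str] = KNOWN_RECIPIENTS) -> Tuple[int, str]:
    """Answer RCPT TO immediately instead of accepting and bouncing later."""
    normalized = address.strip().strip("<>").lower()
    if normalized in known:
        return 250, "OK"
    # Rejecting here leaves the problem with the sending client, so no
    # bounce is ever generated towards a forged envelope sender.
    return 550, "User unknown"


if __name__ == "__main__":
    print(check_recipient("<alice@example.org>"))
    print(check_recipient("<no-such-user@example.org>"))
```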

(And, lately, a noticeable proportion of such spam 'bounces' have been by 
systems like TMDA and cousins. I take a certain sadistic pleasure in 
confirming these mails whenever I have the time. Sorry, folks.)


So long
-- vbi


-- 
Protect your privacy - encrypt your email: http://fortytwo.ch/gpg/intro




Re: High volume mail handling architecture

2004-09-10 Thread Gerrit Pape
On Fri, Sep 10, 2004 at 09:49:27AM +0200, Adrian 'Dagurashibanipal' von Bidder wrote:
 Here is what happens: A spammer uses my email address as the sender address 
 in spam frequently.

 So, I sometimes suddenly have 2000 new mails in my inbox :-(

 So, that's my plea to everybody with big mail installations: make your 
 frontend machines aware of what mail they are supposed to accept, so that 
 you never need to bounce. (Ok, some cases will still bounce: disk full, 
 procmail script errors etc., but these are a very small proportion.) And 
 the other plea is, of course, get rid of qmail and other products which 
 accept all mail by default.

As far as my experience goes, pleas or complaints against other people
don't help much if you want to see something changed.  Better help
yourself.

I suggest instructing your mail user agent to make use of the
(apparently almost forgotten) fact that the sender's addresses in the
envelope and in the header can be different.  Most of today's mail
transfer agents should support address extensions.

If your address is used as envelope sender in unsolicited mail, it's
your public one.  Use a non-public address as envelope sender of mail
you send, and simply change it in case it gets abused; only bouncers
should send mail to this address, and they usually do within two weeks.
Now you can configure your MTA to outright reject delivery notifications
solely based on the information in the envelope.

$ mconnect a.mx.smarden.org
220 smarden.org ESMTP
mail from:<>
250 Sender accepted.
rcpt to:[EMAIL PROTECTED]
553 This address cannot get bounces, either you are not bouncing to the envelope 
sender, or the envelope of the mail you bounce is forged.
quit
221 Good bye.
$ 

I've been doing this for about ten months now, and don't see most of the
unwanted delivery notifications, including delivery confirmation
requests for unsolicited mail with forged envelope.

Regards, Gerrit.
-- 
Open projects at http://smarden.org/pape/.





Re: High volume mail handling architecture

2004-09-10 Thread Nate Duehr
Marcin Owsiany wrote:
Well, adding more disks to the setup is what I planned to do next. I
just want to make sure that the performance I get from the _current_
setup is normal.
 

Oh okay, sorry.  Thought you were looking for a performance increase.
Nate


Re: High volume mail handling architecture

2004-09-10 Thread Marcin Owsiany
On Fri, Sep 10, 2004 at 09:07:37PM +0200, Jonathan G - Mailing Lists wrote:
 Sorry, what's your MTA?

Mine? On that particular machine it is qmail that does the deliveries
(or rather, what is left of qmail after all the patching I've done).

Marcin
-- 
Marcin Owsiany [EMAIL PROTECTED] http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216  FE67 DA2D 0ACA FC5E 3F75  D6F6 3A0D 8AA0 60F4 1216





Re: High volume mail handling architecture

2004-09-10 Thread Jonathan G - Mailing Lists
:) hehehe, OK, I was just asking because some MTAs fit certain
environments better than others.

I like features of qmail, Postfix and Exim, but I hate others for an ISP 
environment.

jonathan

Marcin Owsiany wrote:
On Fri, Sep 10, 2004 at 09:07:37PM +0200, Jonathan G - Mailing Lists wrote:
Sorry, what's your MTA?

Mine? On that particular machine it is qmail that does the deliveries
(or rather, what is left of qmail after all the patching I've done).
Marcin
--
   Jonathan Gonzalez Fernandez 
   (o  mail  : [EMAIL PROTECTED]
   //\  jabber: [EMAIL PROTECTED]
   V_/  site  : www.surestorm.com
  ::: Registered Linux User #86 :::


Re: High volume mail handling architecture

2004-09-10 Thread Theodore Knab
RAM is not always the answer with 32-bit machines. You can cause bounce
buffers with too much RAM.
The sweet spot for Linux on a 32-bit platform seems to be 4GB of RAM. I
had 10GB of RAM in a Courier IMAP server and the server had problems
releasing swap after a week. The kernel was compiled for 64GB of RAM.
When I reduced the RAM to 4GB and recompiled for a 4GB machine these
problems disappeared.

On 10/09/04 03:40 +1000, Russell Coker wrote:
 On Thu, 9 Sep 2004 18:44, Marcin Owsiany [EMAIL PROTECTED] wrote:
  On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
   You have to either be doing something very intensive or very wrong to
   need more than one server for 20K users.  Last time I did this I got 250K
   users per server, and I believe that I could have easily doubled that if
   I was allowed to choose the hardware.
 
  We have a little over 10K users, and the disk subsystem seems to be the
  bottleneck. When we reach about 600 read transactions + 150 write
  transactions per second (as reported by sar -b), the load average starts
  to grow exponentially instead of proportionally. There are about 20K
  sectors read, and 3K written per second. (That was before I turned noatime
  on. After that we had about 2K sector writes and 70 write transactions
  less, and load average dropped to a more sane value - about 3, instead
  of 20.)
 
 Last time I was doing this I had some Dell 2U servers (2650 from memory) with 
 4 * 10K U160 disks in a RAID-5 (5th disk was hot-spare) and something like 4G 
 of RAM.  The machines had almost no read access to the drives, something less 
 than 10% of disk access was for read because the cache worked really well 
 (the accounts that receive the most mail are the ones that have clients 
 checking them most often - in some cases people leave their email client on 
 24*7 checking every 5 mins).
 
 The write bottleneck was just under 3M/s, I don't recall how many transactions 
 that was.
 
 To give better performance you may want to look at getting more RAM.  RAM is 
 cheap and you can eliminate most read bottlenecks by caching lots of stuff.
 
 3K sectors written per second isn't too good, but I guess that's because of 
 the 20K sectors read.  Get some more cache and things should improve a lot.
 
 Also if using a typical Unix mail server (Postfix, Sendmail, etc) then the 
 data is written synchronously somewhere under /var before being read from 
 there and written to the destination.  If you use a NVRAM card from UMEM 
 http://www.umem.com/ for /var/spool then you could possibly double mail 
 delivery performance.  If you use data=journal and put the journal for the 
 mail store file system on the umem device you could probably double 
 performance again.
 
  Also, did you implement virus/spam scanning on that box?
 
 No!  Virus/spam scanning was on the front-end machines.  It was believed that 
 the mail store machines were busy enough with doing the most basic work 
 without virus scanning (also the number of licences for the anti-virus 
 program didn't match the number of store machines that were planned).
 
 You want to do as much work away from the mail store as possible.  Mail store 
 machines can not be replaced without major inconvenience to everyone 
 (customers, staff, management).  Front-end anti-virus machines are 
 disposable, if you have a traffic balancing device (such as a Cisco 
 LocalDirector or IPVS) in front of a cluster of anti-virus machines then an 
 anti-virus machine can go down for a few days without anyone bothering.
 
 If (hypothetically) anti-virus was to take 10% of the performance from a mail 
 store then it could require another mail store machine (if you have 5-10 
 machines) and therefore that's one more machine which can break and cause 
 massive pain to everyone.
 
 Another thing, a mail store machine should require almost no CPU power.  Give 
 it a single CPU that's not the fastest available.  It sucks when you have two 
 almost unused CPUs which are both fast and hot and then one breaks down 
 killing the machine.
 
 -- 
 http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
 http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
 http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
 http://www.coker.com.au/~russell/  My home page
 
 
 

-- 
--
Ted Knab
Chester, Maryland  21619 USA
--
[In War] Conquest is easy. Control is not.
-- Kirk, Mirror, Mirror, stardate unknown






Re: High volume mail handling architecture

2004-09-09 Thread Adrian 'Dagurashibanipal' von Bidder
On Tuesday 07 September 2004 14.38, Maykel Moya wrote:
 I'm looking for documentation which help me to design a failover,
 redundant and scalable mail system to handle 20K users with plans to
 scale soon to about 50K.

20k or 50k users are not unheard of on a single server (obviously you'll 
need more than one if you speak about redundancy). But it depends heavily 
on the kind of users - how many messages per day (or per hour in peak 
times) do you expect? Size distribution of those messages?

Are you serving IMAP or POP boxes, quota per user? 20k office workers with 
big IMAP boxes need a different box from 20k ISP home users with 10M POP 
accounts. If you intend to run things like spamassassin, you'll be looking 
at a very high CPU load (but you can easily off-load this.)



On behalf of all joe-job victims: Whatever you do, *please* do it in a way 
that allows you to know whether mail is going to be delivered at the 
front-end incoming SMTP server. (should be trivial if your user database is 
in LDAP or some SQL db or whatever.)


greetings
-- vbi

-- 
In der Ehe gibt's keine größern Fehler als die wiederkommenden.
  -- Jean Paul (eig. Johann Paul Friedrich Richter)




Re: High volume mail handling architecture

2004-09-09 Thread Marcin Owsiany
On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
 You have to either be doing something very intensive or very wrong to need 
 more than one server for 20K users.  Last time I did this I got 250K users 
 per server, and I believe that I could have easily doubled that if I was 
 allowed to choose the hardware.

We have a little over 10K users, and the disk subsystem seems to be the
bottleneck. When we reach about 600 read transactions + 150 write
transactions per second (as reported by sar -b), the load average starts
to grow exponentially instead of proportionally. There are about 20K
sectors read, and 3K written per second. (That was before I turned noatime
on. After that we had about 2K sector writes and 70 write transactions
less, and load average dropped to a more sane value - about 3, instead
of 20.)
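
As a rough sanity check, the figures above can be converted into
throughput and request sizes.  This is back-of-the-envelope arithmetic,
not a measurement: it assumes the blocks reported by sar -b are 512
bytes each, it uses the three 15K RPM disks mentioned for the mail
volume, and it ignores RAID overhead because the RAID level is not
stated.

```python
SECTOR_BYTES = 512  # assumption: sar -b reports 512-byte blocks

read_tps, write_tps = 600, 150                 # transactions per second
sectors_read, sectors_written = 20_000, 3_000  # sectors per second
disks = 3                                      # disks in the mail volume
rpm = 15_000

read_mib_s = sectors_read * SECTOR_BYTES / 2**20
write_mib_s = sectors_written * SECTOR_BYTES / 2**20
avg_read_kib = sectors_read / read_tps * SECTOR_BYTES / 1024
avg_write_kib = sectors_written / write_tps * SECTOR_BYTES / 1024

# Naive even split across spindles, ignoring parity or mirroring overhead.
tps_per_disk = (read_tps + write_tps) / disks

# Half a revolution on average before the wanted sector passes the head.
avg_rotational_latency_ms = 60_000 / rpm / 2

print(f"read  {read_mib_s:.2f} MiB/s, {avg_read_kib:.1f} KiB per transaction")
print(f"write {write_mib_s:.2f} MiB/s, {avg_write_kib:.1f} KiB per transaction")
print(f"{tps_per_disk:.0f} transactions/s per disk")
print(f"{avg_rotational_latency_ms:.1f} ms average rotational latency")
```

Under these assumptions the volume moves under 10 MiB/s in small
requests at about 250 transactions per second per disk, which points at
seek-bound random I/O rather than at raw bandwidth.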

More than 90% of the disk transactions are on the (logical) disk where
mail is stored. The only processes which touch that disk are qmail
delivery processes (qmail is handed mail by another SMTP-IN box: 0.8 local
deliveries per second) and courierpop3d processes (7.2 logins per
second).

We are using an Intel SRCU42X SCSI RAID controller, and the logical
disk which carries mail is made of 3 Fujitsu 36GB 15K RPM disks.

Please tell me: what problem are we facing? Is the hardware so weak? Is
it underperforming? Or maybe our load is exceptionally high? I can
provide more statistics if they are needed.

Also, did you implement virus/spam scanning on that box?

kind regards,

Marcin
-- 
Marcin Owsiany [EMAIL PROTECTED] http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216  FE67 DA2D 0ACA FC5E 3F75  D6F6 3A0D 8AA0 60F4 1216





Re: High volume mail handling architecture

2004-09-09 Thread andrew
Hi Marcin,
How many files do you have in a single directory?  100 ?
Which filesystem are you using? You may want to try playing with reiserfs...
Regards
Andrew
Marcin Owsiany wrote:
On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
 

You have to either be doing something very intensive or very wrong to need 
more than one server for 20K users.  Last time I did this I got 250K users 
per server, and I believe that I could have easily doubled that if I was 
allowed to choose the hardware.
   

We have a little over 10K users, and the disk subsystem seems to be the
bottleneck. When we reach about 600 read transactions + 150 write
transactions per second (as reported by sar -b), the load average starts
to grow exponentially instead of proportionally. There are about 20K
sectors read, and 3K written per second. (That was before I turned noatime
on. After that we had about 2K sector writes and 70 write transactions
less, and load average dropped to a more sane value - about 3, instead
of 20.)
More than 90% of the disk transactions are on the (logical) disk where
mail is stored. The only processes which touch that disk, are qmail
delivery processes (qmail handed mail by another SMTP-IN box: 0.8 local
deliveries per second) and courierpop3d processes (7.2 logins per
second).
We are using an Intel SRCU42X SCSI RAID controller, and the logical
disk which carries mail is made of 3 Fujitsu 36GB 15K RPM disks.
Please tell me, what problem we are facing? Is the hardware so weak? Is
it underperforming? Or maybe our load is exceptionally high? I can
provide more statistics if they are needed.
Also, did you implement virus/spam scanning on that box?
kind regards,
Marcin
 




Re: High volume mail handling architecture

2004-09-09 Thread Nate Duehr
On Sep 9, 2004, at 2:44 AM, Marcin Owsiany wrote:
More than 90% of the disk transactions are on the (logical) disk where
mail is stored. The only processes which touch that disk, are qmail
delivery processes (qmail handed mail by another SMTP-IN box: 0.8 local
deliveries per second) and courierpop3d processes (7.2 logins per
second).
Start splitting the user directories across logical disks that are on 
different platters, for goodness sake.  Mount points overlaid below the 
primary mount point by directory can easily do this for you.
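
One way to read the suggestion above is to place each mailbox under one
of several bucket directories and mount a different disk on each bucket.
The sketch below only picks the bucket; the root path, the bucket count
and the hashing scheme are all hypothetical choices for illustration.

```python
import hashlib
from pathlib import PurePosixPath

MAIL_ROOT = PurePosixPath("/var/mail-store")  # hypothetical primary mount point
BUCKETS = 16                                  # hypothetical number of buckets


def maildir_for(user: str, root: PurePosixPath = MAIL_ROOT,
                buckets: int = BUCKETS) -> PurePosixPath:
    """Map a user name to a stable bucket directory below the mail root."""
    digest = hashlib.sha256(user.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    # Each /var/mail-store/NN can be a separate mount point on its own disk.
    return root / f"{bucket:02d}" / user


if __name__ == "__main__":
    for name in ("alice", "bob", "carol"):
        print(maildir_for(name))
```

Hashing keeps the mapping stable without a lookup table, at the cost of
having to move mailboxes if the bucket count ever changes.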

--
Nate Duehr, [EMAIL PROTECTED]


Re: High volume mail handling architecture

2004-09-09 Thread Marcin Owsiany
On Thu, Sep 09, 2004 at 06:43:21AM -0600, Nate Duehr wrote:
 
 On Sep 9, 2004, at 2:44 AM, Marcin Owsiany wrote:
 
 More than 90% of the disk transactions are on the (logical) disk where
 mail is stored. The only processes which touch that disk, are qmail
 delivery processes (qmail handed mail by another SMTP-IN box: 0.8 local
 deliveries per second) and courierpop3d processes (7.2 logins per
 second).
 
 
 Start splitting the user directories across logical disks that are on 
 different platters, for goodness sake.

Well, adding more disks to the setup is what I planned to do next. I
just want to make sure that the performance I get from the _current_
setup is normal.

Marcin
-- 
Marcin Owsiany [EMAIL PROTECTED] http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216  FE67 DA2D 0ACA FC5E 3F75  D6F6 3A0D 8AA0 60F4 1216





Re: High volume mail handling architecture

2004-09-09 Thread Ruth A. Kramer
Adrian 'Dagurashibanipal' von Bidder wrote:
 On behalf of all joe-job victims: Whatever you do, *please* do it in a way
 that allows you to know whether mail is going to be delivered at the
 front-end incoming SMTP server. (should be trivial if your user database is
 in LDAP or some SQL db or whatever.)

On behalf of the lurkers here who are not experienced admins (am I the
only one?), could someone elaborate a little more on the above?

What I think I know is that a joe job is when somebody gets mail that
looks like it came  from, for example me, but it really didn't.

Is the point of the statement above that all mail must be delivered via
the SMTP server, and then features built into it (disabling of anonymous
relaying??) will prevent joe-jobs?  

thanks,
Randy Kramer





Re: High volume mail handling architecture

2004-09-09 Thread Marek Podmaka
Quoting Ruth A. Kramer [EMAIL PROTECTED]:

 Adrian 'Dagurashibanipal' von Bidder wrote:
  On behalf of all joe-job victims: Whatever you do, *please* do it in a way
  that allows you to know whether mail is going to be delivered at the
  front-end incoming SMTP server. (should be trivial if your user database
 is
  in LDAP or some SQL db or whatever.)
 
 Is the point of the statement above that all mail must be delivered via
 the SMTP server, and then features built into it (disabling of anonymous
 relaying??) will prevent joe-jobs?  

I think the point is to reject most of these emails as soon as possible. For
this to work, the front-end SMTP server has to know your users. If it doesn't,
you accept these mails for further processing (spam & virus filtering, which
is CPU-intensive), and only afterwards does your server realize that there is
no recipient for them.
For the same reason I have some regexp patterns built into postfix body_checks
for the most common viruses. Postfix rejects these mails immediately. This
usually catches about 90% of viruses, so I save a lot of CPU in virus checking
of incoming mail...
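
The idea of a cheap line-by-line pattern check ahead of the real virus
scanner can be sketched as follows.  This is not the poster's actual
rule set, which is not shown in the thread; the single pattern is an
illustrative assumption (the base64 encoding of the first bytes of a
DOS/Windows executable header), and the function is plain Python rather
than Postfix configuration.

```python
import re
from typing import Iterable, Optional

# Illustrative rule only: a base64 line starting with the encoded "MZ"
# executable header, which suggests an attached Windows program.
RULES = [
    (re.compile(r"^TVqQAAMAAAAEAAAA"), "Windows executable attachment"),
]


def first_rejection(body_lines: Iterable[str], rules=RULES) -> Optional[str]:
    """Return the reason for the first matching line, or None to accept."""
    for line in body_lines:          # patterns are applied one line at a time
        for pattern, reason in rules:
            if pattern.search(line):
                return reason
    return None


if __name__ == "__main__":
    body = ["Hello,", "", "TVqQAAMAAAAEAAAA//8AALgAAAAAAAAA"]
    print(first_rejection(body))
```

Such a check is far cheaper than a full scan, but it only catches what
the patterns describe, so it complements a scanner rather than replacing
it.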

-- 
  bYE, Marki







Re: High volume mail handling architecture

2004-09-09 Thread Maykel Moya
 For the same reason I have some regexp patterns build into postfix body_checks
 for most common viruses. Postfix rejects these mails immediately. This usually
 catch about 90% of viruses, so I save a lot of CPU in virus checking of
 incoming mail...

Could you send your regexes? Stopping 90% of viruses with regexes sounds
interesting.

Regards
mike






Re: High volume mail handling architecture

2004-09-09 Thread Arnt Karlsen
On Thu,  9 Sep 2004 15:32:04 +0200, Marek wrote in message 
[EMAIL PROTECTED]:

 Quoting Ruth A. Kramer [EMAIL PROTECTED]:
 
  Adrian 'Dagurashibanipal' von Bidder wrote:
   On behalf of all joe-job victims: Whatever you do, *please* do it
   in a way that allows you to know whether mail is going to be
   delivered at the front-end incoming SMTP server. (should be
   trivial if your user database
  is
   in LDAP or some SQL db or whatever.)
  
  Is the point of the statement above that all mail must be delivered
  via the SMTP server, and then features built into it (disabling of
  anonymous relaying??) will prevent joe-jobs?  
 
 I think the point is in rejecting most of these email as soon as
 possible. For this to work, the front-end SMTP server has to know your
 users. If it doesn't, you accept these mails for further processing -
 spam & virus filtering, which are CPU consuming and just after it your
 server realizes that there is no recipient for it.
 For the same reason I have some regexp patterns build into postfix
 body_checks for most common viruses. Postfix rejects these mails
 immediately. This usually catch about 90% of viruses, so I save a lot
 of CPU in virus checking of incoming mail...

..my understanding is the point is to spot spam bots asap, and either
deliver to /dev/null or fry them off the net. Usually, these boxes are
paid for by some unsuspecting Joe Sixpack, who should have his box
rebooting exactly as far as to show: Your box has been used for
criminal purposes, and has been shut down to secure evidence for 
law enforcement.  Please note that tampering with this crime scene
evidence, by trying to repair or reinstall the OS etc, is a crime in
itself and you will not want to try to do that, as we have reported to
your local law enforcement every box we shut down, to facilitate said law
enforcement.  If you need your box for any lawful purpose, please feel
free to contact your local law enforcement to expedite securing the
crime scene evidence and make your box legally available to yourself.

..and _not_ _any_ _bit_ further.  A simple replacement of
the boot loader and possibly disabling the bios.  I think it is legal
too.

..disclaimer; I don't do (yet) mail servers, I got dragged into doing
wifi bandwidth throttling, my expertise is in thermochemical gasification.

-- 
..med vennlig hilsen = with Kind Regards from Arnt... ;-)
...with a number of polar bear hunters in his ancestry...
  Scenarios always come in sets of three: 
  best case, worst case, and just in case.



Re: High volume mail handling architecture

2004-09-09 Thread Russell Coker
On Thu, 9 Sep 2004 18:44, Marcin Owsiany [EMAIL PROTECTED] wrote:
 On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
  You have to either be doing something very intensive or very wrong to
  need more than one server for 20K users.  Last time I did this I got 250K
  users per server, and I believe that I could have easily doubled that if
  I was allowed to choose the hardware.

 We have a little over 10K users, and the disk subsystem seems to be the
 bottleneck. When we reach about 600 read transactions + 150 write
 transactions per second (as reported by sar -b), the load average starts
 to grow exponentially instead of proportionally. There are about 20K
 sectors read, and 3K written per second. (That was before I turned noatime
 on. After that we had about 2K sector writes and 70 write transactions
 less, and load average dropped to a more sane value - about 3, instead
 of 20.)

Last time I was doing this I had some Dell 2U servers (2650 from memory) with 
4 * 10K U160 disks in a RAID-5 (5th disk was hot-spare) and something like 4G 
of RAM.  The machines had almost no read access to the drives, something less 
than 10% of disk access was for read because the cache worked really well 
(the accounts that receive the most mail are the ones that have clients 
checking them most often - in some cases people leave their email client on 
24*7 checking every 5 mins).

The write bottleneck was just under 3M/s, I don't recall how many transactions 
that was.

To give better performance you may want to look at getting more RAM.  RAM is 
cheap and you can eliminate most read bottlenecks by caching lots of stuff.

3K sectors written per second isn't too good, but I guess that's because of 
the 20K sectors read.  Get some more cache and things should improve a lot.
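
To put the sector counts above next to the "just under 3M/s" figure, here is a rough conversion sketch. It assumes 512-byte sectors (an assumption about how sar -b counts blocks, not something stated in this thread), and the function name is made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: one sar -b "sector" is 512 bytes


def sectors_to_mb_per_s(sectors_per_second, sector_bytes=SECTOR_BYTES):
    """Convert a sectors-per-second figure into MB/s (1 MB = 10**6 bytes)."""
    return sectors_per_second * sector_bytes / 1_000_000


# Figures quoted above: about 20K sectors read and 3K written per second.
read_mb = sectors_to_mb_per_s(20_000)
write_mb = sectors_to_mb_per_s(3_000)

print(f"read:  {read_mb:.2f} MB/s")
print(f"write: {write_mb:.2f} MB/s")
```

Under that assumption the quoted load is roughly 10 MB/s of reads against about 1.5 MB/s of writes, so the writes are around half of the write ceiling mentioned above while the reads dominate the disk traffic.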

Also if using a typical Unix mail server (Postfix, Sendmail, etc) then the 
data is written synchronously somewhere under /var before being read from 
there and written to the destination.  If you use an NVRAM card from UMEM 
http://www.umem.com/ for /var/spool then you could possibly double mail 
delivery performance.  If you use data=journal and put the journal for the 
mail store file system on the umem device you could probably double 
performance again.
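
The double write described above can be modelled with a small sketch. This is a simplified illustration, not how Postfix or Sendmail are actually implemented: the function names, the file layout and the single-file "mail store" are all assumptions made for the example.

```python
import os
import tempfile


def write_synced(path, data):
    """Write data to path and fsync it before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def spool_then_deliver(message, spool_dir, store_dir, name):
    """Queue a message durably, then copy it into the mail store."""
    spool_path = os.path.join(spool_dir, name)
    write_synced(spool_path, message)        # synchronous write 1: the spool
    with open(spool_path, "rb") as f:        # read back from the queue
        queued = f.read()
    store_path = os.path.join(store_dir, name)
    write_synced(store_path, queued)         # synchronous write 2: the store
    os.remove(spool_path)                    # queue entry is no longer needed
    return store_path


# Demo with throwaway directories standing in for /var/spool and the store.
with tempfile.TemporaryDirectory() as spool, tempfile.TemporaryDirectory() as store:
    delivered = spool_then_deliver(b"Subject: test\n\nhello\n", spool, store, "msg1")
    print(os.path.basename(delivered), "delivered; spool now holds", os.listdir(spool))
```

Every message costs two synchronous writes in this model, which is why moving the first one (the spool) and the file system journal onto fast non-volatile memory could plausibly help as much as suggested above.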

 Also, did you implement virus/spam scanning on that box?

No!  Virus/spam scanning was on the front-end machines.  It was believed that 
the mail store machines were busy enough with doing the most basic work 
without virus scanning (also the number of licences for the anti-virus 
program didn't match the number of store machines that were planned).

You want to do as much work away from the mail store as possible.  Mail store 
machines cannot be replaced without major inconvenience to everyone 
(customers, staff, management).  Front-end anti-virus machines are 
disposable: if you have a traffic balancing device (such as a Cisco 
LocalDirector or IPVS) in front of a cluster of anti-virus machines then an 
anti-virus machine can go down for a few days without anyone being bothered.

If (hypothetically) anti-virus was to take 10% of the performance from a mail 
store then it could require another mail store machine (if you have 5-10 
machines) and therefore that's one more machine which can break and cause 
massive pain to everyone.
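
The hypothetical 10% case above works out as follows. This is a deliberately naive sketch that assumes load spreads evenly across stores and that capacity scales linearly; the function name is invented for the example.

```python
def stores_needed(baseline_machines, overhead_percent):
    """Machines needed to keep the same total capacity when each machine
    loses overhead_percent of its throughput (integer ceiling division)."""
    remaining = 100 - overhead_percent
    return -(-baseline_machines * 100 // remaining)


for n in (5, 9, 10):
    print(n, "stores without scanning ->", stores_needed(n, 10), "with scanning")
```

With a 10% overhead, anything from 5 to 9 stores needs one extra machine, and under this simple model 10 stores would need two.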

Another thing, a mail store machine should require almost no CPU power.  Give 
it a single CPU that's not the fastest available.  It sucks when you have two 
almost unused CPUs which are both fast and hot and then one breaks down 
killing the machine.

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: High volume mail handling architecture

2004-09-08 Thread Johannes Formann
Maykel Moya [EMAIL PROTECTED] wrote:

 We have already chosen the software components: postfix, ldap, and we are
 discussing dovecot vs. cyrus as the imap server.

http://www.vergenet.net/linux/mail_farm/ describes some ideas for
setting up a large mail cluster.

regards Johannes 





Re: High volume mail handling architecture

2004-09-08 Thread Russell Coker
On Tue, 7 Sep 2004 23:48, Theo Hoogerheide [EMAIL PROTECTED] wrote:
 Try looking for a netapp or something else for central data storage and a
 load balancer..

If you have a Netapp then you have to deal with Linux NFS issues which aren't 
fun.

If you have a cluster of storage machines and front-end SMTP servers to direct 
delivery to the correct back-end machine as well as Perdition to proxy POP 
and IMAP to the correct back-end machine then you can scale easily without 
dealing with NFS.
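
The routing idea above boils down to one shared lookup from user to back-end store, used both by the front-end SMTP servers and by the POP/IMAP proxy. The sketch below only illustrates that idea; it is not Perdition's configuration or API. The dictionary stands in for whatever directory is really used (LDAP, MySQL), and every name and host in it is hypothetical. Rejecting unknown recipients at SMTP time is likewise an assumption of the sketch.

```python
# Hypothetical directory; a real deployment would query LDAP or a database.
USER_TO_STORE = {
    "alice@example.com": "store1.example.com",
    "bob@example.com": "store2.example.com",
}


def backend_for(user, directory=USER_TO_STORE):
    """Return the mail store holding this user's mailbox, or None."""
    return directory.get(user.lower())


def route_smtp(recipient):
    """Decision a front-end SMTP server would make for one recipient."""
    store = backend_for(recipient)
    if store is None:
        return ("reject", None)   # unknown user: refuse during the SMTP dialogue
    return ("relay", store)


def route_pop_imap(login):
    """Decision a POP/IMAP proxy would make for one login name."""
    store = backend_for(login)
    if store is None:
        return ("deny", None)
    return ("proxy", store)


print(route_smtp("Alice@example.com"))
print(route_pop_imap("bob@example.com"))
print(route_smtp("nobody@example.com"))
```

Because both paths consult the same mapping, adding capacity means adding a store and pointing new (or migrated) users at it, with no shared file system involved.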

 This setup is proven to be very scalable, when you want to add another
 20k users, just add some servers :)

You have to either be doing something very intensive or very wrong to need 
more than one server for 20K users.  Last time I did this I got 250K users 
per server, and I believe that I could have easily doubled that if I was 
allowed to choose the hardware.

I used Qmail (not my choice), Courier, Perdition, IMP, MySQL (for IMP), and 
was moving it to OpenLDAP at the time I left that project.

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page





Re: High volume mail handling architecture

2004-09-07 Thread Emmanuel Lacour
On Tue, Sep 07, 2004 at 08:38:56AM -0400, Maykel Moya wrote:
 I'm looking for documentation that will help me design a failover,
 redundant and scalable mail system to handle 20K users with plans to
 scale soon to about 50K.
 
 We have already chosen the software components: postfix, ldap, and we are
 discussing dovecot vs. cyrus as the imap server.

Courier-imap works fine on a similar setup ...

(my 2 cents, not very useful ;-))

-- 
Emmanuel Lacour  Easter-eggs
44-46 rue de l'Ouest  -  75014 Paris   -   France -  Métro Gaité
Phone: +33 (0) 1 43 35 00 37  -  Fax: +33 (0) 1 41 35 00 76
mailto:[EMAIL PROTECTED]  -  http://www.easter-eggs.com





Re: High volume mail handling architecture

2004-09-07 Thread Theo Hoogerheide
On Tue, 2004-09-07 at 14:38, Maykel Moya wrote:
 I'm looking for documentation that will help me design a failover,
 redundant and scalable mail system to handle 20K users with plans to
 scale soon to about 50K.
 
 We have already chosen the software components: postfix, ldap, and we are
 discussing dovecot vs. cyrus as the imap server.

Try looking for a netapp or something else for central data storage and a
load balancer..

We have an Alteon AD3 with a couple of smtp-servers, a couple of
imap/pop3-servers and some ldap-servers. All mail is stored on the
netapp and all smtp-servers and imap/pop3-servers are identical. 

For LDAP we have one master. The slaves get their data through
replication, not via the netapp. 

This setup is proven to be very scalable, when you want to add another
20k users, just add some servers :)

Unfortunately I don't have serious documentation about the setup, but
you can mail me off-list about the setup if you have any questions.

-- 
Kind regards,

Theo Hoogerheide

