Re: [Mailman-Users] Mailman throughput

2011-08-15 Thread Brad Knowles

On 08/14/2011 11:24 PM, Ivan Fetch wrote:


Brad, I think we are already accomplishing a lot of this minimalism,
since the MTA on the Mailman VM is only accepting the message via SMTP,
then handing it off to Mailman via the Postfix aliases. The spam and
other checks are done beforehand by another upstream gateway MTA. That
gateway then hands mailing list messages off to the Mailman box.


You're talking about inbound, and how you have outsourced many of these 
kinds of checks to other boxes.  That's fine as far as it goes, but I 
was talking about *outbound*, from Mailman to the world of recipients.



You are likely to have a certain number of messages coming into your 
system which will require a certain amount of processing to scan them 
for viruses and spam, etc.


However, on outbound, you will presumably have this same number of 
messages multiplied by the number of recipients.


If that's an average of ten recipients per list, then you have a factor 
of ten increase in the amount of work done to scan those messages for 
viruses and spam -- and since all those messages are largely identical 
in those regards, that's all wasted work, and therefore that's all work 
that you want to avoid to the greatest degree possible.


As you scale up to thousands, tens of thousands, or hundreds of 
thousands of recipients, the more work you can avoid doing on the 
outbound side, the better.
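
To put rough numbers on the multiplication described above, here is a minimal Python sketch. The function name and the sample figure of 500 inbound messages are made up for illustration; only the factor-of-ten case comes from the message itself.

```python
def scan_workload(inbound_messages, avg_recipients):
    """Return (inbound_scans, outbound_scans) if every outbound copy were scanned.

    Hypothetical helper: it only multiplies, to show how the outbound
    side grows with list size while the inbound side stays constant.
    """
    inbound_scans = inbound_messages
    outbound_scans = inbound_messages * avg_recipients
    return inbound_scans, outbound_scans


# 500 inbound messages is an arbitrary illustrative figure.
for recipients in (10, 1000, 100000):
    inbound, outbound = scan_workload(500, recipients)
    print(f"{recipients:>6} recipients per list: {inbound} inbound scans, "
          f"{outbound} outbound scans ({outbound // inbound}x)")
```

Since the outbound copies are largely identical as far as virus and spam content goes, every scan beyond the inbound one is repeated work.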



This is true for subscribers which are not part of our organization
-  the MTA which Mailman relays to accepts the messages, and then deals
with any delivery issues. However, accounts for which this MTA is the
final destination, will tempfail under certain conditions, like
mismatched attributes in an LDAP record, or an issue with the mailstore.


And those are precisely the circumstances under which the MTA should not 
be handing a tempfail condition back to Mailman.  It should go ahead and 
blindly accept those messages and accept responsibility for them, and 
then it should deal with those tempfail cases internally.


Mailman is really, really bad at handling large queues for all the same 
reasons that MTAs from twenty years ago were bad at handling large 
queues -- they're largely single-threaded, disk-bound, and use a single 
outbound directory for all file locking and message queueing, which 
means that they are absolutely decimated when it comes to having to scan 
a linear linked list on disk when trying to store the next file or pull 
up the next file.


Modern MTAs are fully multi-threaded, they keep their active queue in 
memory as opposed to putting them on disk, and they hash the disk queues 
for inactive messages over a large distributed set of directories so if 
one process is working on the files in a given directory then the odds 
are vanishingly small that any other process would be blocked waiting on 
the lock for that directory.
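
As an illustration of the hashed-directory idea, here is a minimal Python sketch. It is not how any particular MTA lays out its spool; the function name, the fan-out of 256 and the spool path are all assumptions made up for the example.

```python
import hashlib
from pathlib import Path


def hashed_queue_path(queue_root, message_id, fanout=256):
    """Map a message ID to a file inside one of `fanout` bucket directories.

    Hypothetical layout: <queue_root>/<two hex digits>/<sha1>.msg
    Spreading files over many small directories keeps each directory
    short, so two workers rarely contend for the same one.
    """
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % fanout
    return Path(queue_root) / f"{bucket:02x}" / f"{digest}.msg"


# Example: five made-up message IDs land in (most likely) different buckets.
for i in range(5):
    print(hashed_queue_path("/var/spool/example-queue", f"<msg{i}@example.org>"))
```

With a single flat directory, every store or fetch has to search one ever-growing list of entries; with hashed buckets, each lookup only touches a directory holding roughly 1/256 of the queue.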



You wouldn't put a Model-T Ford into a Formula-1 race today, and 
likewise you should not be depending on ancient queueing methods as your 
bottleneck for handling all your outgoing mail.


Or, if you have no choice but to depend on them at all, then you should 
minimize your dependence on them as much as you possibly can.



For better or worse, we are moving a lot of our mailboxes to mail
forwards over the next few months - this will move the rest of these
tempfails out of Mailman's SMTP / retry queue, and into the downstream
relay (where they belong).


From Mailman's perspective, your local MTA *IS* the downstream relay, 
and it should not be causing these kinds of loads to be put on Mailman.


Pull as much of the queueing as possible out of Mailman and put it into 
your local MTA.  From there, it becomes an MTA problem, and it doesn't 
matter to Mailman whether the mailboxes are local or remote.



I say all this as a specialist in designing and building large-scale 
mail systems (such as AOL), a long-term member of the Mailman project, 
and a member of the postmaster team for python.org where all the 
official Mailman mailing lists are hosted -- using Mailman.


--
Brad Knowles b...@shub-internet.org
LinkedIn Profile: http://tinyurl.com/y8kpxu
--
Mailman-Users mailing list Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org


Re: [Mailman-Users] Mailman throughput

2011-08-15 Thread Brad Knowles

On 08/15/2011 02:49 AM, Brad Knowles wrote:


You're talking about inbound, and how you have outsourced many of these
kinds of checks to other boxes. That's fine as far as it goes, but I was
talking about *outbound*, from Mailman to the world of recipients.


You are likely to have a certain number of messages coming into your
system which will require a certain amount of processing to scan them
for viruses and spam, etc.

However, on outbound, you will presumably have this same number of
messages multiplied by the number of recipients.


I just thought of an analogy that I think will be very useful here. 
Input and output are two related, but very different processes -- both 
for computers as well as humans.  Having a pee is a different process 
from drinking a beer -- related, but still different.


Generally speaking, you want to avoid mixing your inputs and your 
outputs -- and this gets more and more important as you scale up.  A 
single person who pees in the Colorado River is not going to materially 
impact the water quality of the downstream communities, but if an entire 
city were to dump untreated sewage into the river on an ongoing basis, 
that would be a different matter.



Likewise with e-mail, what works well for you as a small site is 
probably going to be something that you find doesn't necessarily work so 
well as you get bigger and bigger.  Mixing your inputs and outputs is 
one of those factors.


For example, when processing incoming e-mail, you want to apply one set 
of rules for handling viruses, but you want to apply a different set for 
outbound mail.  In both cases, you want to notify the internal person at 
your site about the situation and let them work out how to deal with the 
issue, but they are the recipient on inbound and they are the sender on 
outbound -- so you can't adopt a simple "always notify the sender" or 
"always notify the recipient" policy.
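
The notification rule described above can be written down as a tiny Python sketch. The function name and the "inbound"/"outbound" labels are assumptions chosen for the example, not part of any real scanner's configuration.

```python
def party_to_notify(direction):
    """Return which side of the message is the internal person to notify.

    Hypothetical helper: on inbound mail the internal person is the
    recipient, on outbound mail the internal person is the sender.
    """
    if direction == "inbound":
        return "recipient"
    if direction == "outbound":
        return "sender"
    raise ValueError(f"unknown direction: {direction!r}")


for direction in ("inbound", "outbound"):
    print(direction, "->", "notify the", party_to_notify(direction))
```

The point is that the policy depends on the direction of flow, which is only easy to know if the inbound and outbound paths are kept separate.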


If you have performance complaints, then you have to look at where your 
bottlenecks are and what those bottlenecks do to you.  Eliminate the 
biggest bottlenecks first, then work on the next one.  If cost is a 
factor, then try to find big bottlenecks that you can fix that won't 
cost as much money, and keep working on eliminating those key 
bottlenecks as you find whatever the new issue is.  Again, mixing inputs 
and outputs tends to be one of those key bottlenecks, both overall and 
with regards to return-on-investment.



In the case of Mailman, we can reasonably guarantee that we follow the 
GIGO principle -- Garbage In, Garbage Out.  If you can keep the inbound 
flow of e-mail clean, then there's nothing that Mailman does that should 
make the outbound flow dirty again, so you can safely bypass all the 
checks that you would normally make at the MTA level for outbound mail 
from Mailman.


At least, as far as your local MTA is concerned, you can eliminate all 
those checks.  If the checks are done at your edge, then changes to your 
local MTA won't have any impact on whether or not that work is done and 
how much it costs you, but at least you can avoid causing unnecessary 
additional load on Mailman itself.



Of course, the nature of mailing lists means that Mailman will multiply 
by orders of magnitude the amount of work to be done on outbound as 
compared to inbound, so if you can eliminate any of those unnecessary 
checks then that will tend to be a huge win overall with regards to both 
performance and monetary cost -- you won't have to devote so much money 
and resources to building a larger system to handle the flow, if you can 
make sure that the Mailman part of that flow is already clean and 
therefore doesn't need to be re-checked.




So, the general rule is: don't mix the inputs and outputs, especially 
as you scale up.


--
Brad Knowles b...@shub-internet.org
LinkedIn Profile: http://tinyurl.com/y8kpxu


Re: [Mailman-Users] Cannot to make work mailman correctly

2011-08-15 Thread James Brown
On 11.08.2011 17:01, Mark Sapiro wrote:
 On 8/9/2011 11:46 PM, James Brown wrote:
 I have a VDS running FreeBSD 8.1-STABLE which I use for hosting
 public sites and so on.
 On that VDS I have a mail system which consists of exim-4.76,
 which receives and sends email to/from local mail users, and
 dovecot-1.2.17, which lets email clients retrieve the received mail.
 I want to set up Mailman to run mailing lists.
 First, I created the subdomain 'list.somename.name' in my BIND settings. It
 works correctly.
 Then, after installing Mailman from ports, I set up my Apache to
 work with it. My Apache works correctly too.
 Then I checked and changed my Exim configuration file and tested
 the list address verification command, with the following result:
  exim -bt n...@list.somename.name
 Address rewritten as: n...@list.somename.name
 n...@list.somename.name
 <-- n...@list.somename.name
 <-- n...@list.somename.name
 <-- n...@list.somename.name
   router = localuser, transport = local_delivery
 
 
 For some reason, your 'mailman' router is not meeting all its
 conditions and Exim is proceeding to 'localuser'. 

I found what was wrong in my Exim configuration and now I get the following:
  exim -bt n...@list.somename.name
 Address rewritten as: n...@list.somename.name
 n...@list.somename.name
 <-- n...@list.somename.name
 <-- n...@list.somename.name
   router = mailman, transport = mailman


Is your definition
 
 MAILMAN_HOME=/usr/local/mailman
 
 correct? I.e. are your lists in the directory /usr/local/mailman/lists?
 
Yes, it is. That is the configuration created when installing Mailman
on FreeBSD from ports.
Maybe I need to install Mailman from source into a home directory
instead?
 
 Next I created a site-wide mailing list and a regular mailing list as
 described in /usr/local/share/doc/mailman/mailman-install.txt, but it
 does not work well.
 It is possible to subscribe to the list through the web form, and the
 local email users I tried to subscribe do receive the subscription
 confirmation emails. It is possible to confirm the subscription
 through the web but not through email
 
 
 Because email to mailman doesn't work as above.
 
I fixed the above, but now I get the following:
  1QsvUg-0004Bz-06 <= a...@somename.name H=hostname ([0.0.0.0]) 
 [76.11.218.145] P=esmtpsa X=TLSv1:CAMELLIA256-SHA:256 
 A=plain:a...@somename.name S=744 id=4e49050b.5050...@somename.name from 
 a...@somename.name for n...@list.somename.name
  1QsvUg-0004Bz-06 ** n...@list.somename.name (n...@list.somename.name) 
 n...@list.somename.name R=mailman T=mailman: Child process of mailman 
 transport returned 2 from command: /usr/local/mailman/mail/mailman
  1QsvUg-0004Bz-06 Completed

The subscribers receive emails like the following:
 This message was created automatically by mail delivery software.
 
 A message that you sent could not be delivered to one or more of its
 recipients. This is a permanent error. The following address(es) failed:
 
   n...@list.somename.name
 local delivery failed


So the messages appear neither on the list (via email) nor in the
web archive.
 
 and it is impossible to receive
 emails from the list or to see messages sent to the list in the web archive.
 
 
 Presumably this is again because mail TO the list is not received by
 Mailman.
 
 
 In light of the above I have some questions:
 1) What am I doing wrong?
 
 
 For some reason, Exim is not routing list mail per the 'mailman' router.
 The only thing I see is that the require_files =
 MAILMAN_HOME/lists/$local_part/config.pck is not satisfied, presumably
 because MAILMAN_HOME is not defined to the correct path.
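
The require_files condition mentioned above can be checked by hand with a few lines of Python. This is only a sketch: the function name is made up, and it tests the path as the user running the script, whereas Exim evaluates the condition with its own privileges, so the results can differ if permissions are restrictive.

```python
from pathlib import Path


def list_config_present(mailman_home, local_part):
    """Return True if MAILMAN_HOME/lists/<local_part>/config.pck exists.

    Hypothetical helper mirroring the path the 'mailman' router's
    require_files setting is described as checking.
    """
    return (Path(mailman_home) / "lists" / local_part / "config.pck").is_file()


# Example with the MAILMAN_HOME from this thread and the list name 'news'
# that appears later in it; prints False on a machine without that list.
print(list_config_present("/usr/local/mailman", "news"))
```

If this returns False for a list that should exist, the MAILMAN_HOME macro or the list name is the first thing to look at.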
 
ls -l /usr/local/mailman
total 36
drwxrwsr-x  11 root  mailman  1536  8 Aug 16:50 Mailman
drwxrwsr-x   4 root  mailman   512  8 Aug 12:09 archives
drwxrwsr-x   2 root  mailman  1024  8 Aug 12:09 bin
drwxrwsr-x   2 root  mailman   512  8 Aug 12:09 cgi-bin
drwxrwsr-x   2 root  mailman   512  8 Aug 12:09 cron
drwxrwsr-x   2 root  mailman   512 10 Aug 09:36 data
drwxrwsr-x   2 root  mailman   512  8 Aug 12:09 icons
drwxrwsr-x   4 root  mailman   512  8 Aug 12:45 lists
drwxrwsr-x   2 root  mailman   512 15 Aug 12:00 locks
drwxrwsr-x   2 root  mailman   512 13 Aug 07:30 logs
drwxrwsr-x   2 root  mailman   512  8 Aug 12:56 mail
drwxrwsr-x  38 root  mailman   512  8 Aug 12:09 messages
drwxrwsr-x   2 root  mailman   512  8 Aug 12:09 pythonlib
drwxrwsr-x  11 root  mailman   512  8 Aug 12:26 qfiles
drwxrwsr-x   2 root  mailman   512  8 Aug 12:09 scripts
drwxrwsr-x   2 root  mailman   512  8 Aug 12:09 spam
drwxrwsr-x  39 root  mailman   512  8 Aug 12:09 templates
drwxrwsr-x   4 root  mailman   512  8 Aug 12:09 tests

pkg_info -L mailman-2.1.14_5

Information for mailman-2.1.14_5:

Files:
/usr/local/www/icons/PythonPowered.png
/usr/local/www/icons/mailman.jpg
/usr/local/www/icons/mm-icon.png
/usr/local/www/icons/powerlogo.gif
/usr/local/mailman/Mailman/Archiver/Archiver.py

Re: [Mailman-Users] Are Mailman's files sparse?

2011-08-15 Thread Barry Warsaw
On Aug 14, 2011, at 10:12 PM, Ivan Fetch wrote:

As part of copying our Mailman data from one box to another, I wanted to
verify: are any of Mailman's data files, sparse, E.G the .pck files and
other files in the data directory?

It looks to me like the answer to this is no.

Correct.
-Barry
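
For anyone who wants to verify this on their own installation before copying, a minimal Python sketch follows. It relies on the common Unix convention that st_blocks counts 512-byte units (st_blocks is not available on Windows), and the function names and the example path are assumptions. Filesystems with compression can also report fewer allocated bytes than the file size, so a hit is a hint rather than proof.

```python
import os


def looks_sparse(size_bytes, blocks_512):
    """True if fewer bytes are allocated on disk than the file's length."""
    return blocks_512 * 512 < size_bytes


def find_sparse_files(top):
    """Yield paths under `top` whose allocated size is below their length."""
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if looks_sparse(st.st_size, st.st_blocks):
                yield path


# Hypothetical data directory; os.walk yields nothing if it does not exist.
for path in find_sparse_files("/usr/local/mailman/data"):
    print("possibly sparse:", path)
```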


Re: [Mailman-Users] Mailman throughput

2011-08-15 Thread Ivan Fetch
Hi Brad,

On Aug 15, 2011, at 1:49 AM, Brad Knowles wrote:

On 08/14/2011 11:24 PM, Ivan Fetch wrote:

Brad, I think we are already accomplishing a lot of this minimalism,
since the MTA on the Mailman VM is only accepting the message via SMTP,
then handing it off to Mailman via the Postfix aliases. The spam and
other checks are done beforehand by another upstream gateway MTA. That
gateway then hands mailing list messages off to the Mailman box.

You're talking about inbound, and how you have outsourced many of these
kinds of checks to other boxes.  That's fine as far as it goes, but I
was talking about *outbound*, from Mailman to the world of recipients.


You are likely to have a certain number of messages coming into your
system which will require a certain amount of processing to scan them
for viruses and spam, etc.

However, on outbound, you will presumably have this same number of
messages multiplied by the number of recipients.

If that's an average of ten recipients per list, then you have a factor
of ten increase in the amount of work done to scan those messages for
viruses and spam -- and since all those messages are largely identical
in those regards, that's all wasted work, and therefore that's all work
that you want to avoid to the greatest degree possible.

As you scale up to thousands, tens of thousands, or hundreds of
thousands of recipients, the more work you can avoid doing on the
outbound side, the better.


OK - now we're on the same page. :) The MTA which Mailman relays to does not 
repeat processes like virus / spam scanning. We are re-working our gateways and 
relays over the next few months to further separate out these roles. E.g., 
quarantine of spam will be handled before a message hits Mailman, not after the 
message has been exploded to list subscribers.



This is true for subscribers which are not part of our organization
-  the MTA which Mailman relays to accepts the messages, and then deals
with any delivery issues. However, accounts for which this MTA is the
final destination, will tempfail under certain conditions, like
mismatched attributes in an LDAP record, or an issue with the mailstore.

And those are precisely the circumstances under which the MTA should not
be handing a tempfail condition back to Mailman.  It should go ahead and
blindly accept those messages and accept responsibility for them, and
then it should deal with those tempfail cases internally.

We are definitely moving to this (the MTA will accept whatever Mailman gives it). 
For the next few months, we will have some local accounts tempfailing, until we 
get off of Sun IMS or JSMS or whatever the product is named today. Part of why 
the relay is tempfailing is that we happen to be using a relay which is also 
a mailstore.



Mailman is really, really bad at handling large queues for all the same
reasons that MTAs from twenty years ago were bad at handling large
queues -- they're largely single threaded, disk bound, and use a single
outbound directory for all file locking and message queueing, which
means that they are absolutely decimated when it comes to having to scan
a linear linked list on disk when trying to store the next file or pull
up the next file.

Modern MTAs are fully multi-threaded, they keep their active queue in
memory as opposed to putting them on disk, and they hash the disk queues
for inactive messages over a large distributed set of directories so if
one process is working on the files in a given directory then the odds
are vanishingly small that any other process would be blocked waiting on
the lock for that directory.

Ah, good to know re: Mailman queueing. So the only reason why things should be 
in qfiles/retry would be something like a relay being unavailable.
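
One way to keep an eye on this is to count the entries in each queue directory. The sketch below is a minimal Python example; the function name is made up, and the qfiles path is an assumption that depends on where Mailman is installed.

```python
import os


def queue_depths(qfiles_dir):
    """Return {queue_name: number_of_files} for each subdirectory of qfiles."""
    depths = {}
    for entry in sorted(os.listdir(qfiles_dir)):
        path = os.path.join(qfiles_dir, entry)
        if os.path.isdir(path):
            depths[entry] = sum(
                1 for name in os.listdir(path)
                if os.path.isfile(os.path.join(path, name))
            )
    return depths


# Assumed install location; adjust to the local Mailman prefix.
QFILES = "/usr/local/mailman/qfiles"
if os.path.isdir(QFILES):
    for queue, count in queue_depths(QFILES).items():
        print(f"{queue:10s} {count}")
```

A retry queue that keeps growing while the relay is reachable would suggest that the relay is still handing tempfails back to Mailman.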


For better or worse, we are moving a lot of our mailboxes to mail
forwards over the next few months - this will move the rest of these
tempfails out of Mailman's SMTP / retry queue, and into the downstream
relay (where they belong).

From Mailman's perspective, your local MTA *IS* the downstream relay,
and it should not be causing these kinds of loads to be put on Mailman.

Pull as much of the queueing as possible out of Mailman and put it into
your local MTA.  From there, it becomes an MTA problem, and it doesn't
matter to Mailman whether the mailboxes are local or remote.

When you say local MTA, you don't mean strictly local to the Mailman box, 
right? I believe you mean local as in a separate relay box.


I say all this as a specialist in designing and building large-scale
mail systems (such as AOL), a long-term member of the Mailman project,
and a member of the postmaster team for python.org where all the
official Mailman mailing lists are hosted -- using Mailman.


Thanks Brad, for your time on this, and your later analogy RE: input and output.

- Ivan

Re: [Mailman-Users] Cannot to make work mailman correctly

2011-08-15 Thread Mark Sapiro
On 8/15/2011 5:08 AM, James Brown wrote:
 On 11.08.2011 17:01, Mark Sapiro wrote:

 For some reason, your 'mailman' router is not meeting all its
 conditions and Exim is proceeding to 'localuser'. 
 
 I found what was wrong in my Exim configuration and now I get the following:
  exim -bt n...@list.somename.name
 Address rewritten as: n...@list.somename.name
 n...@list.somename.name
 <-- n...@list.somename.name
 <-- n...@list.somename.name
   router = mailman, transport = mailman


OK. This is now good.


[...]

 I fixed the above, but now I get the following:
  1QsvUg-0004Bz-06 <= a...@somename.name H=hostname ([0.0.0.0]) 
 [76.11.218.145] P=esmtpsa X=TLSv1:CAMELLIA256-SHA:256 
 A=plain:a...@somename.name S=744 id=4e49050b.5050...@somename.name from 
 a...@somename.name for n...@list.somename.name
  1QsvUg-0004Bz-06 ** n...@list.somename.name (n...@list.somename.name) 
 n...@list.somename.name R=mailman T=mailman: Child process of mailman 
 transport returned 2 from command: /usr/local/mailman/mail/mailman
  1QsvUg-0004Bz-06 Completed


Status 2 from the /usr/local/mailman/mail/mailman command is a group
mismatch error. Your Exim configuration definition of the 'mailman'
transport contains the line

  group = MAILMAN_GROUP

and this is defined by the macro

  MAILMAN_GROUP=mailman

This group, 'mailman', does not match the expected group compiled into
(or configured in by some FreeBSD magic) the
/usr/local/mailman/mail/mailman wrapper.

The command wrote a helpful error message to stderr explaining what
group invoked it and what group it expected, but Exim neither logged it
nor reported it to the user.

To see this message, run the command

  sudo -u mailman /usr/local/mailman/mail/mailman post

This will produce a message like

Group mismatch error.  Mailman expected the mail
wrapper script to be executed as group , but
the system's mail server executed the mail script as
group mailman.  Try tweaking the mail server to run the
script as group , or re-run configure,
providing the command line option `--with-mail-gid=mailman'.

This will tell you the expected group which I have indicated as .
Then you can either change the definition in Exim to

  MAILMAN_GROUP=

or perform whatever FreeBSD magic will change the wrapper's expected
group to mailman.
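
If you capture the wrapper's stderr, the two group names can be pulled out of the message mechanically. The Python sketch below assumes the wording shown above, with or without double quotes around the group names; the function name and the group 'examplegrp' in the sample are made up for illustration.

```python
import re


def parse_group_mismatch(stderr_text):
    """Return (expected_group, actual_group), or None if the text does not match.

    Assumes the 'Group mismatch error' wording quoted in this thread;
    group names may or may not be wrapped in double quotes.
    """
    text = " ".join(stderr_text.split())  # undo the line wrapping
    expected = re.search(r'to be executed as group "?([^",\s]+)"?, but', text)
    actual = re.search(r'executed the mail script as group "?([^".\s]+)"?\.', text)
    if not (expected and actual):
        return None
    return expected.group(1), actual.group(1)


# Sample text; 'examplegrp' is a hypothetical expected group.
SAMPLE = """Group mismatch error.  Mailman expected the mail
wrapper script to be executed as group examplegrp, but
the system's mail server executed the mail script as
group mailman.  Try tweaking the mail server to run the
script as group examplegrp, or re-run configure,
providing the command line option `--with-mail-gid=mailman'."""

print(parse_group_mismatch(SAMPLE))
```

The first value is the group to put into the MAILMAN_GROUP macro; the second is the group Exim is currently using.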


 The subscribers receive emails like the following:
 This message was created automatically by mail delivery software.

 A message that you sent could not be delivered to one or more of its
 recipients. This is a permanent error. The following address(es) failed:

   n...@list.somename.name
 local delivery failed
 
 
 So the messages appear neither on the list (via email) nor in the
 web archive.

 and it is impossible to receive
 emails from the list or to see messages sent to the list in the web archive.


 Presumably this is again because mail TO the list is not received by
 Mailman.


 In light of the above I have some questions:
 1) What am I doing wrong?


 For some reason, Exim is not routing list mail per the 'mailman' router.
 The only thing I see is that the require_files =
 MAILMAN_HOME/lists/$local_part/config.pck is not satisfied, presumably
 because MAILMAN_HOME is not defined to the correct path.

 ls -l /usr/local/mailman
 total 36
[...]
 drwxrwsr-x   4 root  mailman   512  8 Aug 12:45 lists
[...]
 
 and so on.
 Is it wrong?


Whatever was wrong was fixed by whatever you did to make Exim invoke the
'mailman' router for list mail. You need do nothing more.



 2) Where do I need to put the aliases after creating a new list: in
 /etc/aliases (which is a symbolic link to /etc/mail/aliases in FreeBSD)
 or in /usr/local/etc/exim/aliases?


 You don't need aliases.  List mail should be handled by the 'mailman'
 router and the 'mailman' transport.

 I.e. I don't need to create the aliases that Mailman lists after
 creating a list, such as the following:
 The mailing list `news' has been created via the through-the-web
 interface.  In order to complete the activation of this mailing list, the
 proper /etc/aliases (or equivalent) file must be updated.  The program
 `newaliases' may also have to be run.
 
 Here are the entries for the /etc/aliases file:
 
 news:  |/usr/local/mailman/mail/mailman post news
 news-admin:|/usr/local/mailman/mail/mailman admin news
 news-bounces:  |/usr/local/mailman/mail/mailman bounces news
 news-confirm:  |/usr/local/mailman/mail/mailman confirm news
 news-join: |/usr/local/mailman/mail/mailman join news
 news-leave:|/usr/local/mailman/mail/mailman leave news
 news-owner:|/usr/local/mailman/mail/mailman owner news
 news-request:  |/usr/local/mailman/mail/mailman request news
 news-subscribe:|/usr/local/mailman/mail/mailman subscribe news
 news-unsubscribe:  |/usr/local/mailman/mail/mailman unsubscribe news


You get this message because you have not put

MTA = None

in mm_cfg.py. You can either put that line in mm_cfg.py or ignore the
message about aliases