On 08/15/2011 02:49 AM, Brad Knowles wrote:

You're talking about inbound, and how you have outsourced many of these
kinds of checks to other boxes. That's fine as far as it goes, but I was
talking about *outbound*, from Mailman to the world of recipients.


You are likely to have a certain number of messages coming into your
system which will require a certain amount of processing to scan them
for viruses and spam, etc.

However, on outbound, you will presumably have this same number of
messages multiplied by the number of recipients.
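
To put rough numbers on that multiplication, here is a minimal sketch. The function name and the figures (200 posts, 1000 subscribers) are made up for illustration, and it assumes the worst case where every outbound copy gets scanned separately.

```python
def scan_workload(inbound_messages, recipients_per_message):
    """Return (inbound_scans, outbound_scans), assuming one scan per copy."""
    # Each message is scanned once on the way in.
    inbound_scans = inbound_messages
    # Worst case on the way out: one scan per recipient copy.
    outbound_scans = inbound_messages * recipients_per_message
    return inbound_scans, outbound_scans


# Hypothetical list: 200 posts a day, 1000 subscribers.
inbound, outbound = scan_workload(inbound_messages=200,
                                  recipients_per_message=1000)
print(inbound, outbound)  # 200 200000
```

An MTA that handles several recipients in one transaction may do fewer scans than this, so treat the outbound figure as an upper bound.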

I just thought of an analogy that I think will be very useful here. Input and output are two related but very different processes -- for computers as well as for humans. Having a pee is a different process from drinking a beer -- related, but still different.

Generally speaking, you want to think carefully before mixing your inputs and your outputs -- and this gets more and more important as you scale up. A single person who pees in the Colorado River is not going to materially affect the water quality of the downstream communities, but if an entire city were to dump untreated sewage into the river on an ongoing basis, that would be a different matter.


Likewise with e-mail, what works well for you as a small site will probably not work so well as you get bigger and bigger. Mixing your inputs and outputs is one of those things.

For example, when processing incoming e-mail, you want to apply one set of rules for handling viruses, but you want to apply a different set for outbound mail. In both cases, you want to notify the internal person at your site about the situation and let them work on how to deal with the issue, but they are the recipient on inbound and they are the sender on outbound -- so you can't take a simple "always notify the sender" or "always notify the recipient" policy.
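
The notification rule above can be sketched in a few lines. This is only an illustration: the function and constant names are hypothetical, and it assumes you already know which direction a message is travelling and that the internal party is the recipient on inbound and the sender on outbound.

```python
INBOUND = "inbound"
OUTBOUND = "outbound"


def party_to_notify(direction, sender, recipient):
    """Pick the internal person to tell about an infected message."""
    if direction == INBOUND:
        # Mail arriving from outside: the internal person is the recipient.
        return recipient
    if direction == OUTBOUND:
        # Mail leaving the site: the internal person is the sender.
        return sender
    raise ValueError("unknown direction: %r" % (direction,))


# Example addresses are placeholders.
print(party_to_notify(INBOUND, "stranger@example.net", "staff@example.org"))
print(party_to_notify(OUTBOUND, "staff@example.org", "stranger@example.net"))
```

Both calls print staff@example.org, which is the point: the answer depends on the direction, not on a fixed "sender" or "recipient" role.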

If you have performance complaints, then you have to look at where your bottlenecks are and what they cost you. Eliminate the biggest bottleneck first, then work on the next one. If cost is a factor, then look for big bottlenecks that are cheap to fix, and keep going as each fix exposes the next one. Again, mixing inputs and outputs tends to be one of those key bottlenecks, both overall and with regard to return on investment.


In the case of Mailman, we can reasonably guarantee that we follow the GIGO principle -- Garbage In, Garbage Out. If you can keep the inbound flow of e-mail clean, then there's nothing that Mailman does that should make the outbound flow dirty again, so you can safely bypass all the checks that you would normally make at the MTA level for outbound mail from Mailman.

At least, as far as your local MTA is concerned, you can eliminate all those checks. If the checks are done at your edge, then changes to your local MTA won't have any impact on whether or not that work is done and how much it costs you, but at least you can avoid causing unnecessary additional load on Mailman itself.
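
One way such a bypass is sometimes arranged, sketched here under assumptions: give Mailman its own SMTP listener on the local MTA, with the content checks switched off on that listener only, and point Mailman at it. The fragment below is meant for Mailman 2.1's mm_cfg.py, where SMTPHOST and SMTPPORT are the delivery settings. The port number 10025 and the existence of a check-free listener on it are assumptions; you would have to set that listener up in your MTA yourself.

```python
# Lines to add to Mailman 2.1's mm_cfg.py (a sketch, not a drop-in config).

# Deliver to the MTA on the local machine.
SMTPHOST = '127.0.0.1'

# ASSUMPTION: a separate listener on this port accepts mail from Mailman
# without re-running virus/spam checks. 10025 is only an example value.
SMTPPORT = 10025
```

Mail from everywhere else keeps going through the normal, fully checked listener, so only the traffic that Mailman has already received clean skips the second round of scanning.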


Of course, the nature of mailing lists means that Mailman will multiply by orders of magnitude the amount of work to be done on outbound as compared to inbound. So if you can eliminate any of those unnecessary checks, that will tend to be a huge win overall with regard to both performance and monetary cost. You won't have to devote so much money and so many resources to building a larger system to handle the flow, if you can make sure that the Mailman part of that flow is already clean and therefore doesn't need to be re-checked.



So, the general rule is: don't mix your inputs and outputs, especially as you scale up.

--
Brad Knowles <b...@shub-internet.org>
LinkedIn Profile: <http://tinyurl.com/y8kpxu>
------------------------------------------------------
Mailman-Users mailing list Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/