Was: How to report 120,000 spams a day
Hi,

I wanted to thank everyone who responded both on and off list. In the end there was still a lot of confusion from people about my configuration, my intentions, my setup, and some things I said, but it's really not worth rehashing again. The end result is I've changed my setup.

The other good that came out of this is that my [EMAIL PROTECTED] Recent Average Credit went up by 10% total.

Thanks again, Tuc
Re: How to report 120,000 spams a day
> So then they tell me to push the virtusertable out to the MX's.
>
> So I've asked multiple people multiple times how, using sendmail on an MX that's not a final delivery server, to use the virtusertable to accept the mail, process it against the virtusertable, and then, when the final delivery server is contactable, send it there. Of what I've read, no one can tell me.

What's the problem? Sendmail will resolve the recipients using virtusertable, and queue and retry until it can send. There's nothing to it.

Joseph Brennan
Columbia University Information Technology
Re: How to report 120,000 spams a day
On 3/10/2008 7:15 PM, Tuc at T-B-O-H.NET wrote:
> In any case, if someone can explain the mechanics of having a sendmail MX that is not the final delivery server do localized verification against something and then pass it along to the final delivery server, please let me know. It's not that I don't want to do any of this at all; it's that, from what I know, at last look, the virtusertable is only consulted during final delivery.

"milter-ahead" (http://snertsoft.com) is your friend: easy to use and rock solid.
RE: How to report 120,000 spams a day
At 13:38 10-03-2008, James E. Pratt wrote:
> No. "Possible mail loss" is really the correct term. Just because I have no backup MX, it does not mean I will lose mail (mail loss can be, and usually is, caused by many more issues than just no backup/secondary MX).

Yes.

At 14:20 10-03-2008, Bob Proulx wrote:
> Loss of mail cannot result solely from a primary MX being offline.

See comment about "possible".

> If you believe that it can then let me ask a related but different question. Does your mail server ever return a 4xx code such as:
>
> - out of disk space / insufficient system storage
> - dns temporarily unavailable / domain service not available
> - service not available
> - mailbox not available
> - user quota exceeded
> - too many simultaneous connections
> - other

Yes.

> If the mta does ever return such a code then (for the sake of argument) the same potential for mail loss exists. If you believe that this causes loss then you would want to ensure that none of these conditions can ever happen. Of course this is impossible due to practical limits.

In general, we seek to minimize the impact within practical limits. I have seen cases when a backup MX is useful but I won't call it a general rule as it can be more of a problem than it is worth.

Regards,
-sm
Re: How to report 120,000 spams a day
Sandy S wrote:
> OK, I admit I haven't been following this thread closely so I may have missed something and maybe my suggestion won't fit your needs. However, we're accomplishing something like what you describe above using Mimedefang. The Mimedefang milter includes a function called md_check_against_smtp_server which checks the recipient address against the virtusertable defined on whatever MX server you give it. If it's not a valid user, voila! The message is rejected during the Mimedefang processing, i.e. as soon as the connecting server has provided the recipient address, before the whole message has been transmitted. Otherwise processing and mail delivery continues as normal.

You beat me to it!

I'll just add that people have discussed alternate solutions on the MD archives that, instead of using md_check_against_smtp_server, involve exporting the list to the remote MX so that it can still query that information if/when the primary is unavailable.

Looking through the MIMEDefang mailing list archives is left as an exercise for the reader.

--
Kelson Vibber
SpeedGate Communications
Re: How to report 120,000 spams a day
James E. Pratt wrote:
> > Bob Proulx wrote:
> > > What would have been the downside of *not* having a backup MX? The
> >
> > Loss of mail.
>
> No. "Possible mail loss" is really the correct term. Just because I have no backup MX, it does not mean I will lose mail (mail loss can be, and usually is, caused by many more issues than just no backup/secondary MX).

Loss of mail cannot result solely from a primary MX being offline.

If you believe that it can then let me ask a related but different question. Does your mail server ever return a 4xx code such as:

- out of disk space / insufficient system storage
- dns temporarily unavailable / domain service not available
- service not available
- mailbox not available
- user quota exceeded
- too many simultaneous connections
- other

If the mta does ever return such a code then (for the sake of argument) the same potential for mail loss exists. If you believe that this causes loss then you would want to ensure that none of these conditions can ever happen. Of course this is impossible due to practical limits.

Fortunately for us mail transfer was designed to be robust in the presence of these problems. When a mail transfer agent receives a 400 level response it knows the action was not taken and the condition is temporary in nature. The mta retry interval should be at least 30 minutes. The mta should give up retrying after at least 4-5 days.

Bob
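The retry behaviour Bob describes can be sketched in a few lines of Python. This is only an illustration of the rule of thumb in his message, not any real MTA's queue logic: the function names, the fixed 30-minute interval, and the 5-day cutoff are assumptions chosen to match the figures he quotes.

```python
from datetime import timedelta

# Assumed values, taken from the figures in the message above.
RETRY_INTERVAL = timedelta(minutes=30)  # "at least 30 minutes"
GIVE_UP_AFTER = timedelta(days=5)       # "at least 4-5 days"


def classify_reply(code):
    """Map an SMTP reply code to what the sending MTA does next."""
    if 200 <= code < 300:
        return "delivered"    # action taken, nothing more to do
    if 400 <= code < 500:
        return "retry-later"  # action not taken, condition is temporary
    if 500 <= code < 600:
        return "bounce"       # permanent failure, tell the sender
    raise ValueError(f"not a final SMTP reply code: {code}")


def next_action(code, queued_for):
    """Decide what to do with a message that has been queued for `queued_for`."""
    action = classify_reply(code)
    if action == "retry-later" and queued_for >= GIVE_UP_AFTER:
        return "bounce"       # temporary failures eventually become a bounce
    return action


def max_attempts():
    """How many evenly spaced retries fit before giving up."""
    return GIVE_UP_AFTER // RETRY_INTERVAL
```

With these assumed values a sender would make 240 attempts over five days before bouncing, which is why an eight-hour outage of the only MX normally delays mail rather than losing it.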
Re: How to report 120,000 spams a day
Tuc at T-B-O-H.NET wrote:
> Hi,
>
> Everyone keeps telling me to push the userlist out to the MX. This isn't possible, since everything is handled in virtusertable.
>
> So then they tell me to push the virtusertable out to the MX's.
>
> So I've asked multiple people multiple times how, using sendmail on an MX that's not a final delivery server, to use the virtusertable to accept the mail, process it against the virtusertable, and then, when the final delivery server is contactable, send it there. Of what I've read, no one can tell me.
>
> Maybe I'm missing a fundamental fact. Are virtusertables checked during non final delivery MX handling in sendmail?

OK, I admit I haven't been following this thread closely so I may have missed something and maybe my suggestion won't fit your needs. However, we're accomplishing something like what you describe above using Mimedefang. The Mimedefang milter includes a function called md_check_against_smtp_server which checks the recipient address against the virtusertable defined on whatever MX server you give it. If it's not a valid user, voila! The message is rejected during the Mimedefang processing, i.e. as soon as the connecting server has provided the recipient address, before the whole message has been transmitted. Otherwise processing and mail delivery continues as normal.

The man pages warn about this causing a little more overhead on the server you're checking along with extra log entries, but it has not been a problem here.

Sandy
RE: How to report 120,000 spams a day
> -Original Message-
> From: SM [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 10, 2008 3:49 PM
> To: users@spamassassin.apache.org
> Subject: Re: How to report 120,000 spams a day
>
> At 11:47 10-03-2008, Bob Proulx wrote:
> > What would have been the downside of *not* having a backup MX? The
>
> Loss of mail.

No. "Possible mail loss" is really the correct term. Just because I have no backup MX, it does not mean I will lose mail (mail loss can be, and usually is, caused by many more issues than just no backup/secondary MX).

> > mail would have remained in the mailqueue. Comcast, AOL, Yahoo, Gmail, corporate servers, private servers, etc. would have retried to send the mail to you later. When your main mail relay came online they would have retried and delivered it. There would have been NO DIFFERENCE at all. You didn't need your backup MX relay to proxy relay the mail to you.
>
> The difference is that you are making assumptions about their retry strategy.

Yes, all are different. In the grand scheme though, who cares? We've had no "backup mx" here for over 5 years, and have lost no mail that I'm aware of (or rather, no one has complained anyhow). We've been down once for like 8 hours and lost nothing as far as I could tell. If it were down longer (unlikely with a hot spare ready to go, but beside the point) some stuff would just bounce and the senders would resend it. Life goes on.

Regards,
jp
Re: How to report 120,000 spams a day
Tuc at T-B-O-H.NET wrote:
> > Seriously...
> >
> > How hard is it to set up the MX boxen to only allow 4 email addresses to pass for that particular domain, rejecting all others in the SMTP conversation?
> >
> > Unless the customer is dropping BIG DADDY $$$ with you, tell him policy change and that he isn't losing any email if you do not do a catchall for his domain.
> >
> > That postmaster thing is a monster. Send the postmaster stuff to that customer and see how soon they want it turned off ;->
> >
> > Otherwise do what Kris said and push or pull or whatever all the validrcptto's out to the MX's.
> >
> > - rh
>
> Hi,
>
> Everyone keeps telling me to push the userlist out to the MX. This isn't possible, since everything is handled in virtusertable.
>
> So then they tell me to push the virtusertable out to the MX's.
>
> So I've asked multiple people multiple times how, using sendmail on an MX that's not a final delivery server, to use the virtusertable to accept the mail, process it against the virtusertable, and then, when the final delivery server is contactable, send it there. Of what I've read, no one can tell me.
>
> Maybe I'm missing a fundamental fact. Are virtusertables checked during non final delivery MX handling in sendmail?
>
> The postmaster emails are necessary to be able to find issues with the systems before clients do. I've caught issues with disks going bad, perl updates gone wrong, memory problems, and the most recent was that a client was having email sent directly to their ISP, who finally decided I was a spammer. The "5 days worth of attempts" finally expired and I started seeing all the upchuck from the system. If I turn postmaster bounce off, I lose that. But yea, it might become something I have to do. Lose the ability to monitor things happening on my systems in the name of spam.
>
> I think the issue most people are having is that they have the luxury that every MX in their list is a final delivery host. We don't. MX's for us fall under the heading of "If the sole final delivery host is too overburdened, or is down for maintenance, hold the mail at least until it comes back". That REALLY REALLY worked well for us when the datacenter we were at in NYC went down during 9/11 because the National Guard stopped a fuel delivery truck for an hour. Our MX was uptown. When we finally came back online.
>
> In any case, if someone can explain the mechanics of having a sendmail MX that is not the final delivery server do localized verification against something and then pass it along to the final delivery server, please let me know. It's not that I don't want to do any of this at all; it's that, from what I know, at last look, the virtusertable is only consulted during final delivery.
>
> Thanks, Tuc

You can do this in the access table. You say you only have 4 users, so it isn't going to be much work. Otherwise you can install smf-sav to do the call ahead. I'd probably just do the manual method into the access table, however.

We have several MXs to several backends and use redundant LDAP to do our lookup and routing.
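As a rough picture of what "pushing the user list out to the MX" involves, the Python sketch below extracts the explicit addresses from a virtusertable-style text file and skips the catch-all entry, giving the short list an MX would need in order to refuse everything else. It is an illustration under stated assumptions: the `example.com` addresses are placeholders (the real ones are redacted in this thread), the function names are made up, and turning the result into actual sendmail access-table entries is deliberately left out.

```python
# Placeholder table: four real users plus a catch-all, as described in the thread.
SAMPLE_VIRTUSERTABLE = """\
# address            local user
bingo@example.com    bingo
bango@example.com    bango
bongo@example.com    bongo
irving@example.com   irving
@example.com         nobody
"""


def valid_recipients(virtusertable_text):
    """Return the explicitly listed addresses, ignoring catch-all entries."""
    recipients = set()
    for line in virtusertable_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        key = line.split()[0].lower()     # left-hand column is the address
        if key.startswith("@"):
            continue                      # "@domain" is the wildcard entry
        recipients.add(key)
    return recipients


def rcpt_allowed(address, recipients):
    """What an MX would check at RCPT TO time against the pushed list."""
    return address.strip().lower() in recipients


if __name__ == "__main__":
    allowed = valid_recipients(SAMPLE_VIRTUSERTABLE)
    for rcpt in ("bingo@example.com", "sales@example.com"):
        print(rcpt, "->", "accept" if rcpt_allowed(rcpt, allowed) else "reject")
```

The list only needs regenerating when the virtusertable changes, so for four users a manual copy is probably just as practical, as the reply above suggests.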
Re: How to report 120,000 spams a day
At 11:47 10-03-2008, Bob Proulx wrote:
> What would have been the downside of *not* having a backup MX? The

Loss of mail.

> mail would have remained in the mailqueue. Comcast, AOL, Yahoo, Gmail, corporate servers, private servers, etc. would have retried to send the mail to you later. When your main mail relay came online they would have retried and delivered it. There would have been NO DIFFERENCE at all. You didn't need your backup MX relay to proxy relay the mail to you.

The difference is that you are making assumptions about their retry strategy.

Regards,
-sm
Re: How to report 120,000 spams a day
Tuc at T-B-O-H.NET wrote:
> Everyone keeps telling me to push the userlist out to the MX. This isn't possible, since everything is handled in virtusertable. So then they tell me to push the virtusertable out to the MX's.

You are beginning to understand why MX relays are recommended against. They don't really serve a good purpose today. They do cause hard problems to solve. If you do need one then you also need to solve the hard problems that it pulls in too.

Mail transfer agents can retry mail delivery. They don't need to deliver it if your main mail server is offline. They can wait and send it later when it is online.

> So I've asked multiple people multiple times how, using sendmail on an MX that's not a final delivery server, to use the virtusertable

Did you ask that on the sendmail users mailing list? That would be the place. I couldn't recommend using Sendmail anymore. I recommend Postfix generally but Exim is also a fine MTA.

> I think the issue most people are having is that they have the luxury that every MX in their list is a final delivery host. We don't. MX's for us fall under the heading of "If the sole final delivery host is too overburdened, or is down for maintenance, hold the mail at least until it comes back".

These days many people think that is not a worthwhile reason to have a backup MX. (I am one of those people.) Because of this it isn't solved by anyone because no one wants to work on it.

It is your server and it is okay for you to want to do this. But since you are going against the current best practices it means that fewer people care about solving that problem. Which means that you would need to do it yourself. But if you do a nice solution to the problem then other people who think like you do will be grateful for your efforts. If it is really a very nice solution then it might even fall back into favor as an okay way to do things.

> That REALLY REALLY worked well for us when the datacenter we were at in NYC went down during 9/11 because the National Guard stopped a fuel delivery truck for an hour. Our MX was uptown. When we finally came back online.

What would have been the downside of *not* having a backup MX? The mail would have remained in the mailqueue. Comcast, AOL, Yahoo, Gmail, corporate servers, private servers, etc. would have retried to send the mail to you later. When your main mail relay came online they would have retried and delivered it. There would have been NO DIFFERENCE at all. You didn't need your backup MX relay to proxy relay the mail to you.

Bob
Re: How to report 120,000 spams a day
> Seriously...
>
> How hard is it to set up the MX boxen to only allow 4 email addresses to pass for that particular domain, rejecting all others in the SMTP conversation?
>
> Unless the customer is dropping BIG DADDY $$$ with you, tell him policy change and that he isn't losing any email if you do not do a catchall for his domain.
>
> That postmaster thing is a monster. Send the postmaster stuff to that customer and see how soon they want it turned off ;->
>
> Otherwise do what Kris said and push or pull or whatever all the validrcptto's out to the MX's.
>
> - rh

Hi,

Everyone keeps telling me to push the userlist out to the MX. This isn't possible, since everything is handled in virtusertable.

So then they tell me to push the virtusertable out to the MX's.

So I've asked multiple people multiple times how, using sendmail on an MX that's not a final delivery server, to use the virtusertable to accept the mail, process it against the virtusertable, and then, when the final delivery server is contactable, send it there. Of what I've read, no one can tell me.

Maybe I'm missing a fundamental fact. Are virtusertables checked during non final delivery MX handling in sendmail?

The postmaster emails are necessary to be able to find issues with the systems before clients do. I've caught issues with disks going bad, perl updates gone wrong, memory problems, and the most recent was that a client was having email sent directly to their ISP, who finally decided I was a spammer. The "5 days worth of attempts" finally expired and I started seeing all the upchuck from the system. If I turn postmaster bounce off, I lose that. But yea, it might become something I have to do. Lose the ability to monitor things happening on my systems in the name of spam.

I think the issue most people are having is that they have the luxury that every MX in their list is a final delivery host. We don't. MX's for us fall under the heading of "If the sole final delivery host is too overburdened, or is down for maintenance, hold the mail at least until it comes back". That REALLY REALLY worked well for us when the datacenter we were at in NYC went down during 9/11 because the National Guard stopped a fuel delivery truck for an hour. Our MX was uptown. When we finally came back online.

In any case, if someone can explain the mechanics of having a sendmail MX that is not the final delivery server do localized verification against something and then pass it along to the final delivery server, please let me know. It's not that I don't want to do any of this at all; it's that, from what I know, at last look, the virtusertable is only consulted during final delivery.

Thanks, Tuc
RE: [spamassassin] Re: How to report 120,000 spams a day
Seriously...

How hard is it to set up the MX boxen to only allow 4 email addresses to pass for that particular domain, rejecting all others in the SMTP conversation?

Unless the customer is dropping BIG DADDY $$$ with you, tell him policy change and that he isn't losing any email if you do not do a catchall for his domain.

That postmaster thing is a monster. Send the postmaster stuff to that customer and see how soon they want it turned off ;->

Otherwise do what Kris said and push or pull or whatever all the validrcptto's out to the MX's.

- rh
[spamassassin] Re: How to report 120,000 spams a day
Tuc at T-B-O-H.NET wrote:
> There are "considerations" in doing this. Right now, all my systems are set up running sendmail, and all with the config of:
>
>     define(`confCOPY_ERRORS_TO',`Postmaster')
>
> As such, true to its name, anytime there is an error, the postmaster gets a copy. 120K copies of [snip]

... eww.

> isn't acceptable. Yes, I could take out the COPY_ERRORS_TO, but we also run a lot of things that are piped to programs, and we usually don't see the errors unless that is set.

... O_o Like what? I'm sure there are better ways to receive these other messages without relying on something of a hack to get them. I'd never enable that on any production system I maintain; the (legitimate!) mail volume alone would generate far more error messages that I really don't need to know about than would be worth wading through. (Do you *really* want to get copies of every postmaster response to a legitimate user's mistyped outbound mail?)

For instance, systems here have one of our NOC staff aliases set as the cron mailto; in the event of a cronjob failure, off goes the mail to the people who can deal with it. Many tasks send email to a specific person or alias; and if mail falls apart completely we have the capability to send to pagers or SMS cell phones.

> Even if I did that, though, the next thing I run into is MX's. The MX blindly accepts the mail.

Push a user list out to the MX. Seriously. Blind relays like that are, um, nasty. Mail forwarding is slightly less nasty (you usually only have *one* destination address instead of any destination attracting spam).

I've been there; on a legacy system here I stopped relaying mail for domains I don't have a user list for some time ago - the limited benefit it offered in getting mail to the customer faster wasn't worth the glop in the queue, the postmaster mess, or the hardware and staff-time cost. (Now to convince head office...)

If you can't cut down the volume on the front-line MX, you *will* have to spend CPU and/or disk, somewhere, to deal with the mess. Feeding it to /dev/null as you've been doing is probably about as cheap as you can get. And as others have noted, it's a tainted feed as a "spamtrap"; you'd still have to postprocess it to some degree to make it useful anyway.

-kgd
Re: [spamassassin] Re: How to report 120,000 spams a day
SM wrote:
> At 17:51 08-03-2008, Tuc at T-B-O-H.NET wrote:
> > As part of it all, I also want to try to keep disk usage and CPU down to as little as possible. With 120,000 per day, that's a junk mail every 3/4 of a second. Since I have it set to deliver to /dev/null, I reduce the amount of disk usage. I'm looking for a solution that would be easy on the disk and easy on the CPU. So something directly out of the MTA would be great (sendmail) or something that the delivery would not store it locally.
>
> Rewrite the recipient address of these emails to another address. That should reduce disk usage on that server and filtering load. You can run the reporting on another server. It can be done hourly by processing the mailbox instead of one message at a time. That would require some code changes.
>
> Regards,
> -sm

I'm doing spam reporting and I'm using Exim to do it. I'm not sure what you are trying to do, but I've configured Exim to try one time to send an abuse report, and if it fails it goes to /dev/null. In order to get speed I mount a RAM disk and put my queues in RAM, which makes it run really fast. I'm delivering about 10k/hour abuse reports and not having any problem. It runs fine as a separate server in a VPS.
Re: [spamassassin] Re: [spamassassin] Re: How to report 120,000 spams
On Sun, 9 Mar 2008, Tuc at T-B-O-H.NET wrote:
> But it still remains, I'm looking to find what people think is the best way on an MX host to do the rejecting at SMTP time.

I'm coming to this conversation kind of late, so I apologize if I've missed something important earlier in the thread, but it sounds like what you want is a call-ahead milter (an example is at http://www.snertsoft.com/sendmail/milter-ahead/).

Call-ahead milters allow MXs to contact the ultimate destination server, determine whether an address is valid at the end of the line and then take various actions, including rejection.

--
Public key #7BBC68D9 at            | Shane Williams
http://pgp.mit.edu/                | System Admin - UT iSchool
=----------------------------------+-------------------------------
All syllogisms contain three lines | [EMAIL PROTECTED]
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew
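The call-ahead idea Shane describes can be sketched with Python's standard `smtplib`. This is not how milter-ahead, smf-sav, or MIMEDefang's md_check_against_smtp_server are implemented; it is a minimal illustration of the same probe, and the function name, the `helo_name` default, the 30-second timeout, and the three result strings are all assumptions made for the example.

```python
import smtplib


def call_ahead(rcpt, backend_host, connect=smtplib.SMTP, helo_name="mx.example.net"):
    """Ask the final delivery server whether it would accept mail for `rcpt`.

    Returns "valid", "invalid", or "unknown" (backend down or a 4xx reply).
    `connect` is injectable so the logic can be exercised without a network.
    """
    try:
        smtp = connect(backend_host, 25, timeout=30)
    except OSError:
        return "unknown"              # backend unreachable right now
    try:
        smtp.helo(helo_name)
        smtp.mail("")                 # null sender, i.e. MAIL FROM:<>
        code, _reply = smtp.rcpt(rcpt)
    except OSError:                   # smtplib errors are OSError subclasses
        return "unknown"
    finally:
        try:
            smtp.quit()               # no DATA is ever sent
        except OSError:
            pass
    if 200 <= code < 300:
        return "valid"
    if 500 <= code < 600:
        return "invalid"
    return "unknown"
```

An MX would reject at RCPT time on "invalid" and decide for itself what "unknown" means; for the hold-the-mail-during-outages use case in this thread, queueing (or temp-failing) is the likely choice. Note that a probe like this only helps once the destination stops accepting everything through its wildcard entry, since a catch-all backend answers "valid" for every address.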
Re: [spamassassin] Re: [spamassassin] Re: How to report 120,000 spams
> > Bango said that if his mom can't spell his name right, he doesn't care if he gets her emails. :)
>
> fair enough (he can also discard delivered mail anyway). but I've seen a lot of people subscribing to services with a mistyped address (their own) and then calling us to complain why they didn't get the confirmation request...
>
> anyway, your "corpus" is probably usable provided one uses heuristics to avoid hitting possible ham (for example by computing a distance between the recipient address and your valid addresses to make sure the recipient address is not mistyped, ... etc). but I still believe it should be "reduced" by rejecting mail at smtp time and only keeping some selected "trap" addresses (for example /[EMAIL PROTECTED]/ to catch attempts to use a phone-like address).

The bango/mom thing was a joke. Not to make the situation any worse, but the user has never called me wondering where email they expected is. But then again, I rarely ever hear from the user, period.

Anyway, I'm fine with the 120,000 mails now being considered useless in the long run. At least 2 people put it well enough to me that I get it. I'm fine with not having ANY spam traps either.

But it still remains, I'm looking to find what people think is the best way on an MX host to do the rejecting at SMTP time.

Thanks, Tuc
Re: [spamassassin] Re: How to report 120,000 spams
Tuc at T-B-O-H.NET wrote:
> > If you are proposing some kind of checksums or other types of 'message identifying' techniques on the messages, those few mistyped addresses could certainly make a difference for your site. What if bongo's mom mistypes to bungo, realizes her mistake and resends it to bongo a few minutes later. It is quite likely that the valid message will be rejected now since it's (almost) identical to the one your proposed system just marked as spam. What if bongo signs up for a mailing list and mistypes his own email address (yes, this happens). Now your system marks all list mailings as spam, so everyone using your system starts losing their copies of the mailing list messages too?
>
> Bango said that if his mom can't spell his name right, he doesn't care if he gets her emails. :)

fair enough (he can also discard delivered mail anyway). but I've seen a lot of people subscribing to services with a mistyped address (their own) and then calling us to complain why they didn't get the confirmation request...

anyway, your "corpus" is probably usable provided one uses heuristics to avoid hitting possible ham (for example by computing a distance between the recipient address and your valid addresses to make sure the recipient address is not mistyped, ... etc). but I still believe it should be "reduced" by rejecting mail at smtp time and only keeping some selected "trap" addresses (for example /[EMAIL PROTECTED]/ to catch attempts to use a phone-like address).

> I'm not proposing anything. I originally wanted to see if there was some way that these 120,000 emails that don't go to a valid/usable end user could be used to help the community out in some way. I had 2 filtering systems agree to do something with them, but for reasons I'd rather not share neither one worked out. (One may still yet, I'm not sure, waiting to hear back)
>
> We also don't do sitewide Bayes/etc. We do it per received user. For this domain, it just happens that all 4 users of the domain constitute a single received user.
>
> I realize that collectively this list could propose well over 5000 reasons that make sense why "good" mail could be part of that 120,000. I just didn't think the ever so insignificant percentage mattered. For as much as spam gets through, and good mail gets marked bad also, I thought this was "acceptable".
>
> > I think you have good intentions but the source of your data is flawed for anything but maybe limited statistical training. Unfortunately it probably is not great for that either, since the mail you are seeing for non existent users is probably not at all similar to the mix of spam you get to real accounts. The scanner would end up biased towards whatever junk the spammers desperate enough to use dictionaries send, which would drown out the stats from those spams that are actually difficult to detect.
>
> Ok, very valid point that makes a lot of sense. Thank you.
>
> > Why do you accept messages for non existent accounts? You're wasting bandwidth, regardless of what you do or don't do with the junk after you accept it. From the sound of it you could reduce your mail bandwidth to a tiny fraction of what it is now by just refusing this stuff (which is what most everyone else does, AFAIK).
>
> How do you do it on MX hosts? I realize that if I stop the wildcard acceptance and stop copying errors to postmaster that I can do it on the destination server. However, due to circumstances out of my control for the next few months, all email arrives to the main mail server via MXs ONLY.
>
> Thanks, Tuc
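The "distance between the recipient address and your valid addresses" heuristic mentioned above can be sketched with a plain Levenshtein edit distance over the local parts. This is one possible reading of the suggestion, not something anyone in the thread actually ran: the threshold of 2 and the function names are assumptions, and only the four local parts come from the thread.

```python
VALID_LOCAL_PARTS = ("bingo", "bango", "bongo", "irving")


def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def looks_mistyped(local_part, valid=VALID_LOCAL_PARTS, max_distance=2):
    """True if the address is not valid but is close enough to a valid one
    that it may be a typo, so the message should be kept out of a spam corpus."""
    lp = local_part.lower()
    if lp in valid:
        return False                       # exact match: not a typo at all
    return any(edit_distance(lp, v) <= max_distance for v in valid)


if __name__ == "__main__":
    for candidate in ("bungo", "irvign", "sales5551234"):
        print(candidate, looks_mistyped(candidate))
```

With such short, similar names the threshold matters: a larger `max_distance` protects more typos but also shields more dictionary-attack addresses from the corpus.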
Re: [spamassassin] Re: How to report 120,000 spams
> The same argument applies to mail to valid addresses (bingo, bango, ...) as well. would you like to use all your mail as a spam corpus? after all, you get only 10 out of 12 messages to these addresses :)

Well, bingo DOES like to hear from his mom, SOMETIMES. ;)

I understand your point, but like I previously said... The domain owners have told me that the incidence of mistypes and use of email addresses that people think are valid but aren't is so low that they accept that some are being tossed and consider that an acceptable loss.

> anyway, you'll have to make up your mind. N spam messages is not the same thing as N probable spam messages, even if the probability is 0.999 (with a finite number of 9s). if the probability is not 1 (exactly), then the corpus is polluted. It may be statistically good, but that's not always good.

Ok, I see where people are coming from on it.

> The worst part of this story is that you may be silently (and "frivolously") discarding legitimate mail, which is not very nice (if I mistype an address in the said domain, my mail gets dropped and I don't have a chance to fix my typo...). Do yourself and others a favour and find a way to reject these at smtp time. if you want to trap some spam, use carefully selected addresses.

The owners are aware this can happen, and in the grand scheme of things are more happy that they don't have to go through the 120K emails to delete them, than worry about "The one that got away".

As mentioned in the previous message, I need to know of a suitable option for MX hosts. I may have to decide not to be so vigilant about real errors and turn off error copying to postmaster, but that still won't solve MX's.

Thanks, Tuc
Re: [spamassassin] Re: How to report 120,000 spams
> If you are proposing some kind of checksums or other types of 'message identifying' techniques on the messages, those few mistyped addresses could certainly make a difference for your site. What if bongo's mom mistypes to bungo, realizes her mistake and resends it to bongo a few minutes later. It is quite likely that the valid message will be rejected now since it's (almost) identical to the one your proposed system just marked as spam. What if bongo signs up for a mailing list and mistypes his own email address (yes, this happens). Now your system marks all list mailings as spam, so everyone using your system starts losing their copies of the mailing list messages too?

Bango said that if his mom can't spell his name right, he doesn't care if he gets her emails. :)

I'm not proposing anything. I originally wanted to see if there was some way that these 120,000 emails that don't go to a valid/usable end user could be used to help the community out in some way. I had 2 filtering systems agree to do something with them, but for reasons I'd rather not share neither one worked out. (One may still yet, I'm not sure, waiting to hear back)

We also don't do sitewide Bayes/etc. We do it per received user. For this domain, it just happens that all 4 users of the domain constitute a single received user.

I realize that collectively this list could propose well over 5000 reasons that make sense why "good" mail could be part of that 120,000. I just didn't think the ever so insignificant percentage mattered. For as much as spam gets through, and good mail gets marked bad also, I thought this was "acceptable".

> I think you have good intentions but the source of your data is flawed for anything but maybe limited statistical training. Unfortunately it probably is not great for that either, since the mail you are seeing for non existent users is probably not at all similar to the mix of spam you get to real accounts. The scanner would end up biased towards whatever junk the spammers desperate enough to use dictionaries send, which would drown out the stats from those spams that are actually difficult to detect.

Ok, very valid point that makes a lot of sense. Thank you.

> Why do you accept messages for non existent accounts? You're wasting bandwidth, regardless of what you do or don't do with the junk after you accept it. From the sound of it you could reduce your mail bandwidth to a tiny fraction of what it is now by just refusing this stuff (which is what most everyone else does, AFAIK).

How do you do it on MX hosts? I realize that if I stop the wildcard acceptance and stop copying errors to postmaster that I can do it on the destination server. However, due to circumstances out of my control for the next few months, all email arrives to the main mail server via MXs ONLY.

Thanks, Tuc
Re: How to report 120,000 spams
Tuc at T-B-O-H wrote: Tuc at T-B-O-H.NET wrote: I guess I'm still not being clear. There are 120K emails a day coming to INVALID EMAIL ADDRESSES THAT NEVER EXISTED. Its not a case of a user being fickle, its a case that they are emailing addresses that NEVER EVER ACTUALLY EXISTED. About 1 ever 3/4 of a second. So running them through ANYTHING is counter productive since , atleast in my eyes, if you try to email an email address that never existed... ITS SPAM. Its not things the user ever sees/knows, etc. I have in my sendmail virtusertable: [EMAIL PROTECTED] bingo [EMAIL PROTECTED] bango [EMAIL PROTECTED] bongo [EMAIL PROTECTED] irving [EMAIL PROTECTED] nobody The user doesn't even SEE the emails, and processing what they consider spam I really don't care about. But getting 120K emails to *@ that are absolutely known spam... I would like to help the community out by reporting them to every system possible. Yea, if the added benefit is the mail that bingo, bango, bongo and irving gets filtered a little better... I won't complain at all. Tuc Just because mail goes to invalid addresses does not mean it is spam. people do mistype addresses some time. so this "corpus" is not safe. Yes, I realize people mistype email addresses. But the domain gets 121,000 emails on an average day. Of those 121,000 emails a day, 120,000 are to email addresses that aren't of the 4 known/valid/acceptable ones. What percentage would you like to use of emails that are sent are mistyped. One out of 1000? That means 121 invalid email addresses a day? But the other 999 of 1000 aren't valid... Of the other 1000 that ARE to the 4 known/valid/acceptable email addresses, about 900 of them are marked by SA as a spam level over 5. Usually WILDLY over 5, like 20's and 30's. 
Of those 100 delivered, 75 of them are rejected by the spam filter (Using a method that violates the standard RFC's according to sendmail) of the "final destination" for all 4 of those email boxes (Yes, bingo, bango, bongo, irving actually all end up forwarded to [EMAIL PROTECTED]). Of the 25 that make it through, the user tells me 15 of them are usually spam. So, 10 VALID/ACCEPTABLE emails a day out of 121,000 emails received a day .. Or 8 THOUSANDS OF A SINGLE PERCENT. So, while I definitely don't think people can type bingo, bango, bongo, irving correctly 100% of the time, with a valid email ratio of 8 thousands of a percent, I don't think in the grand scheme of things that mistyped email addresses really account for much/any. The same argument applies to mail to valid addresses (bingo, bango, ...) as well. would you like to use all your mail as a spam corpus? after all, you get only 10 out of 12 messages to these addresses :) anyway, you'll have to make your mind. N spam messages is not the same thing as N probable spam messages, even if the probablity is 0.999 (with a finite number of 9s). if the probability is not 1 (exactly), then the corpus is polluted. It may be statistically good, but that's not always good. The worst part of this story is that you may be silently (and "frivoulously") discarding legitimate mail, which is not very nice (if I mistype an address in the said domain, my mail gets dropped and I don't have a chance to fix my typo...). Do yourself and others a favour and find a way to reject these at smtp time. if you want to trap some spam, use carefully selected addresses.
Re: How to report 120,000 spams
On Sun, Mar 9, 2008 at 8:53 PM, Tuc at T-B-O-H <[EMAIL PROTECTED]> wrote: > > > > Tuc at T-B-O-H.NET wrote: > > > I guess I'm still not being clear. There are 120K emails a day coming > > > to INVALID EMAIL ADDRESSES THAT NEVER EXISTED. Its not a case of a user > being > > > fickle, its a case that they are emailing addresses that NEVER EVER > ACTUALLY > > > EXISTED. About 1 ever 3/4 of a second. So running them through ANYTHING > is > > > counter productive since , atleast in my eyes, if you try to email an > email > > > address that never existed... ITS SPAM. Its not things the user ever > sees/knows, > > > etc. I have in my sendmail virtusertable: > > > > > > [EMAIL PROTECTED] bingo > > > [EMAIL PROTECTED] bango > > > [EMAIL PROTECTED] bongo > > > [EMAIL PROTECTED] irving > > > [EMAIL PROTECTED] nobody > > > > > > The user doesn't even SEE the emails, and processing what they > consider > > > spam I really don't care about. But getting 120K emails to *@ that are > absolutely > > > known spam... I would like to help the community out by reporting them > to every > > > system possible. Yea, if the added benefit is the mail that bingo, > bango, bongo > > > and irving gets filtered a little better... I won't complain at all. > > > > > > Tuc > > > > > > > Just because mail goes to invalid addresses does not mean it is spam. > > people do mistype addresses some time. so this "corpus" is not safe. > > > Yes, I realize people mistype email addresses. But the domain gets > 121,000 emails on an average day. > > Of those 121,000 emails a day, 120,000 are to email addresses that > aren't of the 4 known/valid/acceptable ones. What percentage would you like > to use of emails that are sent are mistyped. One out of 1000? That means > 121 invalid email addresses a day? But the other 999 of 1000 aren't valid... > > Of the other 1000 that ARE to the 4 known/valid/acceptable email > addresses, about 900 of them are marked by SA as a spam level over 5. 
> Usually WILDLY over 5, like 20's and 30's. > > Of those 100 delivered, 75 of them are rejected by the spam > filter (Using a method that violates the standard RFC's according to > sendmail) of the "final destination" for all 4 of those email boxes (Yes, > bingo, bango, bongo, irving actually all end up forwarded to > [EMAIL PROTECTED]). > > Of the 25 that make it through, the user tells me 15 of them are > usually spam. > > So, 10 VALID/ACCEPTABLE emails a day out of 121,000 emails received > a day .. Or 8 THOUSANDS OF A SINGLE PERCENT. > > So, while I definitely don't think people can type bingo, bango, > bongo, irving correctly 100% of the time, with a valid email ratio of 8 > thousands of a percent, I don't think in the grand scheme of things > that mistyped email addresses really account for much/any. > > Tuc > If you are proposing some kind of checksums or other types of 'message identifying' techniques on the messages, those few mistyped addresses could certainly make a difference for your site. What if bongo's mom mistypes to bungo, realizes her mistake and resends it to bongo a few minutes later. It is quite likely that the valid message will be rejected now since it's (almost) identical to the one your proposed system just marked as spam. What if bongo signs up for the a mailing list and mistypes his own email address (yes, this happens). Now your system marks all list mailings as spam, so everyone using your system starts losing their copies of the mailing list messages too? I think you have good intentions but the source of your data is flawed for anything but maybe limited statistical training. Unfortunately it probably is not great for that either, since the mail you are seeing for non existent users is probably not at all similar to the mix of spam you get to real accounts. 
The scanner would end up biased towards whatever junk the spammers desperate enough to use dictionaries send, which would drown out the stats from those spams that are actually difficult to detect. Why do you accept messages for non existent accounts? You're wasting bandwidth, regardless of what you do or don't do with the junk after you accept it. From the sound of it you could reduce your mail bandwidth to a tiny fraction of what it is now by just refusing this stuff (which is what most everyone else does, AFAIK). -Aaron
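The checksum concern raised above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a hypothetical reporting system that fingerprints the normalized message body with SHA-256 and reports anything sent to an unknown recipient. Real services such as Razor or DCC use their own fuzzy checksums, which this does not reproduce. The addresses and the `fingerprint` and `handle` helpers are invented for the example.

```python
import hashlib

def fingerprint(body: str) -> str:
    """Hash the body only, ignoring case and whitespace differences."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical store of fingerprints that were reported as spam.
reported_spam = set()

def handle(recipient: str, body: str, valid_users: set) -> str:
    """Report mail to unknown users; reject known-spam bodies for valid users."""
    fp = fingerprint(body)
    if recipient not in valid_users:
        reported_spam.add(fp)
        return "reported"
    return "rejected" if fp in reported_spam else "delivered"

valid = {"bongo@example.com"}  # assumed full address, for illustration
note = "Hi honey, dinner is at six on Sunday.\n"

first = handle("bungo@example.com", note, valid)   # mom's typo
second = handle("bongo@example.com", note, valid)  # her corrected resend
print(first, second)
```

In this toy model the corrected resend is refused because its body matches the one that was just reported, which is the failure mode described above.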
Re: How to report 120,000 spams
> Tuc at T-B-O-H.NET wrote:
> > I guess I'm still not being clear. There are 120K emails a day coming to INVALID EMAIL ADDRESSES THAT NEVER EXISTED. Its not a case of a user being fickle, its a case that they are emailing addresses that NEVER EVER ACTUALLY EXISTED. About 1 every 3/4 of a second. So running them through ANYTHING is counter productive since, at least in my eyes, if you try to email an email address that never existed... ITS SPAM. Its not things the user ever sees/knows, etc. I have in my sendmail virtusertable:
> >
> > [EMAIL PROTECTED] bingo
> > [EMAIL PROTECTED] bango
> > [EMAIL PROTECTED] bongo
> > [EMAIL PROTECTED] irving
> > [EMAIL PROTECTED] nobody
> >
> > The user doesn't even SEE the emails, and processing what they consider spam I really don't care about. But getting 120K emails to *@ that are absolutely known spam... I would like to help the community out by reporting them to every system possible. Yea, if the added benefit is the mail that bingo, bango, bongo and irving gets filtered a little better... I won't complain at all.
> >
> > Tuc
>
> Just because mail goes to invalid addresses does not mean it is spam. people do mistype addresses some time. so this "corpus" is not safe.

Yes, I realize people mistype email addresses. But the domain gets 121,000 emails on an average day.

Of those 121,000 emails a day, 120,000 are to email addresses that aren't among the 4 known/valid/acceptable ones. What percentage of sent emails would you like to assume are mistyped? One out of 1000? That means 121 invalid email addresses a day? But the other 999 of 1000 aren't valid...

Of the other 1000 that ARE to the 4 known/valid/acceptable email addresses, about 900 of them are marked by SA as a spam level over 5. Usually WILDLY over 5, like 20's and 30's.

Of the 100 that are delivered, 75 are rejected by the spam filter (using a method that violates the standard RFCs according to sendmail) of the "final destination" for all 4 of those email boxes (yes, bingo, bango, bongo, irving actually all end up forwarded to [EMAIL PROTECTED]).

Of the 25 that make it through, the user tells me 15 of them are usually spam.

So, 10 VALID/ACCEPTABLE emails a day out of 121,000 emails received a day... Or 8 THOUSANDTHS OF A SINGLE PERCENT.

So, while I definitely don't think people can type bingo, bango, bongo, irving correctly 100% of the time, with a valid email ratio of 8 thousandths of a percent, I don't think in the grand scheme of things that mistyped email addresses really account for much/any.

Tuc
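The figures quoted in this message can be checked with a short calculation. The sketch below uses only the numbers stated above (121,000 per day, 120,000 to invalid addresses, then 900, 75 and 15 filtered at each stage); the variable names are chosen for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

total_per_day = 121_000      # all mail for the domain
invalid_rcpt = 120_000       # mail to addresses that never existed

# Average gap between messages to invalid addresses.
interval = SECONDS_PER_DAY / invalid_rcpt

valid_rcpt = total_per_day - invalid_rcpt   # mail to the 4 real addresses
after_sa = valid_rcpt - 900                 # SpamAssassin scores 900 over 5
after_downstream = after_sa - 75            # downstream filter rejects 75
legit = after_downstream - 15               # user reports 15 more as spam

legit_percent = 100 * legit / total_per_day

# The hypothetical "one in 1000 is a typo" rate from the message.
typos_per_day = total_per_day // 1000

print(f"one junk mail every {interval:.2f} s")           # 0.72 s
print(f"{legit} legitimate mails = {legit_percent:.4f}% of the total")
print(f"{typos_per_day} mistyped mails/day at a 1-in-1000 typo rate")
```

The result, roughly 0.0083%, matches the "8 thousandths of a percent" quoted above, and 86,400 / 120,000 gives the "one every 3/4 of a second" figure.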
Re: [spamassassin] Re: [spamassassin] Re: How to report 120,000 spams a day
Tuc at T-B-O-H.NET wrote:
> I guess I'm still not being clear. There are 120K emails a day coming to INVALID EMAIL ADDRESSES THAT NEVER EXISTED. Its not a case of a user being fickle, its a case that they are emailing addresses that NEVER EVER ACTUALLY EXISTED. About 1 every 3/4 of a second. So running them through ANYTHING is counter productive since, at least in my eyes, if you try to email an email address that never existed... ITS SPAM. Its not things the user ever sees/knows, etc. I have in my sendmail virtusertable:
>
> [EMAIL PROTECTED] bingo
> [EMAIL PROTECTED] bango
> [EMAIL PROTECTED] bongo
> [EMAIL PROTECTED] irving
> [EMAIL PROTECTED] nobody
>
> The user doesn't even SEE the emails, and processing what they consider spam I really don't care about. But getting 120K emails to *@ that are absolutely known spam... I would like to help the community out by reporting them to every system possible. Yea, if the added benefit is the mail that bingo, bango, bongo and irving gets filtered a little better... I won't complain at all.
>
> Tuc

Just because mail goes to invalid addresses does not mean it is spam. People do mistype addresses sometimes, so this "corpus" is not safe.
Re: [spamassassin] RE: [spamassassin] Re: [spamassassin] Re: How to report 120,000 spams a day
> > Hi,
> >
> > Thanks for the reply. In as much as I'd like to help the community, I'm under a set of constraints. Starting a whole other server to start doing this isn't something that fits under those constraints. It looks like I'll probably just end up having to /dev/null them as I have been.
> >
> > Tuc
>
> Tuc
>
> Didn't it come out that you were accepting emails to any email address whether it is a valid email address or not?
>
> If so, that is where to start...
>
> do not accept those emails... reject them properly.
>
> - rh

There are "considerations" in doing this. Right now, all my systems are set up running sendmail, and all with the config of:

    define(`confCOPY_ERRORS_TO',`Postmaster')

As such, true to its name, anytime there is an error, the postmaster gets a copy. 120K copies of

    The original message was received at Sun, 9 Mar 2008 15:12:41 -0400 (EDT)
    from pD9E3AE30.dip.t-dialin.net [217.227.174.48]

    - The following addresses had permanent fatal errors -
    <[EMAIL PROTECTED]>
        (reason: 550 5.0.0 <[EMAIL PROTECTED]>... No such user here)

    - Transcript of session follows -
    ... while talking to smtp.example.com.:
    >>> DATA
    <<< 550 5.0.0 <[EMAIL PROTECTED]>... No such user here
    550 5.1.1 <[EMAIL PROTECTED]>... User unknown
    <<< 503 5.0.0 Need RCPT (recipient)

    - Message header follows -

isn't acceptable. Yes, I could take out the COPY_ERRORS_TO, but we also run a lot of things that are piped to programs, and we usually don't see the errors unless that is set. If there is some way to have my errors copied to me, but "User unknown" not, then I'll implement it. My way of preventing it from happening, but still seeing my errors, was to /dev/null addresses that don't exist. I could have the COPY_ERRORS_TO sent to a special user that uses procmail to weed them out, but then it defeats my attempts to reduce disk space wear and tear, CPU, etc.

Even if I did that, though, the next thing I run into is MX's. The MX blindly accepts the mail. If the destination server rejects it, then usually the original sender is forged or invalid, etc. That then causes a mail spool backup on the MX host until it errors out after 5 days of being unable to make its delivery.

I'd love to take advantage of some functionality ZoneEdit (my DNS provider) gives and let them scan and forward the email. However, with the amount of emails and databits it is, I think the cost would be more than I care to pay given it's a "favor" account. (Also why setting up another server doesn't make sense.)

Tuc
RE: [spamassassin] Re: [spamassassin] Re: How to report 120,000 spams a day
> Hi,
>
> Thanks for the reply. In as much as I'd like to help the community, I'm under a set of constraints. Starting a whole other server to start doing this isn't something that fits under those constraints. It looks like I'll probably just end up having to /dev/null them as I have been.
>
> Tuc

Tuc

Didn't it come out that you were accepting emails to any email address whether it is a valid email address or not?

If so, that is where to start...

do not accept those emails... reject them properly.

- rh
Re: [spamassassin] Re: [spamassassin] Re: How to report 120,000 spams a day
At 11:01 09-03-2008, Tuc at T-B-O-H.NET wrote:
> I guess I'm still not being clear. There are 120K emails a day coming to INVALID EMAIL ADDRESSES THAT NEVER EXISTED. Its not a case of a user being fickle, its a case that they are emailing addresses that NEVER EVER ACTUALLY EXISTED. About 1 ever 3/4 of a second. So running them through ANYTHING is counter productive since , atleast in my eyes, if you try to email an email address that never existed... ITS SPAM. Its not things the user ever sees/knows,

I see delivery attempts to invalid email address regularly. They get rejected at the SMTP level. Running such messages through SpamAssassin doesn't make sense. Your previous message mentioned that you wanted to report these "spam" messages and my reply was based upon that.

> etc. I have in my sendmail virtusertable:
>
> [EMAIL PROTECTED] bingo
> [EMAIL PROTECTED] bango
> [EMAIL PROTECTED] bongo
> [EMAIL PROTECTED] irving
> [EMAIL PROTECTED] nobody

The above is incorrect as there is still a processing overhead. I suggest using:

    @example.com    error:nouser User unknown

Regards,
-sm
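The effect of replacing the catch-all `nobody` entry with `error:nouser` can be modelled as a simple lookup. The Python sketch below is a simplified illustration, not sendmail's actual implementation: it assumes exact addresses are tried before the `@domain` catch-all, the full addresses such as `bingo@example.com` are assumptions (the left-hand sides are redacted above), `resolve` is a hypothetical helper, and the reply text is abbreviated.

```python
# Simplified model of a virtusertable: exact address first, then "@domain".
VIRTUSERTABLE = {
    "bingo@example.com": "bingo",
    "bango@example.com": "bango",
    "bongo@example.com": "bongo",
    "irving@example.com": "irving",
    # Catch-all entry suggested above: refuse instead of mapping to "nobody".
    "@example.com": "error:nouser User unknown",
}

def resolve(rcpt, table=VIRTUSERTABLE):
    """Return ("deliver", user) or ("reject", reply) for one RCPT address."""
    rcpt = rcpt.strip().lower()
    _, _, domain = rcpt.partition("@")
    target = table.get(rcpt) or table.get("@" + domain)
    if target is None:
        return ("reject", "550 not handled here")
    if target.startswith("error:"):
        # Everything after the first space is the text shown to the sender.
        _, _, text = target.partition(" ")
        return ("reject", "550 " + text)
    return ("deliver", target)

print(resolve("bongo@example.com"))
print(resolve("bungo@example.com"))
```

With this table, a sender who mistypes an address gets an immediate refusal during the SMTP session and the message body is never accepted or stored.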
Re: [spamassassin] Re: [spamassassin] Re: How to report 120,000 spams a day
> > Automatic reporting - that's another thing entirely. As was pointed out in > previous replys, the user > community is not always accurate in reporting what is legit spam, and what > is/was requested > or "permitted". I tend to report manually, although I am writing some code > to semi-automate the > process. The program picks out domains, TLDs in URLs and IP addresses (in > spam), puts them in edit > windows, and then allows me to view the message. At this point, I can click > a button to report the > offending hosts/ips/etc. or not. But, it is semi-manual and therefore > involves time. The tradeoff is > accurate reporting to the various block lists. > I guess I'm still not being clear. There are 120K emails a day coming to INVALID EMAIL ADDRESSES THAT NEVER EXISTED. Its not a case of a user being fickle, its a case that they are emailing addresses that NEVER EVER ACTUALLY EXISTED. About 1 ever 3/4 of a second. So running them through ANYTHING is counter productive since , atleast in my eyes, if you try to email an email address that never existed... ITS SPAM. Its not things the user ever sees/knows, etc. I have in my sendmail virtusertable: [EMAIL PROTECTED] bingo [EMAIL PROTECTED] bango [EMAIL PROTECTED] bongo [EMAIL PROTECTED] irving [EMAIL PROTECTED] nobody The user doesn't even SEE the emails, and processing what they consider spam I really don't care about. But getting 120K emails to *@ that are absolutely known spam... I would like to help the community out by reporting them to every system possible. Yea, if the added benefit is the mail that bingo, bango, bongo and irving gets filtered a little better... I won't complain at all. Tuc
Re: [spamassassin] Re: [spamassassin] Re: How to report 120,000 spams a day
> At 17:51 08-03-2008, Tuc at T-B-O-H.NET wrote:
> > As part of it all, I also want to try to keep disk usage and CPU down to as little as possible. With 120,000 per day, thats a junk mail every 3/4's of a second. Since I have it set to deliver to /dev/null, I reduce the amount of disk usage. I'm looking for a solution that would be easy on the disk and easy on the CPU. So something directly out of the MTA would be great (sendmail) or something that the delivery would not store it locally.
>
> Rewrite the recipient address of these emails to another address. That should reduce disk usage on that server and filtering load. You can run the reporting on another server. It can be done hourly by processing the mailbox instead of one message at a time. That would require some code changes.
>
> Regards,
> -sm

Hi,

Thanks for the reply. In as much as I'd like to help the community, I'm under a set of constraints. Starting a whole other server to start doing this isn't something that fits under those constraints. It looks like I'll probably just end up having to /dev/null them as I have been.

Tuc
Re: [spamassassin] Re: How to report 120,000 spams a day
At 17:51 08-03-2008, Tuc at T-B-O-H.NET wrote:
> As part of it all, I also want to try to keep disk usage and CPU down to as little as possible. With 120,000 per day, thats a junk mail every 3/4's of a second. Since I have it set to deliver to /dev/null, I reduce the amount of disk usage. I'm looking for a solution that would be easy on the disk and easy on the CPU. So something directly out of the MTA would be great (sendmail) or something that the delivery would not store it locally.

Rewrite the recipient address of these emails to another address. That should reduce disk usage on that server and filtering load. You can run the reporting on another server. It can be done hourly by processing the mailbox instead of one message at a time. That would require some code changes.

Regards,
-sm
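The "process the mailbox hourly instead of one message at a time" idea can be sketched with Python's standard `mailbox` module. This is a minimal sketch under assumptions: the junk is delivered to a single mbox file, `drain_mailbox` and the `report_batch` callback are hypothetical names, and the actual reporting step (for example handing the batch to a reporting tool) is left to the caller. It would typically be run from cron once an hour.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage

def drain_mailbox(path, report_batch):
    """Hand every message in the mbox to report_batch once, then empty it."""
    box = mailbox.mbox(path)
    box.lock()  # keep the delivery agent out while we read and truncate
    try:
        messages = [box[key] for key in box.keys()]
        if messages:
            report_batch(messages)  # one call per run, not one per message
        box.clear()
        box.flush()
        return len(messages)
    finally:
        box.close()  # also releases the lock

# Small demonstration with a temporary mbox holding three junk messages.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "junk.mbox")
    box = mailbox.mbox(path)
    for i in range(3):
        msg = EmailMessage()
        msg["Subject"] = f"junk {i}"
        msg.set_content("buy now")
        box.add(msg)
    box.close()

    batches = []
    drained = drain_mailbox(path, lambda msgs: batches.append(len(msgs)))

    check = mailbox.mbox(path)
    remaining = len(check)
    check.close()

print(drained, batches, remaining)
```

The design point is that the expensive part (starting the reporter) happens once per batch, which avoids spawning a process for each of the 120,000 daily messages.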
Re: [spamassassin] Re: How to report 120,000 spams a day
Tuc at T-B-O-H.NET wrote: > >> >> On 08.03.08 18:28, Tuc at T-B-O-H wrote: >> > > Our mail server receives about 128K emails a day. Of >> > > those, 120K are absolutely known spam so I don't even run >> > > them through spamassassin. Of the 8K left, 6K are determined >> > > to be spams, and 2K are considered "good". >> > > >> > > I'm wondering if there is some way to help the >> > > community (and, admittedly, ourselves) to somehow process >> > > and report those spams to various databases. For the >> > > smaller users, I've implemented the SiteWideRazor and >> > > use procmail to save off their spams to "probably-spam" >> > > and process them through "spamassassin -r" once an hour. >> > > >> > > For our bigger ones, though, so as not to wear >> > > a hole in the disk drive, I wondered if there were any >> > > suggestions what to do. >> >> >Anyone?? >> >> afaik razor requires manual reporting, not anything automatic. Also note >> that some people tend to mark as "spam" anything they don't like, even >> mailing lists they have subscribed to (but are unable to unsubscribe - >> this >> if very common form of dumbness) >> >> You can run DCC server which does something similar but is completely >> automated. >> > Hi, > > Thanks for the reply. > > I have a feeling that I'm not explaining myself well enough given > this and private replies I've received. > > I am mail hosting for a domain, we'll call it example.com . There > are, and have only been 4 VALID email addresses for example.com such as : > > [EMAIL PROTECTED] > [EMAIL PROTECTED] > [EMAIL PROTECTED] > [EMAIL PROTECTED] > > Those come in, get scanned by SA, and the ones we think are good > enough we pass along to the owners email address on his local ISP > (Hughes.net, > who has their email processed by Tucows's securehostedemail.com that > violates > RFC's and causes sendmail to pump out kernel based messages which I can't > get > anyone there to listen to!). 
> > In the mean time, anything that isn't going to bingo, bango, bongo > or irving is sent straight to /dev/null from the MTA. Its these messages > that > go straight to /dev/null that I'd like to somehow get processed into > something > useful for the community. Its not the result of a user getting an email > from > examplemacys.com, and saying "Well, I did subscribe, but I have no need > for > their shoe sale this week, I call "SPAM" ". These are messages to > email > addresses at example.com that were NEVER legit email addresses. > > As part of it all, I also want to try to keep disk usage and CPU > down to as little as possible. With 120,000 per day, thats a junk mail > every 3/4's of a second. Since I have it set to deliver to /dev/null, I > reduce the amount of disk usage. I'm looking for a solution that would be > easy on the disk and easy on the CPU. So something directly out of the > MTA > would be great (sendmail) or something that the delivery would not store > it locally. > > I'm concerned if I set up another user, who has a .procmailrc to > send it directly to "spamassassin -r" that it start spawning off way too > many processes, too many perl invocations, etc. Same for piping to > razor-report (And it only benefits razor, no one else). > > I thought DCC was running on this system, but it appears not. I'll > have to check why and get it running. I thought it was just another > database > for SA to check, I'll have to read more about it. Thanks. > > Tuc > > Thanks, Tuc > > > Wow! You receive a LOT of spam. I manage a site which, for today (so far - there's an hour left), we have blocked 139,980 spam emails !!! And, this is down from what we used to get. The problems you are dealing with - disk space, resource usage, etc. 
is why we finally resorted to writing a spam blocker (in C - no perl) that blocks the spam at the SMTP protocol level (there is another topic titled "Yet another spam blocker" which discusses this) and never lets the messages make it to the disk at all. There are also other advantages to blocking at the protocol level.

Automatic reporting - that's another thing entirely. As was pointed out in previous replies, the user community is not always accurate in reporting what is legit spam, and what is/was requested or "permitted". I tend to report manually, although I am writing some code to semi-automate the process. The program picks out domains, TLDs in URLs and IP addresses (in spam), puts them in edit windows, and then allows me to view the message. At this point, I can click a button to report the offending hosts/ips/etc. or not. But, it is semi-manual and therefore involves time. The tradeoff is accurate reporting to the various block lists.

I wish I had a better answer for you!

Regards,
Steve
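The extraction step of a semi-automated reporting tool like the one described above might look roughly like the following. This is not Steve's program, only a sketch: the regular expressions are deliberately simple, `extract_candidates` is a hypothetical helper, and the sample URL is invented. The output is meant for a human to review before anything is reported.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_candidates(text):
    """Collect URL hostnames and IPv4 addresses for manual review."""
    hosts = set()
    for url in URL_RE.findall(text):
        try:
            host = urlsplit(url).hostname
        except ValueError:  # malformed URL, skip it
            continue
        if host:
            hosts.add(host.lower())
    ips = {
        ip for ip in IPV4_RE.findall(text)
        if all(int(part) <= 255 for part in ip.split("."))
    }
    # A URL that points at a bare IP is listed under "ips", not "domains".
    return {"domains": sorted(hosts - ips), "ips": sorted(ips)}

sample = (
    "Received: from pD9E3AE30.dip.t-dialin.net [217.227.174.48]\n"
    "Visit http://pills.example.net/buy?id=1 now\n"
)
result = extract_candidates(sample)
print(result)
```

Keeping the final "report or not" decision manual, as described above, is what protects the block lists from false reports.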
Re: [spamassassin] Re: How to report 120,000 spams a day
-- Michael Scheidell, CTO >|SECNAP Network Security Winner 2008 Network Products Guide Hot Companies FreeBsd SpamAssassin Ports maintainer Charter member, ICSA labs anti-spam consortium > From: "Tuc at T-B-O-H.NET" <[EMAIL PROTECTED]> > Date: Sat, 8 Mar 2008 19:51:49 -0500 (EST) > To: Matus UHLAR - fantomas <[EMAIL PROTECTED]> > Cc: > Subject: Re: [spamassassin] Re: How to report 120,000 spams a day > >> >> On 08.03.08 18:28, Tuc at T-B-O-H wrote: > > Thanks for the reply. > > I have a feeling that I'm not explaining myself well enough given > this and private replies I've received. > > I am mail hosting for a domain, we'll call it example.com . There > are, and have only been 4 VALID email addresses for example.com such as : > > [EMAIL PROTECTED] > [EMAIL PROTECTED] > [EMAIL PROTECTED] > [EMAIL PROTECTED] > > Those come in, get scanned by SA, and the ones we think are good > enough we pass along to the owners email address on his local ISP (Hughes.net, > who has their email processed by Tucows's securehostedemail.com that violates > RFC's and causes sendmail to pump out kernel based messages which I can't get > anyone there to listen to!). > > In the mean time, anything that isn't going to bingo, bango, bongo Then you need to enlist the paid services of an email consultant since you have things totally fsucked up and no amount of reporting is going to help you or your client, and no, 'the community' doesn't want more copies of the same zombot spam that we all get al day long. (and, sorry, abut 120,000 emails a day for 4 users? At, what, 99.99% spam ratio? Maybe if you started to drop the emails to unknown users it would never have gotten that bad) > or irving is sent straight to /dev/null from the MTA. Its these messages that > go straight to /dev/null that I'd like to somehow get processed into something Don't send them to dev null, don't accept them. By accepting them you are wasting bandwidth, and will waste it more trying to report it. 
> I thought DCC was running on this system, but it appears not. I'll have to check why and get it running. I thought it was just another database for SA to check, I'll have to read more about it. Thanks.

At 120,000 per day, you are required to run a local DCC server, and running a local DCC server is the only way to process that much email. Trying to push 120,000 emails a day through the overburdened public servers will most likely, eventually get your ip address blacklisted, if it isn't already.
Re: [spamassassin] Re: How to report 120,000 spams a day
> > On 08.03.08 18:28, Tuc at T-B-O-H wrote: > > > Our mail server receives about 128K emails a day. Of > > > those, 120K are absolutely known spam so I don't even run > > > them through spamassassin. Of the 8K left, 6K are determined > > > to be spams, and 2K are considered "good". > > > > > > I'm wondering if there is some way to help the > > > community (and, admittedly, ourselves) to somehow process > > > and report those spams to various databases. For the > > > smaller users, I've implemented the SiteWideRazor and > > > use procmail to save off their spams to "probably-spam" > > > and process them through "spamassassin -r" once an hour. > > > > > > For our bigger ones, though, so as not to wear > > > a hole in the disk drive, I wondered if there were any > > > suggestions what to do. > > > Anyone?? > > afaik razor requires manual reporting, not anything automatic. Also note > that some people tend to mark as "spam" anything they don't like, even > mailing lists they have subscribed to (but are unable to unsubscribe - this > if very common form of dumbness) > > You can run DCC server which does something similar but is completely > automated. > Hi, Thanks for the reply. I have a feeling that I'm not explaining myself well enough given this and private replies I've received. I am mail hosting for a domain, we'll call it example.com . There are, and have only been 4 VALID email addresses for example.com such as : [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] Those come in, get scanned by SA, and the ones we think are good enough we pass along to the owners email address on his local ISP (Hughes.net, who has their email processed by Tucows's securehostedemail.com that violates RFC's and causes sendmail to pump out kernel based messages which I can't get anyone there to listen to!). In the mean time, anything that isn't going to bingo, bango, bongo or irving is sent straight to /dev/null from the MTA. 
Its these messages that go straight to /dev/null that I'd like to somehow get processed into something useful for the community. Its not the result of a user getting an email from examplemacys.com, and saying "Well, I did subscribe, but I have no need for their shoe sale this week, I call "SPAM" ". These are messages to email addresses at example.com that were NEVER legit email addresses. As part of it all, I also want to try to keep disk usage and CPU down to as little as possible. With 120,000 per day, thats a junk mail every 3/4's of a second. Since I have it set to deliver to /dev/null, I reduce the amount of disk usage. I'm looking for a solution that would be easy on the disk and easy on the CPU. So something directly out of the MTA would be great (sendmail) or something that the delivery would not store it locally. I'm concerned if I set up another user, who has a .procmailrc to send it directly to "spamassassin -r" that it start spawning off way too many processes, too many perl invocations, etc. Same for piping to razor-report (And it only benefits razor, no one else). I thought DCC was running on this system, but it appears not. I'll have to check why and get it running. I thought it was just another database for SA to check, I'll have to read more about it. Thanks. Tuc Thanks, Tuc
Re: How to report 120,000 spams a day
On 08.03.08 18:28, Tuc at T-B-O-H wrote:
> > Our mail server receives about 128K emails a day. Of those, 120K are absolutely known spam so I don't even run them through spamassassin. Of the 8K left, 6K are determined to be spams, and 2K are considered "good".
> >
> > I'm wondering if there is some way to help the community (and, admittedly, ourselves) to somehow process and report those spams to various databases. For the smaller users, I've implemented the SiteWideRazor and use procmail to save off their spams to "probably-spam" and process them through "spamassassin -r" once an hour.
> >
> > For our bigger ones, though, so as not to wear a hole in the disk drive, I wondered if there were any suggestions what to do.
>
> Anyone??

afaik razor requires manual reporting, not anything automatic. Also note that some people tend to mark as "spam" anything they don't like, even mailing lists they have subscribed to (but are unable to unsubscribe from - this is a very common form of dumbness).

You can run a DCC server, which does something similar but is completely automated.

--
Matus UHLAR - fantomas, [EMAIL PROTECTED] ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
How does cat play with mouse? cat /dev/mouse
Re: How to report 120,000 spams a day
> Hi,
>
> Our mail server receives about 128K emails a day. Of those, 120K are absolutely known spam so I don't even run them through spamassassin. Of the 8K left, 6K are determined to be spams, and 2K are considered "good".
>
> I'm wondering if there is some way to help the community (and, admittedly, ourselves) to somehow process and report those spams to various databases. For the smaller users, I've implemented the SiteWideRazor and use procmail to save off their spams to "probably-spam" and process them through "spamassassin -r" once an hour.
>
> For our bigger ones, though, so as not to wear a hole in the disk drive, I wondered if there were any suggestions what to do.
>
> Thanks, Tuc

Anyone??

Thanks, Tuc
How to report 120,000 spams a day
Hi,

Our mail server receives about 128K emails a day. Of those, 120K are absolutely known spam so I don't even run them through spamassassin. Of the 8K left, 6K are determined to be spams, and 2K are considered "good".

I'm wondering if there is some way to help the community (and, admittedly, ourselves) to somehow process and report those spams to various databases. For the smaller users, I've implemented the SiteWideRazor and use procmail to save off their spams to "probably-spam" and process them through "spamassassin -r" once an hour.

For our bigger ones, though, so as not to wear a hole in the disk drive, I wondered if there were any suggestions what to do.

Thanks, Tuc