sa ignoring whitelist_from in user_prefs
For a particular user, I'm finding no correlation between the whitelist_from entries in his user_prefs and the whitelist status reported on incoming messages. I see messages with no USER_IN_WHITELIST even though both the From and From: addresses match a whitelist_from line in the user_prefs file. I also see messages with USER_IN_WHITELIST where that userid is NOT listed in the user_prefs. What could cause this?

Thanks,
Rich
whitelist problems
I'm having a whitelist-related problem.

- A lot of spam comes through with WHITELISTED in the headers, yet I can never find the senders, IPs, etc. of said messages in any whitelist, including the auto-whitelist.
- The auto-whitelist is in use although I've disabled it everywhere.

I'm running SpamAssassin 3.1.7 through amavis on Red Hat. Besides /etc/amavisd.conf, /etc/mail/spamassassin/*, and the amavis user's home/.spamassassin/*, where else could there possibly be another whitelist? I've searched through all the conf files for alternate places but couldn't find any.

Why is the auto-whitelist in use? I've explicitly disabled it in /etc/amavisd.conf, /etc/mail/spamassassin/v310.pre, and /etc/mail/spamassassin/local.cf. The amavis user's home/.spamassassin/user_prefs is empty. But the auto-whitelist file still continues to grow, and mails occasionally get AWL in their headers.
Re: IADB, 70_iadb.cf and multiple A records returned
On Sat, 2007-02-10 at 00:00 -0500, Theo Van Dinter wrote:
> > If the last one is true, is the ^ $ really necessary? [...] If it really is a RE, what prevents '127.0.0.1' from matching 127.0.0.10? Or 127.1.0.1 from matching 127.120.1.1?
>
> You answered your own question. :)

OK, this answers the first one. It also implies that the sub-test value is always a RE and needs to be properly delimited. So, in the following cases:

header RCVD_IN_SBL eval:check_rbl_sub('sblxbl', '127.0.0.2')
header RCVD_IN_MAPS_RSS eval:check_rbl_sub('rblplus', '4')

(the last one is really commented out)

If Spamhaus decides to expand the return codes and 127.0.0.20 becomes valid for something like "this IP has an opt-in list", this rule would be broken, right? (Sure, we don't expect this change to happen.)

-Raul Dias
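The anchoring question can be checked with any regex engine. The following is a minimal sketch in Python rather than Perl, and it only illustrates general regex behaviour; it does not reproduce how check_rbl_sub treats its sub-test argument internally. The helper name `matches` is made up for this example.

```python
import re


def matches(pattern: str, rbl_answer: str) -> bool:
    """Return True if the regex pattern matches anywhere in the RBL answer."""
    return re.search(pattern, rbl_answer) is not None


# Unanchored and unescaped: the pattern can match a prefix of a longer
# address, and each "." matches any character.
unanchored = "127.0.0.2"

# Anchored with ^ and $ and with the dots escaped: only the exact
# address matches.
anchored = r"^127\.0\.0\.2$"

for answer in ("127.0.0.2", "127.0.0.20", "127x0y0z2"):
    print(answer, matches(unanchored, answer), matches(anchored, answer))
```

With the unanchored pattern, a hypothetical future return code such as 127.0.0.20 would also match, which is the breakage the question describes.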
pyzor error
pyzor stopped working on my Fedora Core 5 system. I get the following error:

    Traceback (most recent call last):
      File "/usr/bin/pyzor", line 3, in ?
        import pyzor.client
    ImportError: No module named pyzor.client

The contents of /usr/bin/pyzor are:

    #!/usr/bin/python
    import pyzor.client
    pyzor.client.run()

What's going on and how do I fix this? Thank you in advance!
Re: pyzor error
At 06:04 AM Saturday, 2/10/2007, you wrote:
> pyzor stopped working on my fedora core 5 system. I get the following error:
>
>     Traceback (most recent call last):
>       File "/usr/bin/pyzor", line 3, in ?
>         import pyzor.client
>     ImportError: No module named pyzor.client
>
> [...]
> What's going on and how do I fix this? Thank you in advance!

Did you check the wiki first? It's a wealth of information:

    If you get the following error message, define PYTHONPATH to point at ($HOME/lib/python):

    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ImportError: No module named pyzor.client

Ed
Re: pyzor error
Yes. That error message is slightly different, but in any case I do not understand what 'define PYTHONPATH to point at ($HOME/lib/python)' means (what/where/how).

----- Original Message -----
From: Ed Kasky [EMAIL PROTECTED]
To: Webmaster [EMAIL PROTECTED]
Cc: users@spamassassin.apache.org
Sent: Saturday, February 10, 2007 9:19 AM
Subject: Re: pyzor error

> Did you check the wiki first? It's a wealth of information:
>
>     If you get the following error message, define PYTHONPATH to point at ($HOME/lib/python):
> [...]
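As background to the what/where/how: PYTHONPATH is an environment variable, and the directories listed in it are added to the interpreter's module search path (`sys.path`) at startup. In a Bourne-style shell it is typically set with `export PYTHONPATH=$HOME/lib/python` before running the program. The sketch below only demonstrates that mechanism with a child interpreter; the helper name is made up for this example, and whether pyzor is actually installed under `$HOME/lib/python` on the poster's system is an assumption taken from the wiki text, not something the thread confirms.

```python
import os
import subprocess
import sys


def sys_path_with_pythonpath(extra_dir):
    """Start a child interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# The directory named in the wiki text; it does not need to exist for
# this demonstration, since PYTHONPATH entries are added as given.
lib_dir = os.path.join(os.path.expanduser("~"), "lib", "python")
for entry in sys_path_with_pythonpath(lib_dir):
    print(entry)
```

If the pyzor package lives in a directory that is not on this list, `import pyzor.client` fails with the ImportError shown above.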
Re: whitelist problems
urgrue wrote:
> I'm having a whitelist-related problem.
> - a lot of spam comes through with WHITELISTED in the headers, yet i can never find the senders, IPs, etc of said messages in any whitelists, including the auto-whitelist.
> - auto-whitelist is in use although I've disabled it everywhere.

The auto-whitelist has nothing to do with anything that says WHITELISTED. The auto-whitelist will show up as a rule named AWL. Nothing else.

That said, can you be VERY specific about what your headers say? Does it say USER_IN_WHITELIST? If so, check your whitelist_from and whitelist_from_rcvd entries.

In particular, make sure you didn't make the common mistake of adding something like whitelist_from [EMAIL PROTECTED]. Any spammer can trivially forge a From: or Return-Path header, and forging your own domain in these fields is a common tactic because spammers know many people make this mistake.

> I'm running spamassassin 3.1.7 through amavis on redhat. Besides /etc/amavisd.conf, /etc/mail/spamassassin/*, amavis user home/.spamassassin/*, where else could there possibly be another whitelist? I've searched through all the conf files for alternate places but couldn't find any.
> Why is auto-whitelist in use? I've explicitly disabled it in /etc/amavisd.conf, /etc/mail/spamassassin/v310.pre, /etc/mail/spamassassin/local.cf. amavis user home/.spamassassin/user_prefs is empty. But still the auto-whitelist file continues to grow and mails occasionally get AWL in their headers.

Well, disabling the loadplugin should do it. That said, did you restart amavis after doing so? (These files only get parsed when an SA instance starts up, and amavis keeps its own Perl-API-based copy of SA in order to avoid the waste of calling out to external commands like spamc or spamassassin.)
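The forgery point can be illustrated with a small sketch. whitelist_from entries are glob-style patterns compared against the address the message claims to come from. The code below approximates that with Python's `fnmatch`, which is an assumption made for illustration and not SpamAssassin's own matching code (`fnmatch` also interprets `[...]` character classes, for instance). The function name and the example.com addresses are hypothetical.

```python
from fnmatch import fnmatchcase


def in_whitelist(claimed_from, patterns):
    """Return True if the claimed sender address matches any glob pattern.

    Only the address written in the message is examined; nothing here
    verifies that the message really came from that sender.
    """
    addr = claimed_from.lower()
    return any(fnmatchcase(addr, pattern.lower()) for pattern in patterns)


# Hypothetical whole-domain entry, the kind of mistake described above.
patterns = ["*@example.com"]

print(in_whitelist("colleague@example.com", patterns))
print(in_whitelist("forged-by-spammer@example.com", patterns))
print(in_whitelist("someone@example.org", patterns))
```

A forged address in the whitelisted domain passes exactly as a genuine one does, which is why whitelist_from_rcvd, which also checks the relaying host, is mentioned alongside whitelist_from.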
Startting spamassassin
Hi,

I've just installed SpamAssassin. It's been a long time since I installed my last mail server, and I have never used version 3.

OK, I've compiled it and copied spamd to /etc/init.d. If I just run ./spamd start, it runs as root and ties up the terminal. So, I'm running ./spamd -u qscand start &.

Is there any place where I can configure the user qscan to be the user that SpamAssassin runs as by default?

Any help would be appreciated.

Warm Regards,
Mário Gamito
Re: Startting spamassassin
On Sat, 10 Feb 2007 16:44:16 +, Mário Gamito [EMAIL PROTECTED] wrote:
> Hi, I've just installed spamassassin.
> [...]
> Is there any place where i can configure the user qscan to be the user that spamassassin runs by default ? Any help would be appreciated.
> Warm Regards, Mário Gamito

From past experience it's usually easier to lob SA in from rpm/yum. I run it here on 3 servers and (knock on wood) this approach has yet to cause a problem. It's worth noting that one of the mail programs (whose name escapes me) installs SA; I pull that off as part of my setup since I don't use *nix as a workstation, so it has no reason to run a mail client.

After install it's just a matter of running setup from the command line and enabling spamassassin (if it hasn't already been enabled).

This will of course depend very much on exactly what flavour of *nix you are running, your mail server and various other things. I use CentOS and have been very pleased with it. Let me know if you need a step-by-step guide; I have one kicking about here somewhere from the 'old days' of FC3.

Hope that helps.

Nigel
Re: Startting spamassassin
Hi,

I have SpamAssassin already 100% installed on a Linux server. I just want to know how to run it as user qscand without having to type ./spamd -u qscand start &, so I can start it at boot time.

Regards,
Mário Gamito

Nigel Frankcom wrote:
> From past experience it's usually easier to lob SA in from rpm/yum.
> [...]
> Let me know if you need a step by step guide; I have one kicking about here somewhere from the 'old days' of FC3. Hope that helps. Nigel
Re: Startting spamassassin
On Sat, 10 Feb 2007 17:12:24 +, Mário Gamito [EMAIL PROTECTED] wrote:
> Hi, I have spamassassin already 100% installed in a Linux server. I just want to know how to run it as user qscand without having to type ./spamd -u qscand start &, so i can start it at boot time.
> Regards, Mário Gamito
>
> Nigel Frankcom wrote:
> > [...]

I'm assuming you are running this under qmail? If I recall correctly, www.qmailrocks.org has a decent section on getting SA working with qmail, though it's been a while since I tried.

Hope that helps.

Kind regards
Nigel
Re: whitelist problems
On Sat, 10 Feb 2007, Matt Kettler wrote:
> In particular, make sure you didn't do anything like the common mistake of whitelist_from [EMAIL PROTECTED]. Any spammer can trivially forge a From: or Return-Path header, and forging your own domain in these fields is a common tactic because spammers know many people make this mistake.

Perhaps --lint should warn about whitelist_from being used at all...

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
Re: Startting spamassassin
Mário Gamito wrote:
> Hi, I've just installed spamassassin. I'ts been a long time since i've installed the last mail server and i never used version 3. Ok, i've compiled it and copied spamd to /etc/init.d

Don't do that. spamd isn't an init script. It's the daemon executable itself. It belongs in /usr/sbin or similar. I would *STRONGLY* suggest using make install to put spamd where it belongs.

If you need an init script, look in the spamd directory. There are several init scripts in there you can work from (they all end in rc-script.sh).

> If i just run ./spamd start, it will run as root and stucks the terminal.

Well, don't pass the word start to spamd, for starters. That's something you pass to an init script.

> So, i'm running ./spamd -u qscand start &.

You got the -u part right... But don't use &, use the -d option instead. However, all of this really should be rolled up in one of the provided init scripts.

> Is there any place where i can configure the user qscan to be the user that spamassassin runs by default ?

Yes, the -u parameter to spamd. But you first need to get your system sorted out into something resembling sanity.
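Putting the advice above together, the invocation that an init script would end up running combines -d (daemonize instead of backgrounding with &) and -u (the user to run as). The sketch below only assembles and prints that command line; it does not start anything. The install path /usr/sbin/spamd is an assumption based on the "/usr/sbin or similar" remark, and the helper name is made up for this example.

```python
import shlex


def spamd_argv(run_as_user, spamd_path="/usr/sbin/spamd"):
    """Build the spamd argument list: daemonize (-d) and run as a user (-u)."""
    # Note that no "start" word is included: that belongs to init scripts,
    # not to spamd itself.
    return [spamd_path, "-d", "-u", run_as_user]


argv = spamd_argv("qscand")
# Print the command as it would be typed in a shell.
print(" ".join(shlex.quote(arg) for arg in argv))
```

In practice this line would live inside one of the shipped rc-script.sh files rather than be typed by hand, so that the same user is used on every boot.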
A New Approach: Find the Ham
I've developed a new approach to scoring that I want to 1) share with everyone and 2) make into a working system that's as accurate as what I've already built, but easier to use. First, the theory:

SITUATION
In the beginning, all email was ham. When spam came along, we left the ham alone and targeted the annoyance (spam).

ASSUMPTION
All messages are ham unless x,y,z score says they're spam.

APPROACH
Block nothing, then create rules to catch what you don't want. I.e., build tests that target the spam, then score the millions of ways spam can occur.

RESULT
Huge time spent tuning and retuning weights, catching everything in sight (including much ham).

NEW SITUATION
Ham is now the tiniest minority of all email.

NEW ASSUMPTION
All messages are spam unless x,y,z score says they're ham.

NEW APPROACH
Block everything, then create rules to not catch what you do want. I.e., build tests that target the spam (keeping all the tests you've already built), then score the thousands of ways ham triggers on those tests.

NEW RESULT
Spend less time and energy while catching more of what you do want and less of what you don't.

CHALLENGE
All filtering software is written to score for results that equal spam - catch the bad.

SOLUTION
Make filtering software score for results that equal ham - uncatch the good.

Your thoughts?

Dan

BTW, is there a better forum for this level of question?
RE: A New Approach: Find the Ham
From: Dan [mailto:[EMAIL PROTECTED]
> I've developed a new approach to scoring that I want to 1) share with everyone and 2) make into a working system thats as accurate as what I've already built, but easier to use.
> [...]
> NEW RESULT Spend less time and energy while catching more of what you do want and less of what you don't.
> CHALLENGE All filtering software is written to score for results that equal spam - catch the bad
> SOLUTION Make filtering software score for results that equal ham - uncatch the good.
> Your thoughts?

How can this method spend less time and energy? Aren't you going to build a mirror image of the current method? Wouldn't your rules be the same as the current ones, just negated?

Giampaolo

> Dan
> BTW, is there a better forum for this level of question?
Re: A New Approach: Find the Ham
On Sat, 10 Feb 2007 20:52:17 +0100, Giampaolo Tomassoni [EMAIL PROTECTED] wrote:
> > From: Dan [mailto:[EMAIL PROTECTED]
> > [...]
> > SOLUTION Make filtering software score for results that equal ham - uncatch the good. Your thoughts?
>
> How can this method spend less time and energy? Aren't you going to build a mirrored method with respect to the actual one? Your rules wouldn't be like the actual ones, but negated?
>
> Giampaolo

Dan has a good point; on the surface at least. Spam now accounts for 80%+ of all mail, so why are we concentrating on that? At least the point is worth debate (IMHO).

Can it be done? Even I can see that it can, given the right impetus. Though perhaps too many companies are making a good $/£/Y off anti-spam systems based on, around or directly using SA.

Be interesting to see where this thread goes.

Kind regards
Nigel
Re: A New Approach: Find the Ham
> > CHALLENGE All filtering software is written to score for results that equal spam - catch the bad
> > SOLUTION Make filtering software score for results that equal ham - uncatch the good. Your thoughts?
>
> How can this method spend less time and energy? Aren't you going to build a mirrored method with respect to the actual one? Your rules wouldn't be like the actual ones, but negated?
> Giampaolo

This would be easier to filter. It would also be more adaptive to a statistical approach than a regex approach.

Personally, I think HTML email should be outright discarded from the start. If you look at the argument presented by the OP, it reinforces the idea that most ASCII is ham and most HTML is spam. Therefore, reject delivery of all HTML-based email. Or, to be more succinct: reject any MIME type of alternative content or HTML-only content. That would remove probably 90% of the spam in one shot.
Re: A New Approach: Find the Ham
Dan wrote:
> I've developed a new approach to scoring that I want to 1) share with everyone and 2) make into a working system thats as accurate as what I've already built, but easier to use. First, the theory:
> NEW ASSUMPTION All messages are spam unless x,y,z score says they're ham.
> NEW APPROACH Block everything, then create rules to not catch what you do want. ie, build tests that target the spam (keeping all the tests you've already built), then score the thousands of ways ham triggers on those tests.

It strikes me that the hardest part of this approach is filtering out too much ham. At least for me, it's more important to make sure that people reach me than to filter out all spam. If we take the approach that everything is to be filtered out, except x,y,z, then the risk of filtering out too much seems pretty high.
RE: A New Approach: Find the Ham
From: Tom Allison [mailto:[EMAIL PROTECTED]
> [...]
> Personally, I think HTML email should be outright discarded from the start. If you look at this arguement presented by the OP then it reinforces the idea that most ascii is ham and most html is spam. Therefore, reject delivery of all html based email. Or to be more succinct -- reject any MIME type of alternative content or html only content. That would remove probably 90% of the spam in one shot.

Sending text/ASCII e-mails may well fit your habits and those of your contacts, but it would result in trashing a lot of ham on larger userbases.

Giampaolo
Re: A New Approach: Find the Ham
One consideration is that spam getting through is never more than an annoyance. Ham getting caught can be a big problem. So any kind of deny-by-default system has to deal with how to respond to people sending you mail that gets trapped, and provide a way for the sender to get approval. How does one join the global whitelist, and how does one prevent spammers from joining it?

I don't think spam will ever go away until sending email costs money, via some kind of global digital stamp system. Which, frankly, I would welcome with open arms, but will probably never happen.

> Dan has a good point; on the surface at least. spam now accounts for 80%+ of all mail, so why are we concentrating on that? At least the point is worth debate (IMHO). Can it be done? Even I can see that it can, given the right impetus. Though perhaps too many companies are making a good $/£/Y off anti-spam systems based on, around or directly using SA. Be interesting to see where this thread goes.
> Kind regards Nigel
Re: A New Approach: Find the Ham
> This would be easier to filter. It would also be more adaptive to a statistical approach than a regex approach. Personally, I think HTML email should be outright discarded from the start. If you look at this arguement presented by the OP then it reinforces the idea that most ascii is ham and most html is spam. Therefore, reject delivery of all html based email. Or to be more succinct -- reject any MIME type of alternative content or html only content. That would remove probably 90% of the spam in one shot.

Yeah, for about a week. Obviously they won't keep sending HTML mail if everyone is blocking it, right?
Re: A New Approach: Find the Ham
On Sat, 10 Feb 2007 15:14:56 -0500, Miles Fidelman [EMAIL PROTECTED] wrote:
> Dan wrote:
> > [...]
> > NEW ASSUMPTION All messages are spam unless x,y,z score says they're ham.
> > NEW APPROACH Block everything, then create rules to not catch what you do want. [...]
>
> It strikes me that the hardest part of this approach is filtering out too much ham. At least for me, it's more important to make sure that people reach me, than to filter out all spam. If we take the approach that everything is to be filtered out, except x,y,z - then the risk of filtering out too much seems pretty high.

These are my local stats... I'd far rather those numbers were the other way round. Even if Dan is wrong, at least he's thinking.

http://www.blue-canoe.com/stats/index.php?D1=11

What do Theo, Matt & Co have to say? They've been doing this a lot longer than us.

Kind regards
RE: A New Approach: Find the Ham
From: Miles Fidelman [mailto:[EMAIL PROTECTED]
> Dan wrote:
> > [...]
> > NEW ASSUMPTION All messages are spam unless x,y,z score says they're ham.
> > NEW APPROACH Block everything, then create rules to not catch what you do want. [...]
>
> It strikes me that the hardest part of this approach is filtering out too much ham. At least for me, it's more important to make sure that people reach me, than to filter out all spam. If we take the approach that everything is to be filtered out, except x,y,z - then the risk of filtering out too much seems pretty high.

I definitely agree with you. By the way, if Dan really has brought us a new perspective (i.e., a new way to detect ham), what would stop us from integrating it into SA? I would like to see this new perspective, however...

Giampaolo
Re: A New Approach: Find the Ham
Clarifications:

1) I'm not talking about generating new rules. Rules stay the same. I'm describing a new scoring process only.
2) This would not be a replacement for SA, but an improvement. Just a new way to process results already generated by SA. Ideally, this would be a replacement for weights and metas.

Dan

> How can this method spend less time and energy? Aren't you going to build a mirrored method with respect to the actual one? Your rules wouldn't be like the actual ones, but negated?
> Giampaolo

> Dan has a good point; on the surface at least. spam now accounts for 80%+ of all mail, so why are we concentrating on that? At least the point is worth debate (IMHO). Can it be done? Even I can see that it can, given the right impetus. Though perhaps too many companies are making a good $/£/Y off anti-spam systems based on, around or directly using SA. Be interesting to see where this thread goes.
> Kind regards Nigel
Re: A New Approach: Find the Ham
Is that the same as whitelisting? Maybe I do not understand, but a very rigorous approach would be a whitelist methodology: once a new account is created, the user sends email to everyone they want to communicate with, and those addresses are 'autowhitelisted', so the user can only receive mail from those they communicate with (or want to). In other words, the user has to authorize the receipt of a message by adding the sender to the whitelist, which makes the email address owner solely responsible for what they receive.

The main problem (although someone may be able to come up with an appropriate compromise) is that if everyone were using this methodology, how would anyone ever receive email?

Nonetheless, since there is less ham than spam nowadays, it makes more sense to do what you are saying and deal only with the traffic the user wishes to see instead of that which they don't. The processing needed to deal with this would also be less stressful on machine resources, i.e. fewer resources would be consumed dealing with incoming crap (er, mail, I mean). Stop it at the connection... maybe a ulog plugin.

Just a thought.

Miles Fidelman wrote:
> Dan wrote:
> > [...]
>
> It strikes me that the hardest part of this approach is filtering out too much ham. At least for me, it's more important to make sure that people reach me, than to filter out all spam. If we take the approach that everything is to be filtered out, except x,y,z - then the risk of filtering out too much seems pretty high.
Re: whitelist problems
> The auto-whitelist has nothing to do with anything that says WHITELISTED. The auto-whitelist will show up as a rule named AWL. Nothing else. That said, can you be VERY specific about what your headers say? Does it say USER_IN_WHITELIST? If so, check your whitelist_from and whitelist_from_rcvd entries.

It says, precisely:

    X-Spam-Status: No, hits=- tagged_above=-.0 required=5.0 WHITELISTED

So if it's not whitelist_from or the AWL, what can it be?

> Well, disabling the loadplugin should do it. That said, did you restart amavis after doing so? (these files only get parsed when a SA instance starts up, and amavis keeps its own perl-API based copy of SA in order to avoid the waste of calling out to external commands like spamc or spamassassin)

AWL was never enabled, and I did restart amavis many times. The loadplugin line is commented out, and I restarted amavis just now to make absolutely sure and deleted the auto-whitelist file, but back it came. I don't get it.
Re: whitelist problems
On Sat, Feb 10, 2007 at 10:34:35PM +0200, urgrue wrote:
> It says, precisely: X-Spam-Status: No, hits=- tagged_above=-.0 required=5.0 WHITELISTED
> So if its not whitelist_from or the AWL, what can it be?

That's not an SA header, so I'm guessing you call SA from a third-party daemon. I'd look there.

> AWL was never enabled, and I did restart amavis many times. The [...]

Aha. amavis...
Re: A New Approach: Find the Ham
On Feb 10, 2007, at 12:14, Miles Fidelman wrote:
> Dan wrote:
> > [...]
> > NEW ASSUMPTION All messages are spam unless x,y,z score says they're ham.
> > NEW APPROACH Block everything, then create rules to not catch what you do want. [...]
>
> It strikes me that the hardest part of this approach is filtering out too much ham. At least for me, it's more important to make sure that people reach me, than to filter out all spam. If we take the approach that everything is to be filtered out, except x,y,z - then the risk of filtering out too much seems pretty high.

Actually, [unparalleled] accuracy is built into this approach.

Currently, a ham gets caught and you either take out the rule that caught it or make a whitelist entry.

Lots of ongoing work = little cumulative return

With Find the Ham, whitelisting is almost obsolete. When you find an FP, you make an exception for the specific profile, i.e. the permutation of tests/rules that caught the message, so that this specific combination doesn't catch any more. The rule stays at full strength for every other permutation, and no whitelist is needed.

This training process is the best part of the whole approach. It begins with huge numbers of FPs, but significant improvements take only a few weeks. After a few months (depending on the diversity of your ham), FPs are very, very rare.

Little ongoing work = huge cumulative return

Dan
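One possible reading of the profile idea can be sketched as follows. This is an interpretation made for illustration, not Dan's actual system: a profile is taken to be the exact set of rule names a message hit, every message is treated as spam by default, and reporting a false positive records that one profile as ham. The class name, method names and the rule names used in the example are all assumptions.

```python
from typing import FrozenSet, Iterable, Set


class ProfileFilter:
    """Default-deny filter keyed on the exact combination of rules hit."""

    def __init__(self) -> None:
        # Profiles (rule combinations) that have been confirmed as ham.
        self.ham_profiles: Set[FrozenSet[str]] = set()

    @staticmethod
    def profile(rule_hits: Iterable[str]) -> FrozenSet[str]:
        """Reduce a list of rule hits to an order-independent profile."""
        return frozenset(rule_hits)

    def is_ham(self, rule_hits: Iterable[str]) -> bool:
        """A message is ham only if its exact profile was recorded as ham."""
        return self.profile(rule_hits) in self.ham_profiles

    def report_false_positive(self, rule_hits: Iterable[str]) -> None:
        """Record this one profile as ham; other profiles are unaffected."""
        self.ham_profiles.add(self.profile(rule_hits))


filt = ProfileFilter()
newsletter = ["HTML_MESSAGE", "MIME_HTML_ONLY"]  # illustrative rule names

print(filt.is_ham(newsletter))       # blocked until trained
filt.report_false_positive(newsletter)
print(filt.is_ham(newsletter))       # this exact combination now passes
print(filt.is_ham(["HTML_MESSAGE"])) # a different combination is still blocked
```

The sketch also shows why the training period starts with many false positives: every ham profile has to be seen and reported once before it is let through.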
How to Scan just incoming not outcoming emails?
Hi:

I have a CentOS Linux server running Apache, Sendmail, SpamAssassin and MailScanner. This server is POP as well as SMTP for all the mailboxes of my customers.

Currently, SpamAssassin on this server filters both the emails that are being received and the emails that are being sent. This is giving my server a really heavy load.

I think I have neither the need nor the obligation to filter for spam the emails those mailboxes are sending. That is a task for the users at their own desktops and networks. I only have to filter what is being received.

So, my question is: is it possible to set up Sendmail / SpamAssassin so that only incoming emails are filtered? If so, please tell me what to do. But please tell me like a cooking recipe, because I am not very experienced with operating systems.

Thanks a lot.

Mario./
Re: How to Scan just incoming not outcoming emails?
On Sat, Feb 10, 2007 at 07:42:55PM -0300, correiob wrote:
> So, my question is: is it possible to set Sendmail / Spam Assassin in order filters just the receiving emails? If so, please, tell me what to do. But, please, tell me like a cooking recipe, because I am not quite experienced with operating systems. Thanks a lot.

Probably, but it's not a SpamAssassin question. You need to look at your setup and change it so that it only sends the mails you want scanned to SA. According to your mail, you're using MailScanner, so you can look at their docs or ask them for more information.

--
Randomly Selected Tagline: Clones are people two.
Re: How to Scan just incoming not outcoming emails?
At 02:42 PM 2/10/2007, correiob wrote:
> Hi: I have a Centos Linux, running Apache, Sendmail, Spam Assassin and MailScanner. This Server is POP as well as SMTP for all the mailboxes of my customers. Actually, the SpamAssassin at this Server filters both, the emails that are being received and the emails that are being sent. This is giving my Server a really heavy load. I think I don't have neither the need (nor the obligation) to filter agains spam the emails those mailboxes are sending. This is a task up to the users at their own desktops and networks. But I have to filter just what is being received. So, my question is: is it possible to set Sendmail / Spam Assassin in order filters just the receiving emails? If so, please, tell me what to do. But, please, tell me like a cooking recipe, because I am not quite experienced with operating systems. Thanks a lot.

This is more of a sendmail question, so if no one here can answer, you may look there. I run Sendmail with postfix and spamassassin and only my incoming mails are scanned. And as SpamAssassin only does what it's told, you've somehow told sendmail to scan outgoing e-mails. Perhaps something in your sendmail config file?
Re: A New Approach: Find the Ham
> NEW SITUATION
> Ham is now the tiniest minority of all email.
>
> NEW ASSUMPTION
> All messages are spam unless x,y,z score says they're ham.
>
> NEW APPROACH
> Block everything, then create rules to not catch what you do want. ie, build tests that target the spam (keeping all the tests you've already built), then score the thousands of ways ham triggers on those tests.
>
> NEW RESULT
> Spend less time and energy while catching more of what you do want and less of what you don't.
>
> CHALLENGE
> All filtering software is written to score for results that equal spam - catch the bad
>
> SOLUTION
> Make filtering software score for results that equal ham - uncatch the good.
>
> Your thoughts?

Here is my $0,02. I have a similar approach already.

My problem is that 80% of the messages are in pt_BR, which makes a lot of the SA rules that target English ineffective. There is a large grey area containing too much spam (FN) and ham (FP).

So, my approach is to quarantine mail for some users at scores as low as 4.0 (or even less). This mail is moved to an IMAP folder and then manually sorted into ham and spam folders. This lets rules be created to catch spam, but also to catch ham (which is harder and more dangerous ground). If necessary, white and black lists are created, but this is a last resort as it is not an affordable/scalable solution. The spam and ham folders are then used for training with sa-learn, and the ham is given back to the user if necessary.

This approach has a drawback: explicit authorization from the user is necessary (in my view). So a user who wants to help may choose either to let their mail be quarantined and then get it back, or to let their mail (above a 4.0 score) be analysed but not quarantined (only a copy is kept, so nothing needs to be given back).

A good side of this is that it is not necessary for many users to let their mail be analysed. The rules will improve for everyone based on a few users.

Bayes also plays a more important role than in an English environment, because of the lack of good rules in the native language. Site-wide Bayes is missed (per-user is used), but it would help separate the grey area even more for non-monitored or low-volume users.

On the scripting side I use Mail::IMAPClient, and I urge anyone writing their own scripts to stay away from Mail::Box.

-Raul Dias
Re: IADB, 70_iadb.cf and multiple A records returned
On Sat, Feb 10, 2007 at 10:09:35AM -0300, Raul Dias wrote:
> This also implies that the sub-test values is always a RE and needs to be proper delimeted.

If you read perldoc Mail::SpamAssassin::Conf, specifically the check_rbl_sub() section, it'll explain what the subtests can be. It can be several things, including an RE.

--
Randomly Selected Tagline: No, I'm not interested in developing a powerful brain. All I'm after is just a mediocre brain, something like the president of American Telephone and Telegraph Company. -- Alan Turing on the possibilities of a thinking machine, 1943.
Re: How to Scan just incoming not outcoming emails?
Hi, Evan / Theo:

Well, as far as I have understood, my Sendmail / MailScanner setup is responsible for sending the emails to be filtered to SpamAssassin, so I have to configure Sendmail / MailScanner so that they send only incoming emails to SA, right?

Thanks a lot.

Mario./

At 18:47 10/2/2007, Evan Platt wrote:
> At 02:42 PM 2/10/2007, correiob wrote:
>> Hi: I have a Centos Linux, running Apache, Sendmail, Spam Assassin and MailScanner. This Server is POP as well as SMTP for all the mailboxes of my customers. Actually, the SpamAssassin at this Server filters both, the emails that are being received and the emails that are being sent. This is giving my Server a really heavy load. I think I don't have neither the need (nor the obligation) to filter agains spam the emails those mailboxes are sending. This is a task up to the users at their own desktops and networks. But I have to filter just what is being received. So, my question is: is it possible to set Sendmail / Spam Assassin in order filters just the receiving emails? If so, please, tell me what to do. But, please, tell me like a cooking recipe, because I am not quite experienced with operating systems. Thanks a lot.
>
> This is more of a sendmail question, so if no one here can answer, you may look there. I run Sendmail with postfix and spamassassin and only my incoming mails are scanned. And as SpamAssassin only does what it's told, you've somehow told sendmail to scan outgoing e-mails. Perhaps something in your sendmail config file?
Re: A New Approach: Find the Ham
On Sat, 10 Feb 2007, Dan wrote:
> With Find the Ham, whitelisting is almost obsolete. When you find an FP,

How do you ever find FPs if you have so many TP to sort through that it's not even worth sorting through FP+TP to find the FP ? IMHO, that'd be why we assume that mails are ham rather than assume that they are spam.

Mathieu Bouchard - tél:+1.514.383.3801 - http://artengine.ca/matju
Freelance Digital Arts Engineer, Montréal QC Canada
Re: IADB, 70_iadb.cf and multiple A records returned
On Sat, 2007-02-10 at 16:53 -0500, Theo Van Dinter wrote:
> On Sat, Feb 10, 2007 at 10:09:35AM -0300, Raul Dias wrote:
>> This also implies that the sub-test values is always a RE and needs to be proper delimeted.
>
> If you read perldoc Mail::SpamAssassin::Conf, specifically the check_rbl_sub() section, it'll explain what the subtests can be. It can be several things, including an RE.

Yes, I read that. The question is: what makes it an RE, if not the delimiter? As we discussed earlier, the ^ $ is necessary to avoid matching other numbers, which is only possible if the value is an RE. So:

1 - '^127.0.0.1$' matches only 127.0.0.1, and that's an RE.
2 - '127.0.0.1' might match 127.0.0.12 (if it is considered an RE).

If 2 is false, then 1 is unnecessary, right?

-Raul Dias
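The difference between cases 1 and 2 can be checked with a short sketch. It uses Python's re module as a stand-in for Perl regular expressions (the two agree on these constructs) and only illustrates regex semantics. It says nothing about how check_rbl_sub() decides whether a sub-test value is an RE, which is what the perldoc section Theo points to describes. The helper name is hypothetical.

```python
import re


def re_matches(pattern, value):
    """True if the pattern is found anywhere in value (unanchored search)."""
    return re.search(pattern, value) is not None


unanchored = "127.0.0.1"            # case 2: dots are wildcards, no anchors
anchored = r"^127\.0\.0\.1$"        # case 1, with the dots escaped as well

if __name__ == "__main__":
    for candidate in ("127.0.0.1", "127.0.0.12", "127.0.0.10"):
        print(candidate,
              "unanchored:", re_matches(unanchored, candidate),
              "anchored:", re_matches(anchored, candidate))
```

If the value is treated as an RE, the unanchored form also matches 127.0.0.12 and 127.0.0.10, so the anchors are what restrict the match to exactly 127.0.0.1.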
Re: whitelist problems
John D. Hardin wrote:
> On Sat, 10 Feb 2007, Matt Kettler wrote:
>> In particular, make sure you didn't do anything like the common mistake of whitelist_from [EMAIL PROTECTED]. Any spammer can trivially forge a From: or Return-Path header, and forging your own domain in these fields is a common tactic because spammers know many people make this mistake.
>
> Perhaps --lint should warn about whitelist_from being used at all...

Or at least warn if you haven't set an option:

    yes_i_understand_whitelist_from_sucks 1

After all, generating a lint warning will mess up folks who use RDJ.. there should be a way to disable the warning for people who really have no other option. (otherwise they'd just remove the damn thing.)
Re: A New Approach: Find the Ham
On Feb 10, 2007, at 14:38, Mathieu Bouchard wrote:
> How do you ever find FPs if you have so many TP to sort through that it's not even worth sorting through FP+TP to find the FP ? IMHO, that'd be why we assume that mails are ham rather than assume that they are spam.

I haven't found FP reviewing to be a big deal. In my latest SA based configuration, for example, I organize captures according to the quantity of tests a given message fails. The more tests are involved, the less a message needs to be double checked.

So as with other particulars, ease of use will depend on how well the approach is implemented.

Dan
Re: whitelist problems
urgrue wrote:
>> The auto-whitelist has nothing to do with anything that says WHITELISTED. The auto-whitelist will show up as a rule named AWL. Nothing else. That said, can you be VERY specific about what your headers say? Does it say USER_IN_WHITELIST? If so, check your whitelist_from and whitelist_from_rcvd entries.
>
> It says, precisely:
> X-Spam-Status: No, hits=- tagged_above=-.0 required=5.0 WHITELISTED
> So if its not whitelist_from or the AWL, what can it be?

That's nothing coming from spamassassin. There's no such thing as WHITELISTED in SA, so that must be an amavis thing. Also, the lack of any score is a good hint SA didn't do it. All of SA's whitelists are just score modifiers.

Spamassassin is always more specific than just WHITELISTED. AFAIK, this is a complete list of whitelist rules for SA:

USER_IN_WHITELIST
USER_IN_DEF_WHITELIST
SUBJECT_IN_WHITELIST
USER_IN_DKIM_WHITELIST
USER_IN_DEF_DKIM_WL
USER_IN_SPF_WHITELIST
USER_IN_DEF_SPF_WL
USER_IN_DK_WHITELIST
USER_IN_DEF_DK_WL
USER_IN_WHITELIST_TO
USER_IN_MORE_SPAM_TO
USER_IN_ALL_SPAM_TO
AWL

Note there are some other things with negative scores, like bonded sender, habeas coi, and hashcash, but none of these has rule names even remotely related to the word white.
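A quick way to apply this check to a header is sketched below. It assumes the test names have already been pulled out of the X-Spam-Status line as a comma- or space-separated string. The rule set is copied from the list in this message (which is itself given as "AFAIK" for the SA of that time), and the function name is hypothetical.

```python
# Whitelist-related rule names as listed in the message above.
SA_WHITELIST_RULES = {
    "USER_IN_WHITELIST", "USER_IN_DEF_WHITELIST", "SUBJECT_IN_WHITELIST",
    "USER_IN_DKIM_WHITELIST", "USER_IN_DEF_DKIM_WL",
    "USER_IN_SPF_WHITELIST", "USER_IN_DEF_SPF_WL",
    "USER_IN_DK_WHITELIST", "USER_IN_DEF_DK_WL",
    "USER_IN_WHITELIST_TO", "USER_IN_MORE_SPAM_TO", "USER_IN_ALL_SPAM_TO",
    "AWL",
}


def sa_whitelist_hits(tests):
    """Return the names in `tests` that are SpamAssassin whitelist rules."""
    names = tests.replace(",", " ").split()
    return [name for name in names if name in SA_WHITELIST_RULES]


if __name__ == "__main__":
    # The token from urgrue's header is not an SA rule name at all.
    print(sa_whitelist_hits("WHITELISTED"))
    print(sa_whitelist_hits("BAYES_00,USER_IN_WHITELIST,AWL"))
```

An empty result for a header that still claims a whitelist hit points at whatever wraps SA (amavis here) rather than at SA's own whitelists.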
How to use eval: methods without calling check?
I'd like to programmatically call the methods SA uses to check for 8bit charsets and the like, but I personally do not care to make use of the rules engine at all.

Do I need an instance of PerMsgStatus fully set up before I can call eval: methods programmatically?

For instance, I already have

    my $spamtest = new Mail::SpamAssassin({
        PREFIX             => $PREFIX,
        DEF_RULES_DIR      => $DEF_RULES_DIR,
        LOCAL_RULES_DIR    => $LOCAL_RULES_DIR,
        LOCAL_STATE_DIR    => $LOCAL_STATE_DIR,
        userprefs_filename => "$PREFIX/.spamassassin/user_prefs",
        userstate_dir      => "$PREFIX/.spamassassin",
        debug              => $debugLevel
    });

but I do not want to have to call check().

I'm looking to call things like check_for_faraway_charset and check_for_faraway_charset_in_headers.
Charset dealing in SA
I am writing some rules containing accented characters, which are outside ASCII. In my case they are in ISO-8859-1, and I am sure they will match messages encoded in ISO-8859-1. However, how will they behave against a different charset (utf-8) in the message body?

Does SA do anything about this, like converting everything to utf-8 before running the REs? This is something to consider with most non-English languages. Or is this something I shouldn't worry about?

-Raul Dias
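Why the question matters can be seen at the byte level. The sketch below makes no claim about what SA does internally. It only shows that the same accented word is a different byte sequence in ISO-8859-1 and in UTF-8, so a pattern written as ISO-8859-1 bytes cannot match a UTF-8 body unless something decodes or normalizes first. The sample word is an arbitrary example.

```python
import re

word = "promo\u00e7\u00e3o"                 # "promoção", an arbitrary pt_BR word

latin1_bytes = word.encode("iso-8859-1")    # one byte per accented letter
utf8_bytes = word.encode("utf-8")           # two bytes per accented letter

# A rule written in an ISO-8859-1 file is effectively this byte pattern.
latin1_rule = re.compile(re.escape(latin1_bytes))

matches_latin1_body = latin1_rule.search(latin1_bytes) is not None
matches_utf8_body = latin1_rule.search(utf8_bytes) is not None

# After decoding each body with its own charset, the text is identical.
same_text = latin1_bytes.decode("iso-8859-1") == utf8_bytes.decode("utf-8")

if __name__ == "__main__":
    print("rule matches ISO-8859-1 body:", matches_latin1_body)
    print("rule matches UTF-8 body:", matches_utf8_body)
    print("decoded text identical:", same_text)
```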
bad OCR with some GIF images
Hello,

I'm using SA 3.1.7 with FuzzyOCR 3.5.1. This month I started having trouble with some GIF spams. The OCR can't recognize them and prints out only a few letters after doing the OCR. Has anybody seen this?

Max

[EMAIL PROTECTED] f]# spamassassin --debug FuzzyOcr Přep\:\ Now\ this\ is\ clearly\ not\ re.eml /dev/null
[21573] dbg: FuzzyOcr: focr_bin_helper: 'pnmnorm,pnminvert,pamthreshold,ppmtopgm,pamtopnm'
[21573] info: FuzzyOcr: Adding 5 new helper apps
[21573] dbg: FuzzyOcr: focr_bin_helper: 'tesseract'
[21573] info: FuzzyOcr: Adding 1 new helper apps
[21573] info: FuzzyOcr: Starting preprocessor parser for file /etc/mail/spamassassin/FuzzyOcr.preps...
[21573] dbg: FuzzyOcr: line: preprocessor normalize {
[21573] dbg: FuzzyOcr: line: command = pnmnorm
[21573] dbg: FuzzyOcr: line: }
[21573] dbg: FuzzyOcr: line: preprocessor invert {
[21573] dbg: FuzzyOcr: line: command = pnminvert
[21573] dbg: FuzzyOcr: line: }
[21573] dbg: FuzzyOcr: line: preprocessor ppmtopgm {
[21573] dbg: FuzzyOcr: line: command = ppmtopgm
[21573] dbg: FuzzyOcr: line: }
[21573] dbg: FuzzyOcr: line: preprocessor pamtopnm {
[21573] dbg: FuzzyOcr: line: command = pamtopnm
[21573] dbg: FuzzyOcr: line: }
[21573] dbg: FuzzyOcr: line: preprocessor pamthreshold {
[21573] dbg: FuzzyOcr: line: command = pamthreshold
[21573] dbg: FuzzyOcr: line: args = -simple -threshold 0.5
[21573] dbg: FuzzyOcr: line: }
[21573] dbg: FuzzyOcr: line: preprocessor maketiff {
[21573] dbg: FuzzyOcr: line: command = pnmtotiff
[21573] dbg: FuzzyOcr: line: args = -color -truecolor
[21573] dbg: FuzzyOcr: line: }
[21573] info: FuzzyOcr: Starting scanset parser for file /etc/mail/spamassassin/FuzzyOcr.scansets...
[21573] dbg: FuzzyOcr: line scanset ocrad {
[21573] dbg: FuzzyOcr: line command = $ocrad
[21573] dbg: FuzzyOcr: line args = -s5 $input
[21573] dbg: FuzzyOcr: line }
[21573] dbg: FuzzyOcr: line scanset ocrad-invert {
[21573] dbg: FuzzyOcr: line command = $ocrad
[21573] dbg: FuzzyOcr: line args = -s5 -i $input
[21573] dbg: FuzzyOcr: line }
[21573] dbg: FuzzyOcr: line scanset ocrad-decolorize-invert {
[21573] dbg: FuzzyOcr: line preprocessors = ppmtopgm, pamthreshold, pamtopnm
[21573] dbg: FuzzyOcr: line command = $ocrad
[21573] dbg: FuzzyOcr: line args = -s5 -i $input
[21573] dbg: FuzzyOcr: line }
[21573] dbg: FuzzyOcr: line scanset ocrad-decolorize {
[21573] dbg: FuzzyOcr: line preprocessors = ppmtopgm, pamthreshold, pamtopnm
[21573] dbg: FuzzyOcr: line command = $ocrad
[21573] dbg: FuzzyOcr: line args = -s5 $input
[21573] dbg: FuzzyOcr: line }
[21573] dbg: FuzzyOcr: line scanset gocr {
[21573] dbg: FuzzyOcr: line command = $gocr
[21573] dbg: FuzzyOcr: line args = -i $input
[21573] dbg: FuzzyOcr: line }
[21573] dbg: FuzzyOcr: line scanset gocr-180 {
[21573] dbg: FuzzyOcr: line command = $gocr
[21573] dbg: FuzzyOcr: line args = -l 180 -d 2 -i $input
[21573] dbg: FuzzyOcr: line }
[21573] info: FuzzyOcr: Searching in: /usr/local/netpbm/bin
[21573] info: FuzzyOcr: Searching in: /usr/local/bin
[21573] info: FuzzyOcr: Searching in: /usr/bin
[21573] info: FuzzyOcr: Using gifsicle = /usr/bin/gifsicle
[21573] dbg: FuzzyOcr: Using giffix = /bin/giffix
[21573] dbg: FuzzyOcr: Using giftext = /bin/giftext
[21573] dbg: FuzzyOcr: Using gifinter = /bin/gifinter
[21573] info: FuzzyOcr: Using giftopnm = /usr/bin/giftopnm
[21573] info: FuzzyOcr: Using jpegtopnm = /usr/bin/jpegtopnm
[21573] info: FuzzyOcr: Using pngtopnm = /usr/bin/pngtopnm
[21573] info: FuzzyOcr: Using bmptopnm = /usr/bin/bmptopnm
[21573] info: FuzzyOcr: Using tifftopnm = /usr/bin/tifftopnm
[21573] info: FuzzyOcr: Using ppmhist = /usr/bin/ppmhist
[21573] info: FuzzyOcr: Using pamfile = /usr/bin/pamfile
[21573] info: FuzzyOcr: Using ocrad = /usr/bin/ocrad
[21573] dbg: FuzzyOcr: Using gocr = /usr/local/bin/gocr
[21573] info: FuzzyOcr: Using pnmnorm = /usr/bin/pnmnorm
[21573] info: FuzzyOcr: Using pnminvert = /usr/bin/pnminvert
[21573] info: FuzzyOcr: Using pamthreshold = /usr/bin/pamthreshold
[21573] info: FuzzyOcr: Using ppmtopgm = /usr/bin/ppmtopgm
[21573] info: FuzzyOcr: Using pamtopnm = /usr/bin/pamtopnm
[21573] info: FuzzyOcr: Using tesseract = /usr/bin/tesseract
[21573] dbg: FuzzyOcr: Threshold[max_hash] = 5
[21573] dbg: FuzzyOcr: Threshold[c] = 5
[21573] dbg: FuzzyOcr: Threshold[s] = 0.01
[21573] dbg: FuzzyOcr: Threshold[w] = 0.01
[21573] dbg: FuzzyOcr: Threshold[h] = 0.01
[21573] dbg: FuzzyOcr: Threshold[cn] = 0.01
[21573] dbg: FuzzyOcr: focr_add_score = 1
[21573] dbg: FuzzyOcr: focr_autodisable_negative_score = -8
[21573] dbg: FuzzyOcr: focr_autodisable_score = 1000
[21573] dbg: FuzzyOcr: focr_autosort_buffer = 10
[21573] dbg: FuzzyOcr: focr_autosort_scanset = 1
[21573] dbg: FuzzyOcr: focr_base_score = 5
[21573] dbg: FuzzyOcr: focr_corrupt_score = 2.5
[21573] dbg: FuzzyOcr: focr_corrupt_unfixable_score = 5
[21573] dbg: FuzzyOcr: focr_counts_required = 2
[21573] dbg: FuzzyOcr: focr_db_hash = /etc/mail/spamassassin/FuzzyOcr.db
[21573] dbg: FuzzyOcr: focr_db_max_days = 35
[21573] dbg: FuzzyOcr:
Re: How to use eval: methods without calling check?
On Sat, Feb 10, 2007 at 06:30:50PM -0600, Robert Nicholson wrote:
> I'd like to programatically call the methods SA uses to check for 8bit charsets and the like but I personally do not care to make use of the rules engine at all. Do I need an instance of PerMsgStatus fully setup before I can call eval: methods programatically?

Generally speaking, yes.

> I'm looking to call things like check_for_faraway_charset, check_for_faraway_charset_in_headers

If you look at the code, those functions clearly want a PMS object.

--
Randomly Selected Tagline: Check book: a book with a unhappy ending.
Re: How to use eval: methods without calling check?
This appears to be working:

    sub handle_potential_faraway {
        my $mail = shift(@_);

        $spamtest->init(1);
        # The eval: methods are called on a PerMsgStatus object, not on $spamtest.
        my $msg = Mail::SpamAssassin::PerMsgStatus->new($spamtest, $mail);

        if ($msg->check_for_faraway_charset()) {
            ignore_mail($mail);
        } elsif ($msg->check_for_faraway_charset_in_headers()) {
            ignore_mail($mail);
        } elsif ($msg->html_charset_faraway()) {
            ignore_mail($mail);
        } elsif ($msg->check_for_mime('mime_faraway_charset')) {
            ignore_mail($mail);
        }
    }

On Feb 10, 2007, at 6:50 PM, Theo Van Dinter wrote:
> On Sat, Feb 10, 2007 at 06:30:50PM -0600, Robert Nicholson wrote:
>> I'd like to programatically call the methods SA uses to check for 8bit charsets and the like but I personally do not care to make use of the rules engine at all. Do I need an instance of PerMsgStatus fully setup before I can call eval: methods programatically?
>
> Generally speaking, yes.
>
>> I'm looking to call things like check_for_faraway_charset, check_for_faraway_charset_in_headers
>
> If you look at the code, those functions clearly want a PMS object.
Re: A New Approach: Find the Ham
Good point, but it will cause trouble UNLESS we find a way to recognize ham 100%. And it must be exactly 100% (99% won't be enough).

As other users said, with the current system, if we can filter 70-80% of the spam, the remaining 20-30% will only be an annoyance, but the ham will be delivered. With the new approach, even if spam is stopped 100%, only 1% undelivered ham will cause a lot of trouble.

Just my 1 Yen :-)

Dan wrote:
> I've developed a new approach to scoring that I want to 1) share with everyone and 2) make into a working system thats as accurate as what I've already built, but easier to use. First, the theory:
>
> SITUATION
> In the beginning, all email was ham. When spam came along, we left the ham alone and targeted the annoyance (spam).
>
> ASSUMPTION
> All messages are ham unless x,y,z score says they're spam.
>
> APPROACH
> Block nothing, then create rules to catch what you don't want. ie, build tests that target the spam, then score the millions of ways spam can occur.
>
> RESULT
> Huge time spent tuning and retuning weights, catching everything in sight (including much ham).
>
> NEW SITUATION
> Ham is now the tiniest minority of all email.
>
> NEW ASSUMPTION
> All messages are spam unless x,y,z score says they're ham.
>
> NEW APPROACH
> Block everything, then create rules to not catch what you do want. ie, build tests that target the spam (keeping all the tests you've already built), then score the thousands of ways ham triggers on those tests.
>
> NEW RESULT
> Spend less time and energy while catching more of what you do want and less of what you don't.
>
> CHALLENGE
> All filtering software is written to score for results that equal spam - catch the bad
>
> SOLUTION
> Make filtering software score for results that equal ham - uncatch the good.
>
> Your thoughts?
>
> Dan
>
> BTW, is there a better forum for this level of question?
Re: How to Scan just incoming not outcoming emails?
| So, my question is: is it possible to set Sendmail / Spam Assassin in | order filters just the receiving emails? If so, please, tell me what | to do. But, please, tell me like a cooking recipe, because I am not | quite experienced with operating systems. Thanks a lot. | | Mario./ Call SA from Procmail rather than from MailScanner. This is what I do. B
What does this mean? FROM_ENDS_IN_NUMS From: ends in numbers
Hi,

I got a spam report back for a test mailing, with a score I had never seen before:

FROM_ENDS_IN_NUMS From: ends in numbers

There wasn't a number in the From or Reply-To address, so I just didn't get this.

Thanks

RC
Query regarding whitelist_to
Hi,

A spam mail is sent to a list of users. If one of those users is in whitelist_to, the score becomes negative, so all the other users also receive that spam mail. How can I avoid this?

Regards,
sushma
Re: Query regarding whitelist_to
On Sun, Feb 11, 2007 at 11:36:56AM +, sushma wrote:
> Spam mail originated to list of user, if one user in whitelist_to then score will be neagtive so all other user also get that spam mail. how to aviod this.

If you scan your mails site-wide (ie: once per message), then you can't avoid this. Your best bet is to remove the whitelist_to and then at delivery for the specific user/address, don't filter based on the scan results. Otherwise, switch to per-user filtering and set whitelist_to for the appropriate user.

--
Randomly Selected Tagline: Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so. - Douglas Adams
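The first suggestion (scan once site-wide, then ignore the verdict for one recipient at delivery time) could look roughly like the sketch below. It is an illustration only: the function, the folder names, and the addresses are hypothetical, and in practice this logic would live in whatever delivery agent or filter the site already uses.

```python
def choose_folder(recipient, flagged_as_spam, exempt_recipients):
    """Pick a delivery folder for one recipient of an already-scanned message.

    The scan ran once for the whole message; recipients who asked not to be
    filtered simply have the verdict ignored when their copy is delivered.
    """
    if recipient.lower() in exempt_recipients:
        return "INBOX"
    return "Spam" if flagged_as_spam else "INBOX"


if __name__ == "__main__":
    exempt = {"abuse@example.com"}          # hypothetical unfiltered address
    for rcpt in ("abuse@example.com", "user@example.com"):
        print(rcpt, "->", choose_folder(rcpt, True, exempt))
```

With this arrangement the score is never lowered by whitelist_to, so the other recipients of the same message still get the normal spam handling.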
Re: spamassassin learning method
John D. Hardin wrote:
> On Sat, 10 Feb 2007, Rizal Ferdiyan wrote:
>> I want to create spamassassin learning method, if my client find any spam for their email they can forward it
>
> The act of forwarding completely changes the message.

Yes, I know that. Forward headers will be added to the email.

> The best way is for them to move the message to a folder that you have access to. What is the mail server that the messages eventually end up on? Sendmail with standard mbox/maildir? Exchange?

My SMTP proxy server serves many client mail servers. My clients build their servers on their own, so those servers use one of two mailbox formats, mbox or maildir. But I don't have access to the clients' mail servers. That is why I want my clients to forward spam to an account that I create, for example [EMAIL PROTECTED], because I can't move the spam on their servers.

Any idea how to solve this problem?

--
Best Regards,
-Rizal Ferdiyan