Re: Another SARE channel with the most used rules available

2006-09-10 Thread [EMAIL PROTECTED]




Hi,


 Any chance of adding support for 3.1.5? (Currently fails with
 "dns: query failed: 5.1.3.saupdates.openprotect.com = NXDOMAIN".)


We've already added a TXT record for the 3.1.5 release, and it should
work now.


cheers,
skar.
-- 
OpenProtect - The email virus/spam filter
http://openprotect.com
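
For illustration: the failing name in the error quoted above looks like the
release number with its components reversed, prepended to the channel name.
A minimal Python sketch of that construction, assuming this reading of the
error message is right (the function name update_query_name is made up for
this example and is not part of sa-update):

```python
def update_query_name(version: str, channel: str) -> str:
    """Build the DNS name inferred from the error message:
    version components reversed, followed by the channel name."""
    reversed_version = ".".join(reversed(version.split(".")))
    return f"{reversed_version}.{channel}"


# Assumed inputs, taken from the message above.
print(update_query_name("3.1.5", "saupdates.openprotect.com"))
```

A channel therefore needs one record per supported release, which would
explain why a new point release fails until the record is added.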





Re: Another SARE channel with the most used rules available

2006-09-10 Thread [EMAIL PROTECTED]




Hi,


 gpg --armor --export KEYID

 the man page is amazingly helpful ;-)

Thanks, that was helpful. The instructions have been updated to use
this technique instead of copying the entire public key ring of root.


cheers,
skar.
-- 
OpenProtect - The email virus/spam filter
http://openprotect.com





Stopping Domain Spam

2006-09-10 Thread gordonnz

I am new to modifying SpamAssassin but recently I have been daily getting
several hundred spam e-mails addressed to anything@mydomain. Can I create
a sort of white list of my proper addresses so that only my properly
addressed messages get through or is there a better way?



Re: Stopping Domain Spam

2006-09-10 Thread jdow

From: gordonnz [EMAIL PROTECTED]


I am new to modifying SpamAssassin but recently I have been daily getting
several hundred spam e-mails addressed to anything@mydomain. Can I create
a sort of white list of my proper addresses so that only my properly
addressed messages get through or is there a better way?


You could, but it would be inefficient beyond belief if you have more
than a small number of users. Even with a small number of users, letting
the domain spam get as far as SpamAssassin is ridiculous. There are
ways to set up your MTA to reject messages that are not addressed to
valid addresses. That may involve a chunk of reading for your
particular MTA, though.

{^_^}



Re: Stopping Domain Spam

2006-09-10 Thread mouss

gordonnz wrote:

I am new to modifying SpamAssassin but recently I have been daily getting
several hundred spam e-mails addressed to anything@mydomain. Can I create
a sort of white list of my proper addresses so that only my properly
addressed messages get through or is there a better way?
  


Configure your MTA to validate recipient addresses and to reject
invalid ones during the SMTP transaction.
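
As a rough illustration of the idea, not of any particular MTA's
configuration: the check amounts to looking up the RCPT TO address in a
list of known mailboxes and answering with a permanent failure when it is
missing. The addresses, the function name and the reply texts below are
assumptions made for this example.

```python
# Hypothetical list of real mailboxes for the domain.
VALID_RECIPIENTS = {"gordon@example.org", "sales@example.org"}


def rcpt_reply(recipient: str) -> str:
    """Return the SMTP reply an MTA could give to RCPT TO:<recipient>."""
    address = recipient.strip().lower()
    if address in VALID_RECIPIENTS:
        return "250 2.1.5 Ok"
    # Rejecting here means the message body is never accepted,
    # so it never reaches the spam filter at all.
    return "550 5.1.1 User unknown"


print(rcpt_reply("anything@example.org"))
```

In practice this is done with the MTA's own recipient maps or access
lists rather than with custom code.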


Re: A Note Regarding DHCP Zone

2006-09-10 Thread mouss

David Cary Hart wrote:

Based upon removal requests, we are seeing a considerable increase in
SA usage. I added some notes to our website recently that I wanted to
share on this list:
  


What's the purpose of duplicating SORBS and other lists? This will only
make unlisting more complicated.



Please Note:The dhcp zone also contains some static generic hosts:
  

So you call it dhcp, but you do not know for sure that these are DHCP IPs.
Yet another arbitrary list? You could also include blars and the like in
your list, if your goal is to make it large.

* Most of these are in mixed dynamic and static ranges. We are
white listing these immediately upon request and verification.
  


Can you detail this process please? Why isn't it automatic?


* Many of these have mis-configured DNS such as inconsistent
forward and reverse DNS or no A record for the host name.
* In all cases, where we have received maps from the ISP, those
will override all other considerations. We are continuing to make
progress in obtaining increasing cooperation from providers in that
regard.
* We add a dynamic range only when we received spam from the
range.

It's a balancing act.  
  


I'd say random art... Thank you for your participation in the Balkanize
The Internet project.







Re: SPF Scores

2006-09-10 Thread Michael Scheidell
Daryl C. W. O'Shea wrote:


 (yes, I looked up the ip address and pulled a txt record from aol, and
 yes, the ips are in the range, and yes, I have gotten SPF_SOFTFAIL from
 domains without any spf records)

 Bug 5077 includes a one line patch to fix this.  It'll be included in
 3.1.6 but is trivial to apply by hand now.

 http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5077


Thanks, I'll apply the patch and revert to the default scores.



-- 
Michael Scheidell, CTO
SECNAP Network Security / www.secnap.com
[EMAIL PROTECTED]  / 1+561-999-5000, x 1131



Re: A Note Regarding DHCP Zone

2006-09-10 Thread David Cary Hart
On Sun, 10 Sep 2006 13:38:37 +0200, mouss [EMAIL PROTECTED] opined:
 David Cary Hart wrote:
  Based upon removal requests, we are seeing a considerable
  increase in SA usage. I added some notes to our website recently
  that I wanted to share on this list:

 
 What's the purpose of duplicating sorbs and other lists? This will
 only make unlisting more complicated.

We are listing a large number of ranges that SORBS does not. There
are numerous operational differences that make both lists useful.
 
  Please Note:The dhcp zone also contains some static generic hosts:

 so you call it dhcp, but you do not know for sure that these are
 DHCP IPs. Yet another arbitrary list? You could also include
 blars and the like in your list, if your goal is to make it large.

ALL DHCP lists include some static, generic hosts. Most of the white
listing that we handle is also listed by SORBS, some by NJABL. The
fact that we choose to discuss the matter is part of the quality
management process. 

  * Most of these are in mixed dynamic and static ranges. We are
  white listing these immediately upon request and verification.

 
 Can you detail this process please? Why isn't it automatic?

It is fully documented on our site. We prefer to provide EXTREMELY
expeditious handling (usually within an hour). Furthermore, we choose
to have a meaningful dialog to help them get unlisted. 

What is automated is that every removal request generates the
necessary data (from dig to whois) to evaluate the issue.

Frankly, we have made enormous progress in a relatively short period
of time. 
 
 
  It's a balancing act.  

 
 I'd say random art... Thank you for your participation in the
 Balkanize The Internet project.
 
Nonsense. The endeavor is well documented and proactively managed. Do
you have anything constructive to add? 

-- 
Our DNSRBL - Eliminate Spam at the Source: http://www.TQMcube.com
   Don't Subsidize Criminals: http://boulderpledge.org


Re: TQMcube Geo Zone config files

2006-09-10 Thread mouss

Andreas Pettersson wrote:
In case anybody is interested, I've compiled a config file for the 
geo zone at TQM http://tqmcube.com/worldzone.php
It might not be of great use, but it is interesting to gather some 
statistics on where the mail comes from.


Files found here
http://anp.ath.cx/tqmcube/



How does/would this compare to using RELAY_COUNTRY?
Are they similar (so one should only use one of them) or complementary?



Re: Juste a little question

2006-09-10 Thread mouss

jdow wrote:

Are you sure that the path from the sender to you involves exactly
one SpamAssassin run, yours? The LAST SpamAssassin run is the one that
gets scored. And many initial setups seem to somehow get SpamAssassin
into the loop twice, which is not good.


Another caveat is running SpamAssassin as different users with 
different configurations (or different Bayes databases).




Odd error (or is it an error)

2006-09-10 Thread Steven Stern
The following appears periodically in my maillog. I think it has to do
with an attempted CPAN upgrade of SpamAssassin that I had to back
out and replace with the Fedora RPM.  In any case, is this anything to
worry about?


Sep 10 11:12:30 mooch spamd[26250]: (?:(?=[\s,]))* matches null string
many times in regex; marked by <-- HERE in m/\G(?:(?=[\s,]))* <-- HERE
\Z/ at /usr/lib/perl5/5.8.8/Text/Wrap.pm line 46.


-- 

  Steve



Fwd: Drink it, forget it !

2006-09-10 Thread Robert Nicholson
Why didn't DATE_IN_FUTURE fire on this message?

Begin forwarded message:

From: "Frederick Harris" [EMAIL PROTECTED]
Date: January 14, 2007 12:07:22 AM CST
To: [EMAIL PROTECTED]
Subject: Drink it, forget it !
X-Spam-Dcc: : grub.camros.com 1113; Body=1 Fuz1=1
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.1.1 (2006-03-10) on grub.camros.com
X-Spam-Level:
X-Spam-Status: Yes, score=12.5 required=0.6 tests=BAYES_99,HTML_90_100,
 HTML_IMAGE_ONLY_04,HTML_MESSAGE,HTML_TITLE_EMPTY,MIME_HTML_MOSTLY,
 RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL,UNPARSEABLE_RELAY autolearn=no
 version=3.1.1
X-Spam-Report:
 *  0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay
 *      lines
 *  0.1 HTML_90_100 BODY: Message is 90% to 100% HTML
 *  1.1 MIME_HTML_MOSTLY BODY: Multipart message mostly text/html MIME
 *  0.0 HTML_MESSAGE BODY: HTML included in message
 *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
 *      [score: 0.9942]
 *  0.2 HTML_TITLE_EMPTY BODY: HTML title contains no text
 *  3.6 HTML_IMAGE_ONLY_04 BODY: HTML: images with 0-400 bytes of words
 *  2.0 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
 *      [124.121.58.8 listed in dnsbl.sorbs.net]
 *  1.9 RCVD_IN_NJABL_DUL RBL: NJABL: dialup sender did non-local SMTP
 *      [124.121.58.8 listed in combined.njabl.org]
Received: (qmail 9695 invoked from network); 4 Sep 2006 05:53:13 -
Received: from ppp-124.121.58.8.revip2.asianet.co.th
 (HELO caching4-true.asianet.co.th) (124.121.58.8)
 by 64.34.193.12 with SMTP; 4 Sep 2006 05:53:13 -
Received: from [61.15.158.107] (helo=[71353437])
 by caching4-true.asianet.co.th with smtp (Exim 4.60 (FreeBSD))
 (envelope-from [EMAIL PROTECTED]) id WDL-C580H-YQ
 for [EMAIL PROTECTED]; Sun, 14 Jan 2007 13:07:22 +0700
Received: from klenske.com (52680622055 [01783113])
 by mail.klenke.de (Qmailv1) with ESMTP id 6E7WB7CKFT5
 for [EMAIL PROTECTED]; Sun, 14 Jan 2007 13:07:22 +0700
Return-Path: [EMAIL PROTECTED]
Envelope-To: [EMAIL PROTECTED]
Delivery-Date: Sun, 14 Jan 2007 13:07:22 +0700
X-Mailer: MIME-tools 4.104 (Entity 4.116)
X-Priority: 3
Message-Id: [EMAIL PROTECTED]
Mime-Version: 1.0
Content-Type: text/plain
Content-Disposition: inline
Content-Transfer-Encoding: binary
Content-Length: 493
Lines: 15

6E4IEHB8UW48PDUQIXUK4
Content-Type: text/plain; charset=windows-1253
Content-Transfer-Encoding: 7bit

6E4IEHB8UW48PDUQIXUK4
Content-Type: text/html; charset=windows-1253
Content-Transfer-Encoding: 7bit

!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"HTMLTITLE/TITLEIMG width=283 height=410 alt="" hspace=1 vspace=1 src=""cid:FFH007M8.4T77IITD.2MJGL6HN.A0G8IKL4">cid:FFH007M8.4T77IITD.2MJGL6HN.A0G8IKL4_csseditor"/HTML
6E4IEHB8UW48PDUQIXUK4--

[[Removed gif attachment]]

Re: Which DB is actually used?

2006-09-10 Thread mouss

Bo Mellberg wrote:

jdow skrev:

From: Bo Mellberg [EMAIL PROTECTED]


I have SA 3.1.4 configured and running on Debian Sarge using apt-get.

I'm finding it hard to know what directory is actually used for the 
bayes-database:


max:~# ls /root/.spamassassin/ -al
total 2344
drwx------  2 root root    4096 Sep  8 07:52 .
drwxr-xr-x 12 root root    4096 Sep  5 09:37 ..
-rw-------  1 root root   12288 Sep  4 14:20 auto-whitelist
-rw-rw-rw-  1 root root       6 Sep  4 14:20 auto-whitelist.mutex
-rw-rw-rw-  1 root root   13992 Sep  4 14:08 bayes.mutex
-rw-------  1 root root  344064 Sep  4 14:05 bayes_seen
-rw-------  1 root root 2605056 Sep  8 07:52 bayes_toks
-rw-r--r--  1 root root    1487 Sep  4 14:20 user_prefs
max:~# ls /home/bosse/.spamassassin/ -al
total 4564
drwx--S--- 2 bosse bosse    4096 Sep  7 10:35 .
drwxr-sr-x 5 bosse bosse    4096 Aug 31 16:19 ..
-rw------- 1 root  bosse   12288 Sep  6 01:06 auto-whitelist
-rw------- 1 root  bosse       6 Sep  6 01:06 auto-whitelist.mutex
-rw-rw-rw- 1 bosse bosse   15282 Sep  6 01:06 bayes.mutex
-rw------- 1 root  bosse   86136 Sep  6 01:06 bayes_journal
-rw------- 1 bosse bosse  339968 Sep  6 01:06 bayes_seen
-rw------- 1 root  bosse 5255168 Sep  6 01:06 bayes_toks
-rw------- 1 root  bosse    1165 Oct  2  2005 user_prefs
max:~# ls /var/spool/exim4/.spamassassin/ -al
total 3424
drwx------ 2 Debian-exim Debian-exim    4096 Sep  8 08:04 .
drwxr-x--- 7 Debian-exim Debian-exim    4096 Sep  5 15:54 ..
-rw------- 1 Debian-exim Debian-exim 1298432 Sep  8 08:04 auto-whitelist
-rw-rw-rw- 1 Debian-exim Debian-exim       6 Sep  4 14:15 auto-whitelist.mutex
-rw-rw-rw- 1 Debian-exim Debian-exim       6 Sep  4 14:15 bayes.mutex
-rw------- 1 Debian-exim Debian-exim   64704 Sep  8 08:04 bayes_journal
-rw------- 1 Debian-exim Debian-exim  319488 Sep  8 08:04 bayes_seen
-rw------- 1 Debian-exim Debian-exim 2629632 Sep  8 08:04 bayes_toks
-rw-r--r-- 1 Debian-exim Debian-exim    1175 Nov  1  2005 user_prefs

As you can see there are three directories which are all quite 
recently changed. How can I make sure that only one directory is used?


I would like to make SA site-wide, but the filtering is working 
really well right now, so I'm afraid I'll break something. BTW, the 
user bosse is my own account used for my email.


* I just performed sa-learn --sync -D as root.
* I've never touched the exim directory, still it has the latest 
change date.


Thanks in advance.

/Bo


Bo - I can't particularly help you with the single site-wide database
thing. It seems you have a bit of a mishmash that, depending on what
you have done, may actually be acting the way you want it to act. It
looks like you might have played with training or tests as bosse
and root and otherwise have everything working on the exim4 global
database. Always test and train as the user that the MTA uses for
filtering the email. Other tests and training are meaningless.

If you do not have many users at all, dozens or fewer, then do
consider using per-user BAYES. It CAN provide the users with a better
anti-spam experience. The reasoning behind this is that one user's
spam is almost always going to be some other user's ham. If you have
hundreds then there might be a good reason for a single BAYES database.
By the time you're into thousands you're using virtual accounts and
a global database may be required. But it won't provide quite the
pinpoint accuracy of a per-user database.

{^_^}




Thanks for this info,

It seems like the exim-users database is being touched regularly, so 
I'm guessing that it has been set up by apt-get in some 
auto-learning state.


I have earlier trained spam and ham as user bosse, which is why 
there is a working db there as well.


As I am the only user on my system, it really doesn't matter if I use 
site-wide or not, but rather how I invoke sa-learn.


Let's say I remove the databases for bosse and root. Is this the 
proper way to invoke sa-learn:


1. Log on as user bosse
2. sa-learn --showdots --sync --dbpath /var/spool/exim4/.spamassassin 
--spam /home/bosse/Maildir/.MissedSpam/cur


If I set up a cron job to do the above, I could just toss missed spam 
into the MissedSpam folder, right?


One way is to use a MySQL database and have something like this in your 
configuration:


## global bayes db
bayes_sql_override_username spamassassin

Then you won't have to worry about who runs the filter and who trains it.

See the wiki for how to migrate. If you migrate, connect to your MySQL 
database and update the user field to match the one used in the 
configuration (spamassassin above).
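
A quick way to see which of the three directories listed earlier is being
written to is to compare the modification times of their bayes_toks
files. The following is a minimal Python sketch of that check; the helper
name newest_bayes_db is invented for this example, and the directory
paths are the ones from the listing above.

```python
from pathlib import Path


def newest_bayes_db(candidates):
    """Return the directory whose bayes_toks file was modified most
    recently, or None if no candidate contains one."""
    found = []
    for directory in candidates:
        toks = Path(directory) / "bayes_toks"
        if toks.is_file():
            found.append((toks.stat().st_mtime, str(directory)))
    return max(found)[1] if found else None


if __name__ == "__main__":
    # Paths taken from the directory listings in this thread.
    print(newest_bayes_db([
        "/root/.spamassassin",
        "/home/bosse/.spamassassin",
        "/var/spool/exim4/.spamassassin",
    ]))
```

A recent timestamp only shows that something wrote to the database, for
example auto-learning or a manual sa-learn run, so it is a hint rather
than proof of which one the MTA uses.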




Re: site-wide config?

2006-09-10 Thread mouss

Russell Jones wrote:

Sorry if this is covered somewhere in the documentation, and if so can someone 
be nice enough to point it to me :) I can't seem to locate it.

I would like to set spamassassin to use a site-wide configuration, so that when 
I tell it to sa-learn, it will apply what it learns to every single email 
account on the server.

If someone can point me to the documentation and/or examples of how to set 
this, I would be very grateful.

Thanks!
  


The easiest way is to use MySQL and set:

## global bayes db
bayes_sql_override_username spamassassin




Re: TQMcube Geo Zone config files

2006-09-10 Thread Andreas Pettersson

mouss wrote:


How does/would this compare to using RELAY_COUNTRY?
are they similar (so one should only use one of them) or complementary?




I don't know. I haven't used RELAY_COUNTRY, but now that I'm aware of 
its existence I'll have a look at it :)



Regards,
Andreas




Re: TQMcube Geo Zone config files

2006-09-10 Thread Andreas Pettersson

Andreas Pettersson wrote:



I don't know. I haven't used RELAY_COUNTRY, but now that I'm aware of 
its existence I'll have a look at it :)




Ok, I've had a quick look now. RelayCountry presents the country code of 
the last relay either as a separate header or as the _RELAYCOUNTRY_ 
header markup. When looking at only one mail, it wouldn't make any 
difference whether you use TQM or RelayCountry, but I'm fond of 
statistics, and since I already have tools for counting how much ham and 
spam each rule has triggered on, my vote goes to TQM.


rule           spam   ham
TQMCUBE_W_US  18427   428
TQMCUBE_W_FR   5605    52
TQMCUBE_W_ES   5040     7
TQMCUBE_W_KR   3794     2
TQMCUBE_W_CN   3600     4
TQMCUBE_W_PL   3235     3
TQMCUBE_W_BR   2582     1
TQMCUBE_W_DE   2184   149
TQMCUBE_W_IT   2165     9
...

Regards,
Andreas
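
Raw hit counts like the ones above are easier to compare as a share of
spam per rule. A minimal Python sketch, using three rows copied from the
table; the helper name spam_share and the choice of rows are just for
illustration and are not part of Andreas's tools.

```python
# Counts copied from three rows of the table above: (spam hits, ham hits).
counts = {
    "TQMCUBE_W_US": (18427, 428),
    "TQMCUBE_W_ES": (5040, 7),
    "TQMCUBE_W_DE": (2184, 149),
}


def spam_share(spam: int, ham: int) -> float:
    """Fraction of a rule's hits that were spam (0.0 if it never fired)."""
    total = spam + ham
    return spam / total if total else 0.0


# Rules with the highest spam share first.
ranked = sorted(counts.items(), key=lambda item: spam_share(*item[1]), reverse=True)

for rule, (spam, ham) in ranked:
    print(f"{rule}  {spam_share(spam, ham):.1%}")
```

A rule with many hits but a noticeable ham share (US or DE here) is a
weaker spam indicator than one that almost never hits ham.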



Re: Drink it, forget it !

2006-09-10 Thread Robert Nicholson
It seems to have decided that date_diff is 0 for some reason in
check_for_shifted_date.

On Sep 10, 2006, at 11:42 AM, Robert Nicholson wrote:

 Why didn't DATE_IN_FUTURE fire on this message?

Re: Drink it, forget it ! .... bug in _check_date_diff

2006-09-10 Thread Robert Nicholson
I'm guessing what happened here was that it took the first Received
header... which is the same as the Date: header. What I'd rather it
take, though, is the header closest to me.

So instead of using

 Received: from [61.15.158.107] (helo=[71353437])
  by caching4-true.asianet.co.th with smtp (Exim 4.60 (FreeBSD))
  (envelope-from [EMAIL PROTECTED]) id WDL-C580H-YQ
  for [EMAIL PROTECTED]; Sun, 14 Jan 2007 13:07:22 +0700
 Received: from klenske.com (52680622055 [01783113])
  by mail.klenke.de (Qmailv1) with ESMTP id 6E7WB7CKFT5
  for [EMAIL PROTECTED]; Sun, 14 Jan 2007 13:07:22 +0700

it should use

 Received: (qmail 9695 invoked from network); 4 Sep 2006 05:53:13 -

It seems the code only considers whether the last header is the same; it
doesn't exclude others as well. I think it could exclude all headers
that have a diff of 0.

This is where it all goes wrong:

    @diffs = sort { abs($a) <=> abs($b) } @diffs;
    $self->{date_diff} = $diffs[0];

On Sep 10, 2006, at 12:32 PM, Robert Nicholson wrote:

 It seems to have decided that date_diff is 0 for some reason in
 check_for_shifted_date

Re: Odd error (or is it an error)

2006-09-10 Thread Theo Van Dinter
On Sun, Sep 10, 2006 at 11:14:13AM -0500, Steven Stern wrote:
 The following appears periodically in my maillog. I think it has to do
 with an attempted CPAN upgrade of SpamAssassin that I had to back
 out and replace with the Fedora RPM.  In any case, is this anything to
 worry about?

It's a bug in Text::Wrap:
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5056

-- 
Randomly Generated Tagline:
Make like a drum and beat it!




Re: Drink it, forget it ! .... bug in _check_date_diff

2006-09-10 Thread Robert Nicholson
So if I use the following instead, it then fires the rule:

    # use the date with the smallest absolute difference
    # (experimentally, this results in the fewest false positives)
    @diffs = sort { abs($a) <=> abs($b) } @diffs;

    # pick the first one that isn't 0
    foreach my $diff (@diffs) {
      next if $diff == 0;
      $self->{date_diff} = $diff;
      return;
    }
    $self->{date_diff} = 0;
    #$self->{date_diff} = $diffs[0];

This looks to be something spammers are deliberately working around, as
how could you possibly get two Received headers with the same date and
time, to the second?

On Sep 10, 2006, at 12:53 PM, Robert Nicholson wrote:

 I'm guessing what happened here was that it took the first Received
 header... which is the same as the Date: header. What I'd rather it
 take, though, is the header closest to me.
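
The selection logic of the workaround in the message above can be tried
outside SpamAssassin. This is a Python rendering of the same idea, not
SpamAssassin's own code: the function name pick_date_diff is invented
here, and the differences are assumed to be in seconds between the Date:
header and each Received: header.

```python
def pick_date_diff(diffs):
    """Return the non-zero difference with the smallest absolute value,
    or 0 when every difference is 0 (or the list is empty)."""
    for diff in sorted(diffs, key=abs):
        if diff != 0:
            return diff
    return 0


# Two forged Received headers match the Date: header exactly (diff 0);
# the genuine one is months away. The numbers are made up for illustration.
print(pick_date_diff([0, 0, 11_000_000]))
```

Ignoring exact zeros keeps a forged Received header that copies the
Date: value from masking the difference seen by the real relay.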

Re: A Note Regarding DHCP Zone

2006-09-10 Thread mouss

David Cary Hart wrote:

We are listing a large number of ranges that SORBS does not. There
are numerous operational differences that make both lists useful.
  


It's OK if the lists contain different entries. My concern is when one
list includes another. You can set up aggregate lists if you want, but
that is not what I am talking about.


The issue is that if a list includes (most or all of) another list, then
we need to add _OVERLAP_ style rules to keep a single reason from getting
too high a score. If, on the other hand, the lists are kept independent,
then finding an IP on two (or more) of them is meaningful.


The other issue is the one I cited: the unlisting process gets more
complicated, which isn't good.



ALL DHCP lists include some static, generic hosts. Most of the white
listing that we handle is also listed by SORBS, some by NJABL. The
fact that we choose to discuss the matter is part of the quality
management process. 

  


I know I'm nitpicking, but I find the name misleading. Why not choose a 
different name? (I don't like dul or duhl either.)




* Most of these are in mixed dynamic and static ranges. We are
white listing these immediately upon request and verification.
  
  

Can you detail this process please? Why isn't it automatic?



It is fully documented on our site. We prefer to provide EXTREMELY
expeditious handling (usually within an hour). Furthermore, we choose
to have a meaningful dialog to help them get unlisted. 


What is automated is that every removal request generates the
necessary data (from dig to whois) to evaluate the issue.

Frankly, we have made enormous progress in a relatively short period
of time. 
  


As long as you keep doing so, I can only congratulate you. But if one 
day you don't have the time to manage it this way, then please do 
something to handle that.


It's a balancing act.  
  
  

I'd say random art... Thank you for your participation in the
Balkanize The Internet project.


Nonsense. 


Don't take it badly. It was deliberately provocative.

The endeavor is well documented and proactively managed. Do
you have anything constructive to add? 

  


My main concern is list dependencies. A well-known example is the 
overlap between the various URIBL lists. Sometimes these are enough to 
give a huge score, so one needs a lot of tuning and tweaking to get a 
more realistic score. This is one reason I don't like list inclusions, 
except for things like what XBL does: different results depending on 
the source.


regards,
mouss
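
To make the overlap concern above concrete: when two lists largely
contain the same entries, a hit on both adds two scores for what is
really one reason, and an overlap adjustment has to take part of it back.
A small Python sketch follows; the rule names, scores and the adjustment
value are invented for the example and are not taken from any
SpamAssassin rule set.

```python
# Hypothetical per-list scores.
SCORES = {"RCVD_IN_LIST_A": 2.0, "RCVD_IN_LIST_B": 1.9}

# Hypothetical adjustment applied when both overlapping lists hit.
OVERLAP_ADJUSTMENTS = {frozenset({"RCVD_IN_LIST_A", "RCVD_IN_LIST_B"}): -1.5}


def total_score(hits):
    """Sum the scores of the rules that hit, then apply every overlap
    adjustment whose lists all hit."""
    hits = set(hits)
    score = sum(SCORES[name] for name in hits)
    for lists, adjustment in OVERLAP_ADJUSTMENTS.items():
        if lists <= hits:
            score += adjustment
    return round(score, 2)


print(total_score(["RCVD_IN_LIST_A"]))
print(total_score(["RCVD_IN_LIST_A", "RCVD_IN_LIST_B"]))
```

With independent lists no adjustment table is needed, which is the
tuning burden the message above argues against.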



Re: Fwd: Drink it, forget it !

2006-09-10 Thread Matt Kettler
At a casual guess, I'd say that the UNPARSEABLE_RELAY might be related.

Run it through with -D on and see which Received: headers are unparseable.


Robert Nicholson wrote:
 Why didn't DATE_IN_FUTURE fire on this message?

 Begin forwarded message:

 From: Frederick Harris [EMAIL PROTECTED]
 Date: January 14, 2007 12:07:22 AM CST
 To: [EMAIL PROTECTED]
 Subject: Drink it, forget it !
 X-Spam-Dcc: : grub.camros.com 1113; Body=1 Fuz1=1
 X-Spam-Flag: YES
 X-Spam-Checker-Version: SpamAssassin 3.1.1 (2006-03-10) on
 grub.camros.com
 X-Spam-Level:
 X-Spam-Status: Yes, score=12.5 required=0.6
 tests=BAYES_99,HTML_90_100,
 HTML_IMAGE_ONLY_04,HTML_MESSAGE,HTML_TITLE_EMPTY,MIME_HTML_MOSTLY,
 RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL,UNPARSEABLE_RELAY autolearn=no
 version=3.1.1



Re: Drink it, forget it !

2006-09-10 Thread Robert Nicholson

The last two received headers in the message looked forged?

[12657] dbg: received-header: parsed as [ ip=61.15.158.107 rdns=cm61-15-158-107.hkcable.com.hk helo=!71353437! by=caching4-true.asianet.co.th ident= [EMAIL PROTECTED] intl=0 id=WDL-C580H-YQ auth= ]
[12657] dbg: received-header: relay 61.15.158.107 trusted? no internal? no
[12657] dbg: received-header: unknown format: from klenske.com (52680622055 [01783113]) by mail.klenke.de (Qmailv1) with ESMTP id 6E7WB7CKFT5 for [EMAIL PROTECTED]; Sun, 14 Jan 2007 13:07:22 +0700
[12657] dbg: received-header: unparseable: from klenske.com (52680622055 [01783113]) by mail.klenke.de (Qmailv1) with ESMTP id 6E7WB7CKFT5 for [EMAIL PROTECTED]; Sun, 14 Jan 2007 13:07:22 +0700


On Sep 10, 2006, at 1:30 PM, Matt Kettler wrote:


At a casual guess, I'd say that the UNPARSEABLE_RELAY might be related.

Run it through with -D on and see which Received: headers are  
unparseable.



Robert Nicholson wrote:

Why didn't DATE_IN_FUTURE fire on this message?

Begin forwarded message:


*From: *Frederick Harris [EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
*Date: *January 14, 2007 12:07:22 AM CST
*To: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
*Subject: **Drink it, forget it !*
*X-Spam-Dcc: *: grub.camros.com 1113; Body=1 Fuz1=1
*X-Spam-Flag: *YES
*X-Spam-Checker-Version: *SpamAssassin 3.1.1 (2006-03-10) on
grub.camros.com
*X-Spam-Level: *
*X-Spam-Status: *Yes, score=12.5 required=0.6
tests=BAYES_99,HTML_90_100,
HTML_IMAGE_ONLY_04,HTML_MESSAGE,HTML_TITLE_EMPTY,MIME_HTML_MOSTLY,
RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL,UNPARSEABLE_RELAY autolearn=no
version=3.1.1
*
*




Re: Drink it, forget it ! .... bug in _check_date_diff

2006-09-10 Thread Daryl C. W. O'Shea

Robert Nicholson wrote:

This looks to be something Spammers are deliberately working around as 
how could you possibly get two received headers with the same date, time 
to the second?


That's like saying how could you possibly get two received headers with 
the same date, time to the minute or hour.  Why not?


A good portion of my mail, both ham and spam, is passed through a series 
of relays within a second.



Daryl


Re: Drink it, forget it ! .... bug in _check_date_diff

2006-09-10 Thread Robert Nicholson
Well, either way. Assuming that the lowest numbered date diff  
represents the real receive time is naive at best.


On Sep 10, 2006, at 2:03 PM, Daryl C. W. O'Shea wrote:


Robert Nicholson wrote:

This looks to be something Spammers are deliberately working  
around as how could you possibly get two received headers with the  
same date, time to the second?


That's like saying how could you possibly get two received headers  
with the same date, time to the minute or hour.  Why not?


A good portion of my mail, both ham and spam, is passed through a  
series of relays within a second.



Daryl


Re: Drink it, forget it ! .... bug in _check_date_diff

2006-09-10 Thread Daryl C. W. O'Shea
I haven't read any of the rest of this thread, but I'll respond to the 
latest...


Robert Nicholson wrote:
Well, either way. Assuming that the lowest numbered date diff represents 
the real receive time is naive at best.


As is assuming that the rule assumes that the times are real.

Comparing the Date: header and the time in a received header that you 
can trust would FP all over the place any time there was a delay (down 
MX, etc.) in legit mail.
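
To make the false-positive concern concrete, here is a rough Python sketch (not SpamAssassin's actual `_check_date_diff` code). The function name `date_diff_hours` and both Received timestamps are made up for illustration; only the January 2007 Date: value comes from the forwarded spam.

```python
from email.utils import parsedate_to_datetime


def date_diff_hours(date_header, received_stamp):
    """Hours by which Date: is ahead of (+) or behind (-) a Received: timestamp."""
    sent = parsedate_to_datetime(date_header)
    received = parsedate_to_datetime(received_stamp)
    return (sent - received).total_seconds() / 3600.0


# Hypothetical legit mail: written Friday evening, queued while the MX was
# down, and only accepted by the trusted relay on Sunday afternoon.
delayed_ham = date_diff_hours("Fri, 08 Sep 2006 18:00:00 -0400",
                              "Sun, 10 Sep 2006 14:03:00 -0400")

# The forwarded spam: Date: claims January 2007; the trusted Received time
# used here is an assumed value for the day the thread was posted.
future_spam = date_diff_hours("Sun, 14 Jan 2007 00:07:22 -0600",
                              "Sun, 10 Sep 2006 13:30:00 -0400")

print(f"delayed ham: {delayed_ham:+.1f} h, future spam: {future_spam:+.1f} h")
```

The delayed ham comes out roughly 44 hours "in the past" even though nothing about it is forged, which is why a check against a single trusted Received header alone would misfire on ordinary queue delays.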



Daryl


Fwd: פריצת דרך מאתגרת

2006-09-10 Thread Robert Nicholson
Why didn't foreign charset rules catch this?

Begin forwarded message:

From: [EMAIL PROTECTED]
Date: September 10, 2006 2:17:51 PM CDT
To: [EMAIL PROTECTED]
Subject: פריצת דרך מאתגרת
X-Spam-Dcc: : grub.camros.com 1113; Body=5 Fuz1=5 Fuz2=3
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.1.1 (2006-03-10) on grub.camros.com
X-Spam-Level: *
X-Spam-Status: Yes, score=5.7 required=0.6 tests=BAYES_95,FRONTPAGE,
 HTML_90_100,HTML_IMAGE_RATIO_02,HTML_MESSAGE,HTML_TITLE_SUBJ_DIFF,
 MIME_HTML_ONLY,NO_REAL_NAME,UNPARSEABLE_RELAY autolearn=no version=3.1.1
X-Spam-Report:
 *  1.0 NO_REAL_NAME From: does not include a real name
 *  0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay
 *      lines
 *  0.5 HTML_IMAGE_RATIO_02 BODY: HTML has a low ratio of text to image
 *      area
 *  0.1 HTML_90_100 BODY: Message is 90% to 100% HTML
 *  0.0 HTML_MESSAGE BODY: HTML included in message
 *  3.0 BAYES_95 BODY: Bayesian spam probability is 95 to 99%
 *      [score: 0.9667]
 *  0.0 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
 *  0.9 FRONTPAGE RAW: Frontpage used to create the message
 *  0.3 HTML_TITLE_SUBJ_DIFF HTML_TITLE_SUBJ_DIFF
Received: (qmail 10557 invoked from network); 10 Sep 2006 18:17:08 -
Received: from  (HELO kini12.com) (208.53.131.241) by 64.34.193.12 with SMTP; 10 Sep 2006 18:17:08 -
Message-Id: [EMAIL PROTECTED]
Mime-Version: 1.0
Content-Type: text/html; charset="windows-1255"
Content-Transfer-Encoding: quoted-printable
Lines: 124

להגיע למיליון לקוחות ?גם אתם רוצים    נא לחצו כאן מתנצלים אם גרמנו להפרעה, להסרה מרשימת הדיוורנמען נכבד, אנו  לחץלהסרה לחצו כאן

Re: פריצת דרך מאתגרת

2006-09-10 Thread Robert Nicholson
Windows-1255

and apparently with locales

  DB<6> x @locales
0  'en'
1  'th'
2  'it'
3  'en_US'

Mail::SpamAssassin::Locales::is_charset_ok_for_locales($1, @locales)

returns true

Mail::SpamAssassin::Locales::is_charset_ok_for_locales(/home/robert/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Locales.pm:91):
91:       return 1 if ($cs =~ /^WINDOWS/);      # argh, Windows

what?

On Sep 10, 2006, at 4:38 PM, Robert Nicholson wrote:

Why didn't foreign charset rules catch this?

Begin forwarded message:

From: [EMAIL PROTECTED]
Date: September 10, 2006 2:17:51 PM CDT
To: [EMAIL PROTECTED]
Subject: פריצת דרך מאתגרת
X-Spam-Dcc: : grub.camros.com 1113; Body=5 Fuz1=5 Fuz2=3
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.1.1 (2006-03-10) on grub.camros.com
X-Spam-Level: *
X-Spam-Status: Yes, score=5.7 required=0.6 tests=BAYES_95,FRONTPAGE,
 HTML_90_100,HTML_IMAGE_RATIO_02,HTML_MESSAGE,HTML_TITLE_SUBJ_DIFF,
 MIME_HTML_ONLY,NO_REAL_NAME,UNPARSEABLE_RELAY autolearn=no version=3.1.1
X-Spam-Report:
 *  1.0 NO_REAL_NAME From: does not include a real name
 *  0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay
 *      lines
 *  0.5 HTML_IMAGE_RATIO_02 BODY: HTML has a low ratio of text to image
 *      area
 *  0.1 HTML_90_100 BODY: Message is 90% to 100% HTML
 *  0.0 HTML_MESSAGE BODY: HTML included in message
 *  3.0 BAYES_95 BODY: Bayesian spam probability is 95 to 99%
 *      [score: 0.9667]
 *  0.0 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
 *  0.9 FRONTPAGE RAW: Frontpage used to create the message
 *  0.3 HTML_TITLE_SUBJ_DIFF HTML_TITLE_SUBJ_DIFF
Received: (qmail 10557 invoked from network); 10 Sep 2006 18:17:08 -
Received: from  (HELO kini12.com) (208.53.131.241) by 64.34.193.12 with SMTP; 10 Sep 2006 18:17:08 -
Message-Id: [EMAIL PROTECTED]
Mime-Version: 1.0
Content-Type: text/html; charset="windows-1255"
Content-Transfer-Encoding: quoted-printable
Lines: 124

להגיע למיליון לקוחות ?גם אתם רוצים    נא לחצו כאן מתנצלים אם גרמנו להפרעה, להסרה מרשימת הדיוורנמען נכבד, אנו  לחץלהסרה לחצו כאן
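
For anyone who wants to see the effect of that one line in isolation, here is a small Python model of it. It reproduces only the `^WINDOWS` shortcut quoted from Locales.pm line 91; the per-locale table `HYPOTHETICAL_LOCALE_CHARSETS` is invented for the example, and the upper-casing step is an assumption inferred from the debugger result (a lower-case `windows-1255` would not otherwise match).

```python
import re

# Invented table for illustration only; the real mapping lives in Locales.pm.
HYPOTHETICAL_LOCALE_CHARSETS = {
    "en": {"ISO-8859-1", "US-ASCII"},
    "th": {"TIS-620"},
    "it": {"ISO-8859-1"},
}


def is_charset_ok_for_locales(charset, locales):
    """Toy model of the check; not the real SpamAssassin implementation."""
    cs = charset.upper()  # assumed normalisation step
    # Mirrors the quoted line: any WINDOWS-* charset is accepted before the
    # locale list is even consulted.
    if re.match(r"^WINDOWS", cs):
        return True
    for locale in locales:
        language = locale.split("_")[0]  # 'en_US' -> 'en'
        if cs in HYPOTHETICAL_LOCALE_CHARSETS.get(language, set()):
            return True
    return False


locales = ["en", "th", "it", "en_US"]
print(is_charset_ok_for_locales("windows-1255", locales))  # Hebrew code page
print(is_charset_ok_for_locales("koi8-r", locales))        # not in the toy table
```

With that shortcut in place, windows-1255 is treated as acceptable for any locale list, so a charset-based rule has nothing to fire on.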

Re: LOG: Re: Drink it, forget it ! .... bug in _check_date_diff

2006-09-10 Thread Daryl C. W. O'Shea

Robert Nicholson wrote:
If you converted all times to GMT and compared them against now, and 
they were > now, how often would that be FPing?


I suppose that the spam hit rate would go up a little for the 
DATE_IN_FUTURE_* rules, while the ham hit rate (caused by the thousands 
of people who don't know how to set their system's time) would stay the 
same.


For DATE_IN_PAST_* rules the spam hit rate would probably go up a little 
too, but the ham hit rate would skyrocket.



Daryl


Re: LOG: Re: Drink it, forget it ! .... bug in _check_date_diff

2006-09-10 Thread Robert Nicholson
I personally am probably not interested in mail from people who don't  
know how to set their system's time, but you could implement it using  
a threshold. To me that's a lot better than assuming an n hour  
difference between Received: and Date: etc., which the sender can easily  
forge, unless the Received header it's checking is one that's  
guaranteed to be outside of the sender's network.
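
Roughly what I have in mind, as a Python sketch rather than a patch. The function name `date_in_future` and the three-hour default are placeholders I picked for the example, not anything SpamAssassin defines.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def date_in_future(date_header, now, threshold=timedelta(hours=3)):
    """True if Date:, converted to UTC, is more than `threshold` ahead of `now`."""
    sent = parsedate_to_datetime(date_header)
    if sent.tzinfo is None:
        # A "-0000" zone parses as a naive datetime; treat it as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return sent.astimezone(timezone.utc) - now > threshold


# Fixed "now" so the example is repeatable; a scanner would use the time of
# scanning, e.g. datetime.now(timezone.utc).
now = datetime(2006, 9, 10, 22, 11, tzinfo=timezone.utc)

print(date_in_future("Sun, 14 Jan 2007 00:07:22 -0600", now))  # months ahead
print(date_in_future("Sun, 10 Sep 2006 19:00:00 -0500", now))  # small clock skew
```

The threshold is what absorbs ordinary clock skew; how large it should be is a tuning question I have no data on.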


On Sep 10, 2006, at 5:11 PM, Daryl C. W. O'Shea wrote:


Robert Nicholson wrote:
If you converted all times to GMT and compared them against now,  
and they were > now, how often would that be FPing?


I suppose that the spam hit rate would go up a little for the  
DATE_IN_FUTURE_* rules, while the ham hit rate (caused by the  
thousands of people who don't know how to set their system's time)  
would stay the same.


For DATE_IN_PAST_* rules the spam hit rate would probably go up a  
little too, but the ham hit rate would skyrocket.



Daryl


Re: SPF Scores

2006-09-10 Thread BG Mahesh
On 9/9/06, Daryl C. W. O'Shea [EMAIL PROTECTED] wrote:
Michael Scheidell wrote:

Bug 5077 includes a one line patch to fix this. It'll be included in
3.1.6 but is trivial to apply by hand now.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5077

I seem to have multiple versions of SPF.pm on my box,

./perl5/site_perl/5.8.6/Mail/SpamAssassin/Plugin/SPF.pm
./perl5/site_perl/5.8.6/i386-linux-thread-multi/Net/DNS/RR/SPF.pm
./perl5/vendor_perl/5.8.6/Mail/SpamAssassin/Plugin/SPF.pm

Should I be deleting
perl5/vendor_perl/5.8.6/Mail/SpamAssassin/Plugin/SPF.pm, which seems to
be very old when compared to
perl5/site_perl/5.8.6/Mail/SpamAssassin/Plugin/SPF.pm ?

-- 
--
B.G. Mahesh
http://www.greynium.com/
http://www.oneindia.in/
http://www.click.in/ - Free Indian Classifieds