Re: Hmm - a server I manage is triggering Botnet

2007-01-28 Thread Thomas Bolioli

John Rudd wrote:
If you think there is a case where Botnet breaks down for 
multiple/virtual mail domains, where DNS and rDNS are properly set up, 
put your money where your mouth is and give a real world example.  
Give the IP address(es), and the mail domains that go with them that 
you think will have a problem.

I have, to this list and you never responded... See below.

Alumni connections is a forwarder service. uptilt is sending email for 
nashbar.com


Message-ID: [EMAIL PROTECTED]
Date: Sun, 31 Dec 2006 09:29:46 -0500
From: Thomas Bolioli [EMAIL PROTECTED]
User-Agent: Thunderbird 2.0b1 (Macintosh/20061206)
MIME-Version: 1.0
To: Spamassassin Users List users@spamassassin.apache.org
Subject: Re: Botnet 0.7 Plugin is available
References: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
In-Reply-To: [EMAIL PROTECTED]
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

See below for content. I forgot to send this to the list.
John Rudd wrote:

Thomas Bolioli wrote:

It seems to have an issue with mail sent through forwarders like 
alumni accounts and one mail type systems. I am sending you a note 
off line with the details.



No... it doesn't look that way at all.

If you read the spam report headers, it clearly states what the 
problem is with _BOTH_ of the messages you sent me:


   *  0.1 BOTNET_BADDNS Relay doesn't have full circle DNS

BOTNET is triggering because the relay which is submitting the message 
to you doesn't have full circle DNS (the hostname returned by the PTR 
lookup doesn't resolve back to the IP address that is submitting the 
message).  It's not because BOTNET has a problem with mail forwarding 
services (not indicated at all by the first message you sent me), nor 
is it because it's a server initiated message (the second message; the 
presence of BOTNET_SERVERWORDS should have scored -0.1, and would have 
served to prevent BOTNET_CLIENT from triggering ... which it did: 
BOTNET_CLIENT doesn't show up in that message's spam report).


In that regard, neither of these is a false positive.  BOTNET is told 
to flag messages that have Bad DNS configurations, and these two 
mail relays have bad dns configurations, so BOTNET flagged them.
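
The "full circle DNS" test described above (PTR lookup, then a forward lookup of the returned name, then a comparison with the original IP) can be sketched in a few lines. This is only an illustration in Python, not Botnet's own Perl code; the function name and the injectable lookup parameters are made up for the example so that it can be run without touching the network.

```python
import socket


def full_circle_dns(ip, reverse_lookup=None, forward_lookup=None):
    """Return (ok, ptr_name); ok is True when the PTR name resolves back to ip.

    The two lookup callables are hypothetical hooks for offline testing.
    By default the system resolver is used.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        forward_lookup = lambda name: socket.gethostbyname_ex(name)[2]

    try:
        ptr_name = reverse_lookup(ip)
    except OSError:
        return False, None       # no PTR record at all
    try:
        addresses = forward_lookup(ptr_name)
    except OSError:
        return False, ptr_name   # PTR name does not resolve forward
    # Full circle only if the original IP is among the forward answers.
    return ip in addresses, ptr_name
```

Calling full_circle_dns("198.212.10.108") with the default resolvers would repeat the check from whatever host runs it, so the answer depends on that host's resolver and on the DNS data at the time of the query.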


I can't tell you if the messages themselves were spam or not... the 
2nd one definitely looks like spam to me, but the 
sender/recipient/subject of the first one doesn't look like spam.  If 
you say that they're ham, then I would give you a few courses of action:



1) add the domain name in a botnet_pass_domains entry in Botnet.cf:

For the first message:

 * [botnet_baddns,ip=198.212.10.108,rdns=permemail05.alumniconnections.com]


becomes:

botnet_pass_domains alumniconnections\.com

For the second message:

 * [botnet_baddns,ip=208.66.204.41,rdns=mail31.uptilt.com]

becomes:

botnet_pass_domains uptilt\.com


2) for the second message, either do something like the above, or add 
the IP address, in the botnet report, to Botnet.cf as a botnet_pass_ip:


For the first message:

 * [botnet_baddns,ip=198.212.10.108,rdns=permemail05.alumniconnections.com]


becomes:

botnet_pass_ip ^198\.212\.10\.108$

For the second message:

 * [botnet_baddns,ip=208.66.204.41,rdns=mail31.uptilt.com]

becomes:

botnet_pass_ip ^208\.66\.204\.41$


3) send email to abuse@ hostmaster@ and postmaster@ each of the 
domains, showing them the headers of the message they sent you, 
including the spam report headers, and informing them that their DNS 
misconfigurations make their mail servers appear to be potential spam 
sources, and that they should fix this by having the hostnames 
returned by any of their PTR records actually resolve back to the IP 
address that the PTR record is attached to.



IMO: the 3rd one is the thing that should happen (the mail servers 
should have their DNS configurations fixed).  I'll think about adding 
alumniconnections.com to the centrally distributed Botnet.cf.  But, 
given the content of the message from uptilt.com, I really don't think 
I'd add them to the centrally distributed Botnet.cf.
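
The pass-list entries above are regular expressions, which is why the dots are escaped and the IP patterns are anchored with ^ and $. The sketch below shows what those patterns do and do not match. How Botnet applies them internally is an assumption here: the sketch treats botnet_pass_ip as a pattern searched against the IP string and botnet_pass_domains as a pattern that must match a whole trailing part of the rDNS name. The function name is hypothetical.

```python
import re

# Patterns copied from the Botnet.cf suggestions above.
PASS_IPS = [r"^198\.212\.10\.108$", r"^208\.66\.204\.41$"]
PASS_DOMAINS = [r"alumniconnections\.com", r"uptilt\.com"]


def is_exempt(ip, rdns, pass_ips=PASS_IPS, pass_domains=PASS_DOMAINS):
    """Return True if the relay matches a pass-list entry (illustrative only)."""
    if any(re.search(pattern, ip) for pattern in pass_ips):
        return True
    # Assumption: a domain entry matches the host itself or any subdomain,
    # but not a longer name that merely contains the text.
    return any(
        re.search(r"(?:^|\.)(?:%s)$" % pattern, rdns, re.IGNORECASE)
        for pattern in pass_domains
    )
```

With these patterns, mail31.uptilt.com is exempt but notuptilt.com and uptilt.com.example.net are not, and 198.212.10.108 is exempt while 198.212.10.1080 (not a real address, but a useful test string) is not.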


I agree that the third should happen but I am a little confused. Why are
these failing rdns lookups?
I do the lookups and I get this:
Sailfish:~ tbolioli$ host permemail05.alumniconnections.com
permemail05.alumniconnections.com has address 198.212.10.108
Sailfish:~ tbolioli$ host 198.212.10.108
108.10.212.198.in-addr.arpa domain name pointer permemail05.alumniconnections.com.
Sailfish:~ tbolioli$ host mail31.uptilt.com
mail31.uptilt.com has address 208.66.204.41
Sailfish:~ tbolioli$ host 208.66.204.41
41.204.66.208.in-addr.arpa domain name pointer mail31.uptilt.com.
Sailfish:~ tbolioli$ host 208.66.204.40

Is there something I am missing or that I am doing wrong in my lookups?
I want to get these entities to change but I am not sure what to tell
them to do.
Thanks,
Tom






Re: cbl RBL

2007-01-28 Thread Thomas Bolioli

Theo Van Dinter wrote:

On Sat, Jan 27, 2007 at 06:52:29PM -0500, Thomas Bolioli wrote:
  
/etc/procmail and it is fired off with a user .forward file |IFS=' '  
exec /usr/bin/procmail || exit 75 #tpblists. Still looking into Net::DNS.



A few ideas.  First, do DROPPRIVS=yes if you haven't already.  Second, why are
you using a .forward file?  Just set procmail as the MDA.

  
DROPPRIVS is already set to yes. As for the second question: legacy. 
This machine has six years of in-place upgrades behind it. I set it up this 
way because I was not having SA do checks for every account and I was 
experimenting when I first set up spam filtering. Changing that may 
become my Sunday morning task...


I am still at a complete loss to explain why some users (when running SA 
from the cmdline) can do rbl checks and others can't. I have set the 
user_prefs files to be exactly the same, eliminating any config deltas 
from potentially causing this. I have confirmed though that the problem 
is that the DNS queries are definitely timing out and upping the timeout 
to 60 secs does nothing but delay the inevitable. I was mistaken that it 
was the SPF* tests zeroed out that was causing the issue. But it looked 
that way for a while.


Now, the only thing distinguishing the two groups (i.e., those that work and 
those that do not) is that the two accounts that do not successfully check 
RBLs (there may be more, but I will not be digging into my clients' email 
accounts) get by far the most spam compared to the others 
that work.


Anyone with ideas, they would be greatly appreciated but right now I 
need to determine if it is SA that is having issues with the lookups or 
are the accounts screwed up in some way. bind does not seem to be 
throttled either so the volume of queries should not be the issue either.


Re: cbl RBL (RESOLVED)

2007-01-28 Thread Thomas Bolioli

Thomas Bolioli wrote:
Anyone with ideas, they would be greatly appreciated but right now I 
need to determine if it is SA that is having issues with the lookups 
or are the accounts screwed up in some way. bind does not seem to be 
throttled either so the volume of queries should not be the issue either.


After doing a diff between the home dirs of some of these users, I found 
.resolv.conf files in the offending users' directories. I am not sure how 
they got there (they were ~2-3 years old, and their formatting leads me to 
believe they were put there by an application), but they were 
pointing at older DNS servers that went offline a month or two 
ago. I removed them, and now the incoming spam is hitting one or 
more RBLs. Somehow the presence of these files did not interfere with 
requests that were not DNS-specific; e.g., GET would still work with them there.
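
For anyone chasing the same symptom, the check above (looking for per-user .resolv.conf files and seeing which name servers they point at) can be scripted. The sketch below is a Python illustration, not something from the thread; the function name and the /home default are assumptions. As background, Net::DNS is documented to consult a per-user .resolv.conf on Unix, which would be consistent with what was found here, but treat that as a pointer rather than a confirmed diagnosis.

```python
from pathlib import Path


def find_user_resolv_confs(home_root="/home", filename=".resolv.conf"):
    """Map each user directory that holds a resolver override to its nameservers."""
    results = {}
    root = Path(home_root)
    if not root.is_dir():
        return results
    for home in sorted(root.iterdir()):
        candidate = home / filename
        try:
            if not candidate.is_file():
                continue
            text = candidate.read_text(errors="replace")
        except OSError:
            continue  # unreadable home directory or file; skip it
        servers = []
        for line in text.splitlines():
            parts = line.split()
            # Only "nameserver <address>" lines are of interest here.
            if len(parts) >= 2 and parts[0] == "nameserver":
                servers.append(parts[1])
        results[home.name] = servers
    return results
```

Running find_user_resolv_confs() as root and comparing the listed servers with /etc/resolv.conf should show quickly whether any account is still pointed at a retired DNS server.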

Thanks for the help everyone.
Tom


Re: Botnet 0.7 syslog entry: Use of uninitialized value

2007-01-28 Thread Thomas Bolioli

Yves Goergen wrote:

Hi,

I have installed Botnet 0.7 from the previous announcements on this
list. This is a syslog entry I got today (and maybe already before):

Jan 28 09:01:04 mond spamd[12174]: Use of uninitialized value in string
eq at /etc/mail/spamassassin/Botnet.pm line 564, GEN1380 line 93.

Is that a problem?

  
Probably not. That message is a common Perl warning, not a fatal error. The 
plugin author is a regular on this list, so I am sure he will see your 
note and check it out, but I would not lose sleep over it.

Tom


Re: cbl RBL

2007-01-27 Thread Thomas Bolioli

Alexis Manning wrote:

[EMAIL PROTECTED] wrote:
  

I am trying to get lookups against cbl (http://cbl.abuseat.org/) and
it does not seem to be working.



Not a direct answer to your rules question, but isn't the CBL already
included in the XBL check?

-- A.
  
Right you are... Then I have another issue. My RBL checks are not firing 
off...
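
When RBL checks are not firing, it can help to reproduce a single lookup by hand. DNS blocklists are queried by reversing the octets of the IP address and appending the list's zone, and a listed address gets an answer while an unlisted one does not resolve. The sketch below is a Python illustration with hypothetical function names; the zone cbl.abuseat.org is assumed from the CBL URL quoted above, and the resolver is injectable so the example runs offline.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name a blocklist lookup would query for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blocklist zone returns an answer for ip."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # name does not resolve: treated as "not listed"
    return True
```

For example, dnsbl_query_name("198.212.10.108", "cbl.abuseat.org") gives 108.10.212.198.cbl.abuseat.org, which can then be fed to host or dig as each affected user to see whether the query answers or times out.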


Lint issues

2007-01-27 Thread Thomas Bolioli

I am running SA with --lint and it never sees the email I am passing to it.
The command line is:
spamassassin -D --lint  email
and the output is always:

snip...
[29845] dbg: check: is spam? score=2.216 required=6
[29845] dbg: check: 
tests=MISSING_HEADERS,MISSING_SUBJECT,NO_REAL_NAME,NO_RECEIVED,NO_RELAYS,TO_CC_NONE
[29845] dbg: check: 
subtests=__HAS_MSGID,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__NONEMPTY_BODY,__SANE_MSGID,__UNUSABLE_MSGID



Which leads me to believe the email is not getting read by SA. Any 
thoughts? Am I doing this wrong?

Tom


Re: cbl RBL

2007-01-27 Thread Thomas Bolioli

Alexis Manning wrote:

Thomas Bolioli [EMAIL PROTECTED] wrote:
  

Right you are... Then I have another issue. My RBL checks are not firing
off...



If you're not seeing *any* BLs ever firing in your SA-marked up mails then
it'd sound like a DNS issue, e.g. misconfigured firewall or router.

If you're seeing some intermittently then perhaps your DNSBL checks are
timing out and you'd need to increase rbl_timeout in your local.cf

-- A. 
  
DNS is working. I am running queryperf right now to see what impact 
timeouts could be having. The machine is a DNS server and I am sure it 
is working. I also saw lint output that was able to look up intel.com and 
the other network tests are firing. I do not think they are intermittent.


Re: cbl RBL

2007-01-27 Thread Thomas Bolioli

Theo Van Dinter wrote:

On Sat, Jan 27, 2007 at 09:19:40PM -, Alexis Manning wrote:
  

If you're not seeing *any* BLs ever firing in your SA-marked up mails then
it'd sound like a DNS issue, e.g. misconfigured firewall or router.



Or you've disabled rules, or disabled rbl checks, or you're running in
local mode, or ...

  

Definitely not disabled (rules or RBL checks). "Local mode"? What is that?


Re: Hmm - a server I manage is triggering Botnet

2007-01-27 Thread Thomas Bolioli

Josh Trutwin wrote:

On Fri, 26 Jan 2007 16:43:17 -0800
John Rudd [EMAIL PROTECTED] wrote:

  

X-Envelope-From: [EMAIL PROTECTED]
Received: from netbits.us ([209.18.107.89])
  by 0 ([192.168.0.3])
  with SMTP via SSL; 25 Jan 2007 23:47:53 -
  

That would seem to be your problem.  I bet SA thinks that means
the machine has no reverse DNS.  And netbits.us has a completely
different IP address than that.


SA or Botnet?
  
SA.  SA is the one that interprets the headers.  Botnet reads the 
interpreted headers.



This is only scoring a 5.1 though - I posted the SA report in a
previous message, my only bad hit is from Botnet:

Content analysis details:   (5.1 points, 5.0 required)


 0.0 DK_POLICY_SIGNSOME Domain Keys: policy says domain signs some mails
 5.0 BOTNET             Relay might be a spambot or virusbot
                        [botnet0.7,ip=209.18.107.89,hostname=netbits.us,maildomain=davidtrutwin.com,baddns]
 1.5 RCVD_NUMERIC_HELO  Received: contains an IP address used for HELO
-0.2 BAYES_40           BODY: Bayesian spam probability is 20 to 40%
                        [score: 0.3696]
-1.2 AWL                AWL: From: address is in the auto white-list

I'm curious to see if changing the PTR records will help.

Josh
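
The bracketed Botnet tag in the report above packs the rule name, a few key=value fields, and the names of the sub-rules that fired into one comma-separated string. When comparing many reports it can be handy to split that apart. The following is a small Python sketch written for this purpose; the function name is hypothetical and the layout is inferred only from the tags shown in this thread.

```python
def parse_botnet_tag(tag):
    """Split a tag like [botnet0.7,ip=...,baddns] into (rule, attrs, flags)."""
    text = tag.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a tag wrapped in square brackets")
    fields = text[1:-1].split(",")
    rule = fields[0]
    attrs = {}
    flags = []
    for field in fields[1:]:
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value      # e.g. ip, hostname, maildomain
        else:
            flags.append(field)     # e.g. baddns, client
    return rule, attrs, flags
```

Applied to the tag in Josh's report, this yields the rule botnet0.7, the relay IP 209.18.107.89, the hostname netbits.us, the mail domain davidtrutwin.com, and the single flag baddns.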
  
Yeah, this is the problem with the Botnet ruleset. I had to stop using 
it. It assumes one IP, one domain with regard to mail. If your 
mail server handles multiple domains, whichever domain the rDNS points 
to will be fine; any others will fire off. There is an exception list 
built into the plugin, but I am philosophically opposed to manually 
managing lists like that on a per-machine basis. If you want to stop the 
botnet mails heading into your inbox, make sure your RBL lookups are 
working. Those are much better than the Botnet plugin.




Re: cbl RBL

2007-01-27 Thread Thomas Bolioli

Theo Van Dinter wrote:

On Sat, Jan 27, 2007 at 09:19:40PM -, Alexis Manning wrote:
  

If you're not seeing *any* BLs ever firing in your SA-marked up mails then
it'd sound like a DNS issue, e.g. misconfigured firewall or router.



Or you've disabled rules, or disabled rbl checks, or you're running in
local mode, or ...

  

This is really odd...
The RBL checks fired off from the command line (while a queryperf was 
running against the DNS server...) but not when postfix passes the email 
off through procmail as the same user's ID. This is stumping me. Any ideas?


Re: cbl RBL

2007-01-27 Thread Thomas Bolioli

Thomas Bolioli wrote:

Theo Van Dinter wrote:

On Sat, Jan 27, 2007 at 09:19:40PM -, Alexis Manning wrote:
  

If you're not seeing *any* BLs ever firing in your SA-marked up mails then
it'd sound like a DNS issue, e.g. misconfigured firewall or router.



Or you've disabled rules, or disabled rbl checks, or you're running in
local mode, or ...

  

This is really odd...
The RBL checks fired off from the command line (while a queryperf was 
running against the DNS server...) but not when postfix passes the 
email off through procmail as the same user's ID. This is stumping me. 
Any ideas? 
Actually, this is getting even odder. There is one account on the system 
for which the RBL checks do execute when run through postfix 
su'd. That is acct x; it uses nothing special and has a blank 
user_prefs (plain vanilla account). Accounts a-y are a mix of plain 
vanilla ones and customized ones. Yet, account x is the only one that 
RBL lookups are working on. Is there anything in how SA deals with DNS 
lookups that could cause this?

Tom


Re: cbl RBL

2007-01-27 Thread Thomas Bolioli

Theo Van Dinter wrote:

On Sat, Jan 27, 2007 at 04:52:23PM -0500, Thomas Bolioli wrote:
  
The RBL checks fired off from the command line (while a queryperf was 
running against the DNS server...) but not when postfix passes the email 
off through procmail as the same user's ID. This is stumping me. Any ideas?



/etc/procmailrc or .procmailrc?  What does it look like?

  
/etc/procmail and it is fired off with a user .forward file |IFS=' '  
exec /usr/bin/procmail || exit 75 #tpblists. Still looking into Net::DNS.




Re: cbl RBL

2007-01-27 Thread Thomas Bolioli

Theo Van Dinter wrote:

On Sat, Jan 27, 2007 at 05:25:59PM -0500, Thomas Bolioli wrote:
  
vanilla ones and customized ones. Yet, account x is the only one that 
RBL lookups are working on. Is there anything in how SA deals with DNS 
lookups that could cause this?



SA calls Net::DNS, which as far as I know just looks at resolv.conf,
then makes queries.  I'd probably run a mail through spamassassin in debug
mode to see what these other accounts are doing.

  
resolv.conf is fine. When I run them using su as those users, it works 
fine. It appears to be something with how procmail runs them.


Re: cbl RBL

2007-01-27 Thread Thomas Bolioli

Thomas Bolioli wrote:

Theo Van Dinter wrote:

On Sat, Jan 27, 2007 at 05:25:59PM -0500, Thomas Bolioli wrote:
  
vanilla ones and customized ones. Yet, account x is the only one that 
RBL lookups are working on. Is there anything in how SA deals with DNS 
lookups that could cause this?



SA calls Net::DNS, which as far as I know just looks at resolv.conf,
then makes queries.  I'd probably run a mail through spamassassin in debug
mode to see what these other accounts are doing.

  
resolv.conf is fine. When I run them using su as those users, it 
works fine. It appears to be something with how procmail runs them.
Actually, I stand corrected. There are some accounts which reliably do 
the RBL checks and others that do not. The ones that do not had their 
SPF tests zeroed out. I am in new and uncharted territory, but does 
that seem like a bug?




process for getting plugins included into the core dist

2007-01-03 Thread Thomas Bolioli
I was curious what the process was for plugins that get included into 
the core distribution. Also, how are the scores determined? Is it a best 
guess, or is there actually a statistical analysis done with a corpus to 
determine the most efficient scoring for a particular rule set? Also, 
does that scoring change over time as the corpus changes?

Thanks,
Tom


Re: process for getting plugins included into the core dist

2007-01-03 Thread Thomas Bolioli

Thomas Bolioli wrote:
I was curious what the process was for plugins that get included into 
the core distribution. Also, how are the scores determined? Is it a best 
guess, or is there actually a statistical analysis done with a corpus 
to determine the most efficient scoring for a particular rule set? 
Also, does that scoring change over time as the corpus changes?

Thanks,
Tom
Forget it. Doh! I should have checked the FAQs instead of relying on 
Google and my own guesses about which words would be used to describe 
what I was looking for...

Thanks anyway,
Tom


Re: Botnet 0.7 Plugin is available

2006-12-27 Thread Thomas Bolioli

John Rudd wrote:


Botnet 0.7 is up and available.

http://people.ucsc.edu/~jrudd/spamassassin/Botnet-0.7.tar


Botnet is a SpamAssassin plugin which attempts to identify hosts which 
are likely to be spambot/virusbot hosts, using various DNS 
fingerprints of the submitting relay.




New things in 0.7:


1) BOTNET_SOHO -- If the sender's (chosen from Envelope-From, 
Return-Path, or From, in that order) mail domain (the part after the 
@ sign) resolves back to the relay's IP address, or has an MX host 
which resolves back to the IP address, AND the sender's mail domain 
does NOT match the PTR record for the relay, then we'll assume this 
is a small office/home office mail server.  We'll exempt them from 
BOTNET being triggered.  (note: someone suggested that this check 
also try to resolve the HELO string, I make a note in my code as to 
why this is an extremely bad idea, and have a commented out block of 
code there for anyone who wants to go down that path ... but, really, 
don't)



2) Botnet API -- want to include the Botnet.pm module in other Perl 
code?  Maybe call check_botnet from mimedefang-filter so you can 
block before a message gets to SpamAssassin?  I've made an API for 
it.  The routines that SA calls use this API, so it's the 
_exact_same_ code. There's now an included perl program Botnet.pl 
which takes an IP address CLI argument, and an optional mail-domain 
CLI argument.  It will tell you which rules do and don't get 
triggered.  It also serves as an example of using the API.  (you will 
still need to have SpamAssassin installed in order to use Botnet.pm 
in this fashion, even if you're using the API in a program that 
doesn't call SA)


The file Botnet.api.txt also describes the API somewhat.


3) BOTNET_CLIENT and BOTNET are now actual rules instead of meta 
rules.  The individual rules are still there, just with zero'd 
scores.  You can now easily pick between 1 big rule (BOTNET doing 
eval:botnet()), meta rules (detailed in the file 
Botnet.variations.txt), or piece-meal calling of the individual 
checks (also detailed in Botnet.variations.txt).



4) config option: botnet_pass_trusted (all|public|private|ignore)
This defaults to public.  If you have any public IP addresses in 
your relays-trusted list, then Botnet won't trigger.  Private means 
any private IP addresses, where that includes 127.*, 10.*, etc..  
All means either of those two.  Ignore means do what Botnet used to 
do: not even look at the trusted relays, just look past them.  The 
idea is: if you got this from a trusted relay, we can assume it 
wasn't a Botnet.



5) botnet_pass_auth now looks at the trusted relays.  It probably 
should have been doing that all along.  It no longer looks at the 
untrusted relays.



6) Rules that get triggered now use $permsgstatus->test_log to record 
information.  The individual rules just list 
[rulename,ip=$ip,hostname=$host,maildomain=$domain] or an 
appropriate subset of that based on which rule it is.  BOTNET_CLIENT 
and BOTNET also include a list of sub-rule names that were 
triggered.  So, you might see this:



[botnet0.7,ip=1.2.3.4,host=dsl-1-2-3-4.isp.net,maildomain=spammer.com,baddns,ipinhostname,clientwords,client] 



or

[botnet_nordns,ip=2.3.4.5]

or

[botnet_soho,ip=3.4.5.6,hostname=3.4.5.6.isp.net,maildomain=non-spammer-soho.org] 




7) shawcable.net and ocn.ne.jp seem to also be botnet sources, but 
their hostnames don't fit any of my other patterns.  Luckily, they DO 
fit some pattern, and it's simple enough to not need a code-based 
rule, just a conventional regular-expression-based rule.  I've 
created BOTNET_SHAWCABLE and BOTNET_OCNNEJP rules to cover these two.



8) The file Botnet.variations.txt exists now with different suggested 
alternative ways to do Botnet rules.



9) Botnet.credits.txt exists



10) There's now a $VERSION variable within Botnet.pm.  You'll see its 
value in the test_log() output for check_botnet (you can see it in the 
example above), and in the SpamAssassin debug output (spamassassin 
-D) as the module is loaded and instantiated (new is called).
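
To make item 1 above concrete, here is one way the BOTNET_SOHO decision could be written down. It is a Python sketch of the rule as described in the announcement, not a port of Botnet.pm: the function and parameter names are hypothetical, the two resolver callables stand in for real A and MX lookups, and "matches the PTR record" is interpreted as the PTR name being equal to, or a subdomain of, the sender's mail domain, which is an assumption.

```python
def looks_like_soho(relay_ip, relay_ptr, mail_domain, resolve_a, resolve_mx):
    """Apply the SOHO exemption described in item 1 (illustrative only).

    resolve_a(name)    -> list of IPv4 address strings for name
    resolve_mx(domain) -> list of MX hostnames for domain
    """
    domain = mail_domain.lower().rstrip(".")
    ptr = (relay_ptr or "").lower().rstrip(".")

    # If the PTR name already matches the sender's domain, this is not the
    # SOHO case (assumed meaning of "match": equal or a subdomain).
    if ptr == domain or ptr.endswith("." + domain):
        return False

    # The mail domain itself resolves to the relay's IP address ...
    if relay_ip in resolve_a(domain):
        return True
    # ... or one of its MX hosts does.
    return any(relay_ip in resolve_a(mx_host) for mx_host in resolve_mx(domain))
```

Using the example tag from item 6 (ip=3.4.5.6, hostname=3.4.5.6.isp.net, maildomain=non-spammer-soho.org), the function returns True if non-spammer-soho.org, or one of its MX hosts, resolves to 3.4.5.6.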




I think that's everything...
It seems to have an issue with mail sent through forwarders like alumni 
accounts and one mail type systems. I am sending you a note off line 
with the details.

Tom


roaming users sending mail internally and dynamic IPs issue

2006-12-18 Thread Thomas Bolioli
Whenever our users travel outside the internal networks and send email 
to each other, the emails get tagged with the reports below (yes, I 
cranked up the default scores because of the botnet crap out there) 
because they are on dynamic IPs and sending directly to the receiving MTA.


I see a couple of ways that this can be remedied, most of which are 
unacceptable. a) Whitelist all of the users (or the entire domain) for 
every domain on the system [obviously bad since it allows spammers to 
spoof From headers with impunity even with SPF set up]. b) Set up a second 
machine to be a second MTA and have users send email from machine 2, 
which then relays to machine 1 [waste of a machine and the energy to run 
that machine]. Or c) there is some configuration I am missing. Does 
anyone know what I can do to fix this?


Thanks,
Tom

   *  0.7 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address
   *  [xx.xx.xx.xx listed in dnsbl.sorbs.net]
   *  2.5 RCVD_IN_NJABL_DUL RBL: NJABL: dialup sender did non-local SMTP
   *  [xx.xx.xx.xx listed in combined.njabl.org]


Re: Botnet 0.6 plugin for Spam Assassin availabile

2006-12-18 Thread Thomas Bolioli

Chris Lear wrote:

* Oliver Schulze L. wrote (18/12/06 15:42):
  

Nice stats!
How do you generate them in SA 3.1.7 ?



I use this: http://www.rulesemporium.com/programs/sa-stats-1.0.txt

Chris
  

Does this require using spamd instead of invoking spamassassin?
Thanks,
Tom


Re: roaming users sending mail internally and dynamic IPs issue

2006-12-18 Thread Thomas Bolioli

Dan Horne wrote:
I see a couple of ways that this can be remedied, most of 
which are unacceptable. a) Whitelist all of the users (or the 
entire domain) for every domain on the system [obviously bad 
since it allows spammers to spoof From headers with impunity 
even with SPF set up]. b) Set up a second machine to be a second 
MTA and have users send email from machine 2, which then 
relays to machine 1 [waste of a machine and the energy to run 
that machine]. Or c) there is some configuration I am 
missing. Does anyone know what I can do to fix this?





Set up SMTP AUTH and require your users to log in to send email.  If I
understand correctly Spamassassin automatically trusts mails sent via
SMTP AUTH.
  
Thanks for the response. SMTP auth is set up so there must be something 
I need to do to tell SA that it was auth'd.

Any ideas?
Thanks,
Tom


Re: roaming users sending mail internally and dynamic IPs issue

2006-12-18 Thread Thomas Bolioli

Dan Barker wrote:

Another issue you'll run into with road warriors is blocks on port 25. They
may not be able to authenticate with your server. They'll have to use port
587 (submission) on some connections. This is so common, that I even support
587 inside my firewall so the client setup doesn't need to change when my
laptop comes home.

Dan
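
As a rough illustration of what a roaming client does on the submission port, here is a Python sketch using the standard smtplib module: connect on port 587, upgrade to TLS, authenticate, and send. The host name, credentials, and function names are placeholders, and the smtp_factory parameter exists only so the flow can be exercised without a real server.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Create a simple plain-text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit_message(msg, host, username, password, port=587,
                   smtp_factory=smtplib.SMTP):
    """Send msg through an authenticated submission service."""
    with smtp_factory(host, port) as client:
        client.starttls()                 # encrypt before sending credentials
        client.login(username, password)  # SMTP AUTH
        client.send_message(msg)
```

A call such as submit_message(build_message(...), "mail.example.org", "user", "secret") behaves the same inside and outside the firewall, which is the point Dan makes about keeping one client configuration.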
  
Yeah, I have had this setup myself after running into the issue @ a 
hotel last year.

Thanks,
Tom


Re: SPF is hopelessly broken and must die!

2006-12-13 Thread Thomas Bolioli

John Rudd wrote:

Spam Assassin wrote:
Why was this topic not started on the SPF list? Was the original poster of 
this topic looking to get MORE attention on the SpamAssassin list?



Whether you and the other amateur-topic-police* like it or not, the 
subject is related to the more general subject matter of the list 
(fighting spam) even if it doesn't relate to the more focused subject 
matter of the list (spamassassin specifically).  And, even then, I 
would say that since there is an SPF module that comes with the base 
SA packaging, the subject does have a bearing on the more focused 
subject matter.


I hope I am speaking for those of us who are not completely anal about 
mailing list topics when I say: quit it with the attacks on 
only-partially-off-topic message threads.  You're worse than the 
threads themselves.



(* for people who are actual maintainers of the list, and thus are 
actual-topic-police, if any of them want to correct me, contradict 
me, etc., no problem ... but I am more weary of the 
amateur-topic-police than I am of the highly charged/highly biased 
agenda oriented message threads)
You are speaking for me... This became a very relevant topic when the 
SPF tests were packaged with SA by default. As someone who is having a 
major issue with SPF, I think it is very important that those making these 
decisions hear about the issues that most are having with SPF. It was 
not ready for primetime, and as such, it should have been rolled out 
differently.


SPF test issue

2006-12-06 Thread Thomas Bolioli
I am using the latest and greatest production version of SA. It includes 
an SPF test, and I am having issues with what it is comparing against. Below 
are the email and the SPF record. My emails fail when I remove this 
ip4:10.1.3 but pass when I put it in. My issue is: why is SA looking at 
the original sending host (and the self-reported IP to boot, not the 
actual external IP)? Laptop users could have any IP, and for SPF to work, 
you need to focus on the mail servers. They are the only ones that 
matter in this.

Am I wrong here? Is my mail server putting the wrong headers in?
Tom

v=spf1 ip4:70.90.48.20 ip4:70.90.48.21 ip4:10.1.3 a mx ptr 
a:nova.terranovum.com a:crampon.terranovum.com a:smtp.terranovum.com 
mx:mail.terranovum.com ~all




Return-Path: [EMAIL PROTECTED]
X-Spam-Checker-Version: SpamAssassin 3.1.5 (2006-08-29) on 
nova.terranovum.com

X-Spam-Level: **
X-Spam-Status: No, score=2.7 required=4.0 tests=BLANK_LINES_70_80,
   SPF_SOFTFAIL autolearn=disabled version=3.1.5
X-Original-To: [EMAIL PROTECTED]
Delivered-To: [EMAIL PROTECTED]
Received: from permemail08.alumniconnections.com 
(permemail08.alumniconnections.com [198.212.10.55])

   by nova.terranovum.com (Postfix) with ESMTP id EE3A2356559
   for [EMAIL PROTECTED]; Wed,  6 Dec 2006 08:51:47 -0500 (EST)
Received: from permemail08.alumniconnections.com (localhost [127.0.0.1])
   by permemail08.alumniconnections.com (Postfix) with ESMTP id E88FE70B1
   for [EMAIL PROTECTED]; Wed,  6 Dec 2006 08:44:00 -0500 (EST)
Received: from brandy.adelphi.edu (brandy.adelphi.edu [192.147.12.5])
   by permemail08.alumniconnections.com (Postfix) with ESMTP id 924436AB8
   for [EMAIL PROTECTED]; Wed,  6 Dec 2006 08:43:39 
-0500 (EST)
Received: from brandy.adelphi.edu (127.0.0.1) by brandy.adelphi.edu 
(MlfMTA v3.2r1b3) id her3pk0171sh for 
[EMAIL PROTECTED]; Wed, 6 Dec 2006 08:35:03 -0500 
(envelope-from [EMAIL PROTECTED])

Received: from nova.terranovum.com ([70.90.48.21])
   by brandy.adelphi.edu (Adelphi University)
   with ESMTP; Wed, 06 Dec 2006 08:34:59 -0500
Received: from [10.0.1.3] (katahdin.terranovum.com [70.90.48.17])
   by nova.terranovum.com (Postfix) with ESMTP id 758C5356595
   for [EMAIL PROTECTED]; Wed,  6 Dec 2006 08:48:52 
-0500 (EST)

Mime-Version: 1.0 (Apple Message framework v752.3)
Content-Transfer-Encoding: 7bit
Message-Id: [EMAIL PROTECTED]
Content-Type: text/plain; charset=US-ASCII; format=flowed
To: Thomas Bolioli [EMAIL PROTECTED]
From: Thomas Bolioli [EMAIL PROTECTED]
Subject: test email spf
Date: Wed, 6 Dec 2006 08:48:43 -0500
X-Mailer: Apple Mail (2.752.3)
X-Mlf-Threat: nothreat
X-Mlf-Threat-Detailed: nothreat;none;none;none
X-Mlf-UniqueId: i200612061334590051206
X-Virus-Scanned: ClamAV using ClamSMTP

this is a test of the new spf records
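
A quick way to sanity-check the ip4 terms of the record quoted above is to parse them mechanically. The sketch below is only an illustration, not how SpamAssassin's SPF plugin works internally: it looks at the ip4 terms alone (a, mx, ptr and ~all need DNS and are skipped), and the function names are made up for this example. Two things stand out. A strict parser rejects ip4:10.1.3 because it has only three octets, and the host that handed the message to nova.terranovum.com in the headers above (198.212.10.55) is not covered by any ip4 term. Which relay SA actually evaluated depends on its trusted-network settings, which the post does not show.

```python
import ipaddress

# The SPF record from the post, joined onto one line.
RECORD = (
    "v=spf1 ip4:70.90.48.20 ip4:70.90.48.21 ip4:10.1.3 a mx ptr "
    "a:nova.terranovum.com a:crampon.terranovum.com a:smtp.terranovum.com "
    "mx:mail.terranovum.com ~all"
)


def split_ip4_terms(record):
    """Return (networks, malformed) for the ip4: terms of an SPF record."""
    networks, malformed = [], []
    for term in record.split():
        if not term.lower().startswith("ip4:"):
            continue  # a, mx, ptr and ~all need DNS lookups; not handled here
        try:
            networks.append(ipaddress.IPv4Network(term[4:], strict=False))
        except ValueError:
            malformed.append(term)  # e.g. fewer than four octets
    return networks, malformed


def ip4_matches(record, client_ip):
    """True if client_ip falls inside any well-formed ip4: term."""
    networks, _ = split_ip4_terms(record)
    address = ipaddress.IPv4Address(client_ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    print("malformed terms:", split_ip4_terms(RECORD)[1])
    for ip in ("70.90.48.21", "198.212.10.55"):
        print(ip, "matches an ip4 term:", ip4_matches(RECORD, ip))
```

If the forwarder's address is the one being checked, no change to the ip4 list for the home network would be expected to help, since the forwarder is outside the sender's control.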





sa-update

2006-12-06 Thread Thomas Bolioli
When I run sa-update it puts new copies of the tests in 
/var/lib/spamassassin/3.001005/updates_spamassassin_org which I 
understand from the docs is the correct location. However, the default 
tests remain in /usr/share/spamassassin/ and I believe they are still 
being used. How is this supposed to work? Am I supposed to manually move 
them into /usr/share? I do not see any reference to the updated tests in 
the cf files anywhere.

Tom


Re: new mailman spam???

2006-06-01 Thread Thomas Bolioli




I definitely did not see an approval request. And I can now confirm
that there are some people who are trying to opt out of the list, saying
they did not subscribe. I have already written to the postmaster, but I
am not optimistic.
Tom

Benny Pedersen wrote:

  

  I have included the mailing in its entirety below. Is this an old trick
I just have not seen, or is this something new using mailman to send
spam? I assure you I neither signed up nor confirmed a submission for
this mailing list. Is this just a poorly configured mailman install? Tom
  

  
  
It could be a forged mail that went to the mailman. Now that you have seen
the request to be added, just ignore it.

But sending the mail to the postmaster at that domain might help to solve it.

  
  
It's either some weird kind of spam or a very strange list gone rogue.
I'm finding these messages showing up in spamtraps that I've set up using
accounts that have been inactive for 10 or more years.

  
  
nice

  





new mailman spam???

2006-05-31 Thread Thomas Bolioli
I have included the mailing in its entirety below. Is this an old trick 
I just have not seen, or is this something new using mailman to send 
spam? I assure you I neither signed up nor confirmed a submission for 
this mailing list. Is this just a poorly configured mailman install?

Tom


Return-Path: [EMAIL PROTECTED]
X-Original-To: [EMAIL PROTECTED]
Delivered-To: [EMAIL PROTECTED]
Received: from permemail02.alumniconnections.com 
(permemail02.alumniconnections.com [198.212.10.117])

   by nova.terranovum.com (Postfix) with ESMTP id E7AD93567E3
   for [EMAIL PROTECTED]; Wed, 31 May 2006 20:07:55 -0400 (EDT)
Received: from brandy.adelphi.edu (brandy.adelphi.edu [192.147.12.5])
   by permemail02.alumniconnections.com (8.13.4/8.13.4) with ESMTP id 
k5107sOG012089
   for [EMAIL PROTECTED]; Wed, 31 May 2006 20:07:54 
-0400 (EDT)
Received: from brandy.adelphi.edu (127.0.0.1) by brandy.adelphi.edu 
(MlfMTA v3.1r24) id hfoths0171sd for 
[EMAIL PROTECTED]; Wed, 31 May 2006 20:55:46 -0400 
(envelope-from [EMAIL PROTECTED])

Received: from sodiconmx.org.mx ([206.225.91.17])
   by brandy.adelphi.edu (Adelphi University)
   with ESMTP; Wed, 31 May 2006 20:55:46 -0400
Received: (qmail 29676 invoked from network); 31 May 2006 20:07:53 -0500
Received: from localhost (HELO sodiconmx.org.mx) (127.0.0.1)
 by localhost with SMTP; 31 May 2006 20:07:53 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: Welcome to the Newsl2 mailing list
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
X-No-Archive: yes
Message-ID: [EMAIL PROTECTED]
Date: Wed, 31 May 2006 18:08:54 -0500
Precedence: bulk
X-BeenThere: [EMAIL PROTECTED]
X-Mailman-Version: 2.1.5
List-Id: newsl2.nueva-alianza.org.mx
X-List-Administrivia: yes
Sender: [EMAIL PROTECTED]
Errors-To: [EMAIL PROTECTED]
X-Mlf-Threat: nothreat
X-Mlf-Threat-Detailed: nothreat;none;none;none
X-Mlf-UniqueId: 200606010055460009014
X-Virus-Scanned: ClamAV 0.88/1504/Wed May 31 15:59:14 2006 on permemail02
X-Virus-Status: Clean
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on 
nova.terranovum.com

X-Spam-Level:
X-Spam-Status: No, score=0.2 required=4.0 
tests=FORGED_RCVD_HELO,NO_REAL_NAME

   autolearn=disabled version=3.0.4

Welcome to the [EMAIL PROTECTED] mailing list!

To post to this list, send your email to:

 [EMAIL PROTECTED]

General information about the mailing list is at:

 http://lists.nueva-alianza.org.mx/mailman/listinfo/newsl2

If you ever want to unsubscribe or change your options (eg, switch to
or from digest mode, change your password, etc.), visit your
subscription page at:

 
http://lists.nueva-alianza.org.mx/mailman/options/newsl2/thomas_bolioli%40alumni.adelphi.edu



You can also make such adjustments via email by sending a message to:

 [EMAIL PROTECTED]

with the word `help' in the subject or body (don't include the
quotes), and you will get back a message with instructions.

You must know your password to change your options (including changing
the password, itself) or to unsubscribe.  It is:

 ekhuku

Normally, Mailman will remind you of your nueva-alianza.org.mx mailing
list passwords once every month, although you can disable this if you
prefer.  This reminder will also include instructions on how to
unsubscribe or change your account options.  There is also a button on
your options page that will email your current password to you.



Something new to fool SURBL

2005-03-06 Thread Thomas Bolioli
As received (relevant snippet):
a hrefthrivedhref=http://Taiwanese.com href=
http://pickup-card.com;pickup-card.com/a
Now here is the SA report on it:
X-Spam-Checker-Version: SpamAssassin 3.0.0 (2004-09-13) on 
nova.terranovum.com
X-Spam-Level: *
X-Spam-Status: No, score=1.2 required=4.0 tests=BAYES_50,HTML_20_30,
   HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY autolearn=no
   version=3.0.0

I can post the original email in its entirety if anyone needs it.
Tom


smime.p7s
Description: S/MIME Cryptographic Signature


Re: surbl not reporting on any incoming email

2005-02-17 Thread Thomas Bolioli




I had not upgraded from a 2.6x install with SpamCopURI. It was a totally
stock install and it is still 3.0.0. I have since discovered that when
I run spamassassin as any user except root, the network tests do not
work. When I run it as root, all the network tests work just fine. I
have tried to run network-based things as other users before and there
do not appear to be any restrictions on network access for those
users. I checked /etc/resolv.conf and it is world-readable and
is configured properly. Something may be wrong with Net::DNS::Resolver
and it is not seeing the /etc/resolv.conf file when run as other
users. This morning's chore is to create links at ~/.resolv.conf for a
few users, get them owned by those users, and see what happens. 
Will advise. 
Tom

Jeff Chan wrote:

  On Wednesday, February 16, 2005, 2:25:52 PM, Thomas Bolioli wrote:
  
  
Hence my problem.
 From my local.cf which is not overridden anywhere
skip_rbl_checks 0
dns_available yes

  
  
  
  
 From etc/procmailrc
SPAMC="/usr/bin/spamassassin"
:0f
|$SPAMC

  
  
  
  
but the surbl checks only occur when I do spamassassin -t  file_w_msg
and not when procmail does the forwarding.
I am at a loss. This has never worked since I install 10.1 (SA 3.0.0).

  
  
SA 2.63/2.64 used a separate patch called SpamCopURI, and SA
3.x uses the built in program urirhssub for SURBL lookups.

If you had SpamCopURI before, did you get rid of the it and the
rules for it, as you should for 3.X?   (3.X versions have SURBL
rules set up by default).

Did you perhaps upgrade from 3.0.0 to later versions?  If so, did
you remember to change the rule type from "header" to "body" as
mentioned at: 

  http://www.surbl.org/faq.html#body

Jeff C.
  





smime.p7s
Description: S/MIME Cryptographic Signature


Re: surbl not reporting on any incoming email

2005-02-17 Thread Thomas Bolioli




Ok. I created copies of the /etc/resolv.conf file in the user's home
dirs and made sure the copies were owned by those users and no go. It
is still not executing network tests for any user other than root. Can
anybody confirm they are getting network tests performed on a 3.0.0
setup with procmail executing /usr/bin/spamassassin (not a spamc/spamd
setup)? I know I have all the correct settings as other emails in this
thread can show. 
Tom






smime.p7s
Description: S/MIME Cryptographic Signature


Re: surbl not reporting on any incoming email

2005-02-17 Thread Thomas Bolioli




I have new info. I changed the dns_available setting to test and I got
this. 
Failed to run DNS_FROM_AHBL_RHSBL RBL SpamAssassin test, skipping:
 (Can't call method "bgsend" on an undefined value at
/usr/lib/perl5/vendor_perl/5.8.5/Mail/SpamAssassin/Dns.pm line 112.
)
Failed to run NO_DNS_FOR_FROM RBL SpamAssassin test, skipping:
 (Can't call method "bgsend" on an undefined value at
/usr/lib/perl5/vendor_perl/5.8.5/Mail/SpamAssassin/Dns.pm line 141.
)
Failed to run __RFC_IGNORANT_ENVFROM RBL SpamAssassin test, skipping:
 (Can't call method "bgsend" on an undefined value at
/usr/lib/perl5/vendor_perl/5.8.5/Mail/SpamAssassin/Dns.pm line 112.
)
Any ideas?
Tom






smime.p7s
Description: S/MIME Cryptographic Signature


Re: surbl not reporting on any incoming email

2005-02-17 Thread Thomas Bolioli




Thanks for the heads up but the problem is starting to look like perl.
When I run perl as root I have the same @INC path as when I run
non-privileged. However, only as root am I able to find most of the
modules in site_perl. When I run as other than root, I can not get
access to the modules I need. It appears some permission problems have
crept up after doing an update a few days ago. I am in the process of
fixing them right now and hopefully that will hold and not get
automatically "fixed" by some bots Mandrake has to fixperms on an
hourly basis.
Thanks,
Tom


Bob McClure Jr wrote:

  On Thu, Feb 17, 2005 at 11:58:04AM -0500, Thomas Bolioli wrote:
  
  
Ok. I created copies of the /etc/resolv.conf file in the user's home 
dirs and made sure the copies were owned by those users and no go. It is 
still not executing network tests for any user other than root. Can 
anybody confirm they are getting network tests performed on a 3.0.0 
setup with procmail executing /usr/bin/spamassassin (not a spamc/spamd 
setup)? I know I have all the correct settings as other emails in this 
thread can show.
Tom

  
  
Don't know if this relates to your problem.  About two weeks ago I
started having a lot of spam slipping through, even the obvious C-drug
ones.  Following a recent posting (may have been yours), I ran the
spam through "spamassassin -D" and it scored much higher, even enough
to qualify for summary punting, mostly thanks to SURBL scores that
weren't in the original scan by spamc/spamd.  After some thought, I
remembered recently fixing a problem in my /etc/resolv.conf in which a
now-dead IP address had been at the top of the list.  So I restarted
spamd, and now things are once again wonderful.

Apparently, spamd reads /etc/resolv.conf at startup and uses only the
first entry.  If that's busted, forget about all the SURBL stuff.

  
  
Cheers,
  





smime.p7s
Description: S/MIME Cryptographic Signature


Re: spam ham ratio for bayes filter

2005-02-17 Thread Thomas Bolioli
Interesting but what happens in the case where someone, like me, is 
getting 250+ spam a day and only about ten or so legitimate emails? This 
is not counting this account that my mailing lists go to which I have 
far better bayes performance on (1:100 spam/ham ratio instead of 10:1 or 
lower with my other accounts). With autotraining turned on, that means 
far more spam will get trained. Even if I turned off auto training, and 
trained only the ham that came through, it would simply allow changes in 
spam to begin to defeat the bayes filter over time, is that not so? 
Doesn't that mean that the expiration system that SA employs solves that 
problem?
Tom

Thomas Arend wrote:
Hello,
a lot of questions on this list are about the spam : ham ratio to be trained 
and how many mails should be trained. One continuously repeated myth is the 
1 : 1 ratio.

I read an article that gave 1 : 1 as the best ratio; this was found in a 
test and later derived from Bayes' theorem. Unfortunately I didn't 
copy this article and can't remember enough to find it by googling.

The problem is that the conclusion of the article was wrong.
What I will try to show in the next steps, which unfortunately require a 
little bit of algebra, is: train the Bayes filter in accordance with your 
real spam : ham ratio and train as much as possible. But never train too 
little ham or train only spam!

Here my argument follows:
In short, Bayes' theorem says 

P(Spam|Token) = P(Token|Spam)*P(Spam)/P(Token) 

That means: the probability of a message being spam under the condition that 
a token is in the message 
is equal to 
the probability of the token being contained in a spam message 
multiplied by 
the probability of a message being spam 
divided by the probability of any message containing the token.

So if you have received s spam messages and h ham messages where the token is 
in S spams and in H ham messages then you get: 

s = number of spam messages
h = number of ham messages
S = number of spam messages containing the token
H = number of ham messages containing the token
s+h= number of messages
S+H = number of messages containing the token
Therefore 

	P(Spam) = s/(s+h) 

is an approximation of the probability of a random message being spam.
And with

	P(Token) = (S+H)/(s+h)
	P(Token|Spam) = S/s

that leads to

	P(Spam|Token) = (S/s) * (s/(s+h)) / ((S+H)/(s+h))
	              = S / (S+H)

That means that the probability of a given message being spam when it 
contains a token is independent of the total number of messages trained 
(s and h cancel out). 

Let's say your real spam : ham ratio is 10 to 1 and your message corpus 
contains 1100 messages. Suppose 100 spam and 50 ham messages contain a 
certain token, let's say [EMAIL PROTECTED]@. 

Total Messages: 1100
Spam (trained): 1000
Ham: 100
[EMAIL PROTECTED]@: in 100 spam and 50 ham 

If you train all messages you will get a probability of 100 / (100+50) = 66.7% 
that the next message containing the token is spam. That isn't a high 
probability, but it works fine for this example.

If you train only 10% of your spam to get a spam : ham ratio of 1:1, you will 
presumably count only 10 spam messages with the token. 

Spam (trained): 100
Ham: 100
[EMAIL PROTECTED]@: in 10 (=10% of 100) spam and 50 ham 

Which leads to a spam probability of only 10 / (10+50) = 16.7%,
which is a little bit low.
What happens when you train less ham? 

Let's assume you train only 50% of your ham but all your spam. You will 
presumably count only 25 ham messages with the token. 

Spam (trained): 1000
Ham (50% trained): 50
[EMAIL PROTECTED]@: in 100 spam and 25 (= 50% of 50) ham 

Which leads to a spam probability of 100 / (100+25) = 80%.
What happens when your ham : spam ratio is 10 to 1?
Ham = 1000
Spam = 100
[EMAIL PROTECTED]@: in 100 ham and 50 spam 

= 50 / (50+100) = 33.3%
Ham (10% trained) = 100
Spam = 100
[EMAIL PROTECTED]@: in 10 (=10% of 100) ham and 50 spam 

= 50 / (50+10) = 83.3%
OOps!!!
So if you train too little spam you will get a higher False Negative rate; if 
you train too little ham you will get a higher False Positive rate.
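
The five worked examples above can be re-run in a few lines. This is only a sketch of the simplified estimate S / (S+H) derived in this message, not of how SpamAssassin's Bayes code combines tokens; the function name and scenario labels are made up for the illustration.

```python
def spam_probability(spam_with_token, ham_with_token):
    """Estimate P(Spam|Token) as S / (S + H), as derived above."""
    return spam_with_token / (spam_with_token + ham_with_token)


# (label, S = trained spam containing the token, H = trained ham containing it)
scenarios = [
    ("spam:ham 10:1, everything trained", 100, 50),
    ("spam:ham 10:1, only 10% of spam trained", 10, 50),
    ("spam:ham 10:1, only 50% of ham trained", 100, 25),
    ("ham:spam 10:1, everything trained", 50, 100),
    ("ham:spam 10:1, only 10% of ham trained", 50, 10),
]

for label, spam_hits, ham_hits in scenarios:
    print(f"{label}: {spam_probability(spam_hits, ham_hits):.1%}")
```

Undertraining one class shifts the estimate toward the other class, which is the effect the examples are meant to show.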

Because a False Positive is more harmful than a False Negative, my conclusion 
is:
	train in accordance with your real spam : ham ratio, train as much as 
	possible (= train all messages), but never train too little ham or train 
	only spam!

(BTW: The risk of a False Positives is the reason why Paul Graham multiplied 
his token counts for ham with 2)

Another lesson should be: Never train whitelisted mails as ham!!!
Best regards 

Thomas Arend
PS: I hope I made no mistakes.
 



smime.p7s
Description: S/MIME Cryptographic Signature


Re: surbl not reporting on any incoming email

2005-02-17 Thread Thomas Bolioli




Yup, this fixed it. There is something wrong with Mandrake dists where
/usr/lib/perl5/site_perl gets chmod'ed to 700, and that is not the proper
behavior. This is a v10 - 10.1 upgraded machine with 5.8.3 upgraded
to 5.8.5 in recent days as part of urpmi.update. It looks like the
vendor package maintainer decided to chmod all of the 5.8.3 site_perl
locations to 700 to avoid clashes with 5.8.5 in lieu of deleting the
directories outright. Good in theory, but he/she just went one dir too
high when they did it. That's what I get for patching...
Tom
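
For anyone chasing the same symptom, here is a minimal sketch of a check that could be run against each @INC entry. It walks from a directory up to the filesystem root and reports every directory that lacks search (execute) permission for "other" users, which is what mode 700 does. It looks only at the mode bits, so it ignores ownership, group membership and ACLs, and the function name is made up for this example.

```python
import os
import stat
import sys


def dirs_blocking_others(path):
    """Return directories from `path` up to the root that lack the
    search (execute) bit for 'other' users, nearest directory first."""
    blocked = []
    current = os.path.abspath(path)
    while True:
        try:
            mode = os.stat(current).st_mode
            if stat.S_ISDIR(mode) and not mode & stat.S_IXOTH:
                blocked.append(current)
        except OSError:
            pass  # missing or unreadable entries are skipped, not reported
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return blocked


if __name__ == "__main__":
    # Pass one or more directories, for example the entries of perl's @INC.
    for directory in sys.argv[1:]:
        for blocker in dirs_blocking_others(directory):
            print(f"{directory}: blocked by {blocker}")
```

Checking the parents as well as the directory itself matters here, because a single closed directory higher up hides everything beneath it.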






smime.p7s
Description: S/MIME Cryptographic Signature


Re: Continued problems with RBL

2005-02-17 Thread Thomas Bolioli




That version of Net::DNS is too old. Upgrade that and see if it fixes
it. 
Tom

Austin Weidner wrote:

  
According to the docs:

On UNIX systems the defaults are read from the following files, in the
order indicated:

 /etc/resolv.conf
 $HOME/.resolv.conf
 ./.resolv.conf

What OS is this server running?

Try running this:

perl -MNet::DNS -e '$r=Net::DNS::Resolver->new;print join " ",$r->nameservers, "\n";'

You should get the same ip addresses that are in your /etc/resolv.conf
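
To compare that output with what the file itself says, the nameserver lines can be listed in file order. This is a small sketch that only parses the text; it does not reproduce how Net::DNS merges the three files listed above, and the function name and sample addresses are made up for the illustration.

```python
def nameservers(resolv_conf_text):
    """Return the nameserver addresses in the order they appear."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue  # blank lines and comment lines
        fields = stripped.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


if __name__ == "__main__":
    # Hypothetical file content; on a real system read /etc/resolv.conf.
    sample = """\
# primary is flaky
nameserver 192.0.2.1
nameserver 192.0.2.2
search example.com
"""
    print(nameservers(sample))
```

If the resolver reports servers that are not in this list, it is picking them up from somewhere other than the file that was parsed.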

  
  
I think we might be on to something here.

Running: RHEL3

I have two servers in my resolv.conf file. The first one is acting up, the
2nd one is working fine. When I do a dig or nslookup, it is using the 2nd
one all the time. One poster suggested that Net::DNS isn't very good at
going to the #2 server if the #1 one screws up.

So I took out the #1 server completely (so there is only one server in
resolv.conf right now, nslookup and dig still working normal). Then tried
Net::DNS stuff and got the same results. However, I ran your code and much
to my surprise, it listed BOTH servers!

So I thought maybe Net::DNS looks at it when it is first installed. Tried
to reinstall Net::DNS from source, still nothing. Wonder if I need to remove
the RPM version before installing from source? I have:

[EMAIL PROTECTED] tmp]# rpm -qa | grep DNS
perl-Net-DNS-0.31-3.1

Installed. At any rate, I think (or maybe I should say I HOPE) the problem
is Net::DNS is looking at a server that isn't reliable, and not going to the
backup. Now I am not sure how to get Net::DNS to recognize that there is
only 1 server to use (or at least make it so the GOOD server is used first).

Thanks again, hope we are on to something here!

- Austin




  





smime.p7s
Description: S/MIME Cryptographic Signature


Mysterious AWL entries

2005-02-17 Thread Thomas Bolioli
I have this one test that shows up in email headers from time to time 
(all on subscribe.ru addresses) but I have no idea where it is coming 
from. It is saying the address is in the AWL (From: address is in the 
auto white-list) but I am 99% positive it is not. I am using SQL and 
have navicat open at this moment. select * from `awl`  limit 0,1000 is 
returning 0 records. Any idea where this is happening?
Tom

Return-Path: [EMAIL PROTECTED]
X-Original-To: [EMAIL PROTECTED]
Delivered-To: [EMAIL PROTECTED]
Received: from permemail06.alumniconnections.com 
(permemail06.alumniconnections.com [198.212.10.109])
   by smtp.terranovum.com (Postfix) with ESMTP id D825D3E68D7
   for [EMAIL PROTECTED]; Thu, 17 Feb 2005 15:54:17 -0500 (EST)
Received: (from [EMAIL PROTECTED])
   by permemail06.alumniconnections.com (8.12.11/8.12.11) id j1HKsHMm024233
   for [EMAIL PROTECTED]; Thu, 17 Feb 2005 15:54:17 -0500 (EST)
Received: from 200-161-114-78.dsl.telesp.net.br(200.161.114.78) by 
permemail06 via smap (V2.1)
   id xma_23746_1108673612; Thu, 17 Feb 05 15:53:32 -0500
Received: from [81.9.34.176] (port=25 helo=cat176.subscribe.ru)
by mx20.mail.ru with esmtp
id 1D0Uve-000AEU-00
for [EMAIL PROTECTED]; Mon, 14 Feb 2005 04:21:34 +0300
Received: id 6D55D22025A; Mon, 14 Feb 2005 04:20:49 +0300
X-Felis-Queue-Id: 20050214042037
Precedence: normal
List-Id: tech.auto.greatimes.subscribe.ru
List-Help: http://subscribe.ru/catalog/tech.auto.greatimes
List-Subscribe: mailto:[EMAIL PROTECTED]
List-Unsubscribe: mailto:[EMAIL PROTECTED]
List-Archive:  http://subscribe.ru/archive/tech.auto.greatimes
List-Owner: mailto:[EMAIL PROTECTED]
List-Post: NO
Message-Id: [EMAIL PROTECTED]
Date: Mon, 14 Feb 2005 04:20:13 +0300
From: Juliya Cvetkova [EMAIL PROTECTED]
To: tech.auto.greatimes [EMAIL PROTECTED](5837856)
Subject: =?koi8-r?B?UmU6IPDPzMnH0sHGydE=?=
MIME-Version: 1.0
Content-Language: ru
Content-Type: multipart/related;
boundary==_NextPart_000_0629_01C51544.F866C010;
type=multipart/alternative
Content-Transfer-Encoding: 8bit
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.0.0 (2004-09-13) on 
nova.terranovum.com
X-Spam-Level: **
X-Spam-Status: Yes, score=6.8 required=4.0 tests=AWL,BAYES_99,HTML_90_100,
   HTML_IMAGE_ONLY_04,HTML_MESSAGE,MIME_HTML_MOSTLY,MPART_ALT_DIFF,
   RCVD_IN_SORBS_DUL autolearn=no version=3.0.0
X-Spam-Report:
   *  1.0 MIME_HTML_MOSTLY BODY: Multipart message mostly text/html MIME
   *  0.0 HTML_MESSAGE BODY: HTML included in message
   *  0.1 MPART_ALT_DIFF BODY: HTML and text parts are different
   *  1.9 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
   *  [score: 1.]
   *  0.0 HTML_90_100 BODY: Message is 90% to 100% HTML
   *  3.3 HTML_IMAGE_ONLY_04 BODY: HTML: images with 0-400 bytes of words
   *  2.0 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP 
address
   *  [200.161.114.78 listed in dnsbl.sorbs.net]
   * -1.5 AWL AWL: From: address is in the auto white-list





Re: Odd issue with a few mailing lists..

2005-02-16 Thread Thomas Bolioli
They're S/MIME digital signatures. Eudora has a habit of automatically 
extracting (and severing) attachments and plopping them on the drive 
somewhere of the user's choosing, providing a link to them in the email. 
Most of us use clients that behave radically differently. Eudora should 
probably not extract attachments at all, or be more selective in what it 
extracts. Note: I am not trashing Eudora; I used it for 6+ years on both 
Windoze and Mac. Just that one thing always got to me. I wish they would 
make an option to turn off that behavior, since the attachments can 
easily get lost and unlinked from the original email. When I moved 
to Apple Mail, for example, I ended up with thousands of emails missing 
their attachments...
Tom

Evan Platt wrote:
I only seem to have this problem on this list and the mrtg lists... 
however a number of messages come with attachments.  Looking at them, 
they appear to generally be PGP keys.  Not a major issue, but now I 
have dozens of them (well, more).

Not to pick on people, but just in the last few days, I see it from 
Theo Van Dinter, Michael Parker, Thomas Bolioli, and that seems to be 
it for the past week or so.

I'm using Eudora Windows 6.2.1.2. Any way to turn this off? The entire 
contents of the attached file appear as below: (This from Theo)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFCE8iE0n2PaNPSwnMRAuOTAKCXsRD0TxFAFaj3+I4dx45u8RY92gCgtOI5
qiTu8706EiipoyH+Rx5SXOU=
=Xdgp
-----END PGP SIGNATURE-----
As a test, I e-mailed myself the contents of the attached file, and it 
did not convert it to an attachment.

Any ideas?
Thanks.
Evan





Disappearing body of email

2004-10-21 Thread Thomas Bolioli
I received an email this morning that I get every morning, but today it 
was missing the entire body past the headers. Notice, though, that SA's 
report on it leads one to believe it had analyzed the email at some 
point. I run SA through procmail, so I have two places to look, and what 
makes this worse is that this is the only message to have shown the 
problem. Has anyone seen this before? Any info that can focus my 
attention in the right place faster would be greatly appreciated.
Tom


Received: (qmail 26396 invoked by uid 514); 21 Oct 2004 08:05:12 -
Date: 21 Oct 2004 08:05:11 -
Message-ID: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
From: [EMAIL PROTECTED]
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Calvin and Hobbes 10/21/93
MIME-Version: 1.0
Errors-To: [EMAIL PROTECTED]
X-Uclick-UID: 023feb92-69fe-e729-e6ef641e5d6ba424
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on 
nova.terranovum.com
X-Spam-Level:
X-Spam-Status: No, hits=-3.2 required=4.0 tests=BAYES_00,
HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,NO_REAL_NAME autolearn=no
version=2.63


Re: sa-learn --ham not running from horde/imp.

2004-10-13 Thread Thomas Bolioli
What is likely happening is that sa-learn is running as root, but with 
nobody's permissions, since apache su's itself to nobody by default on RH 
9/FC1 (I am assuming this version of Linux from the LC_ALL/LANG issue, 
although Mac OS X is a possibility). When you click the link in Horde, it 
executes that code ('/usr/local/bin/sa-learn --spam') as the user 
that the webserver is running under (nobody), not the user you are 
logged into Horde with (sahil?). Therefore, you are not actually learning 
against the user's (sahil) bayes db. When apache su's to nobody, it 
loses rights to root's resources, but su is notorious for not fully 
assuming the su'd (nobody) persona. Part of that is intentional, part of 
it is not. That is why sa-learn picks up root's id as the one to 
run under but lacks the required perms to accomplish 
anything. Oh yeah, I forgot to mention that sa-learn figures out the 
.spamassassin directory from the current user's home dir as 
reported by the env vars. Those are affected by su'ing to a new user. 
One example is that the 2.6X versions do not run properly under sudo but 
do under su. The reason, although I have never looked into it directly, 
is likely the difference between su and sudo in how 
they set env variables.
Possible fixes:
1) Don't use Horde's reporting util (recommended).
2) Run the webserver as root and use the command ('su -c 
/usr/local/bin/sa-learn --spam $user' or something like it. Do man su or 
su --help for more info) where $user is Horde's global variable for the 
currently logged on user. (*NOT* recommended. Major security issues there.)
3) Attempt to run the command from 2 under the su'd nobody user. It may 
work, since sometimes su is broken depending on your build of perl, etc. 
(although it has been 6 months since I used RH, I do not believe their 
stock perl build was broken). It may revert to the parent process's 
rights instead of seeing nobody, and su without needing a password. This 
is highly unlikely to work but it is worth a shot. It takes no 
time to try and does no harm. If it does work, though, you have 
bigger problems that you can do little to fix. Although I stress, I do 
not think it will work.
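
To see which identity and home directory sa-learn actually inherits when the webserver calls it, a small wrapper can log that context first. This is only a sketch: the function name describe_context is made up for illustration, and it assumes sa-learn looks for its state directory under HOME, as described above.

```bash
#!/bin/bash
# Hypothetical diagnostic helper (not part of Horde or SpamAssassin).
# Prints the identity and environment that a child process such as
# sa-learn would inherit from the caller.
describe_context() {
    printf 'effective user: %s\n' "$(id -un)"
    printf 'USER variable:  %s\n' "${USER:-unset}"
    printf 'HOME variable:  %s\n' "${HOME:-unset}"
    # Assumption: the state dir is looked up under HOME.
    printf 'state dir:      %s\n' "${HOME:-unset}/.spamassassin"
}

describe_context
```

If the state dir printed from the webserver's context still points under /root while the effective user is nobody, that would match the "Permission denied" lines in the log.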
I hope that helps,
Tom

Sahil Tandon wrote:
I understand my problem might be rooted in Horde, amavisd-new, or 
Postfix.  However, I want to be sure it's not a fundamental 
misunderstanding (on my part) of how SA should be setup.

Postfix filters mail via amavisd-new (which calls SA).  Everything 
runs smoothly except the Report as Spam link for users viewing 
messages via webmail.  When clicked, it successfully sends a message 
to postmaster, and unsuccessfully calls sa-learn.  This is what I see 
in my logs (sorry for line wraps):

lock: 94539 cannot create tmp lockfile 
/root/.spamassassin/bayes.lock.sphinx.hamla.org.94539 for 
/root/.spamassassin/bayes.lock: Permission denied
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LC_ALL = (unset),
LANG = en_US
are supported and installed on your system.
perl: warning: Falling back to the standard locale (C).
bayes expire_old_tokens: lock: 95562 cannot create tmp lockfile 
/root/.spamassassin/bayes.lock.sphinx.hamla.org.95562 for 
/root/.spamassassin/bayes.lock: Permission denied

lock: 95562 cannot create tmp lockfile 
/root/.spamassassin/bayes.lock.sphinx.hamla.org.95562 for 
/root/.spamassassin/bayes.lock: Permission denied

The relevant line in my IMP conf.php:
$conf['spam']['program'] = '/usr/local/bin/sa-learn --spam';
I googled for the error but cannot find a proper solution.  Right now, 
/root/.spamassassin is a symlink to /var/amavis/.spamassassin; the 
files therein (i.e. the bayes_* files) are chown'd vscan:vscan.  They 
are updated when SA *itself* notices spam above a certain threshold, 
rejects those messages, and auto-learns their spammy existence.

How to get 'sa-learn --spam' from webmail to co-exist peacefully with 
my current setup?

--
Sahil Tandon



Re: Public SA Corpus

2004-10-12 Thread Thomas Bolioli
Gerry Doris wrote:
I managed to destroy my bayes database...don't ask.
Since I only run a home system and don't receive a heavy flow of spam, I'd
really like to skip the wait for bayes to get up to speed.  Is it
recommended to use the public corpus on the SA website or is it too old
for proper training?  Is there a better source of ham/spam to be used for
training?
Gerry
 

The public spam db should be broad enough for you in the interim,
although I just checked and it is a little long in the tooth (circa
2/2003). Since spam is in large part generic these days, a public/generic
corpus could get you up and going quickly. As time goes by, the older spam
will be retired and replaced with what is coming in. Don't bother with public
ham though. Feeding it ham should be up to you. If you get that little
spam, then you should have no problem training it on that side.
On a side note, I have a 55K message spam database from email addresses
used in the music industry, environmental and educational markets (not
to mention /. ;-}), so it should have a broad reach. It has been culled of
all viruses and mailing list mail. It could make a decent analysis corpus
for those who want it. Also Gerry, if you want, I can forward along or
post the most recent spam, about 2-5K worth, for you to train on. That
should be all you need.
Tom



Re: SA-Learn script

2004-10-02 Thread Thomas Bolioli




It is not fully tested yet but here it is. Note that I changed the USER
env variable to USERNAME. I do not know if this is common on all
flavors of Linux, but under su USER does not change to the child id;
it keeps the parent's. The var USERNAME does change to
reflect the child username. Also, this script is still somewhat localized,
since it assumes all Junk folders are prefixed with Junk, and I
did not adjust the Courier IMAP code with my changes since I had no
system to test against. It should provide some interesting ideas
nonetheless. 
New features include cross-version compatibility, higher speed (using
bayes journals), debugging and error controls, wider bayes training and,
most importantly, support for UWash-based IMAP and mbox format
mailboxes. 
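
The USER/USERNAME difference can be sidestepped by asking the system rather than the environment. A minimal sketch, assuming a POSIX id command is available; the function names current_user and maildir_for are only illustrative and are not taken from the attached script.

```bash
#!/bin/bash
# Illustrative helper: resolve the user this process is really running as.
# Environment variables such as USER may be inherited unchanged from the
# parent across su, whereas id reports the effective uid of this process.
current_user() {
    id -un
}

# Build the per-user mail path from the resolved name rather than $USER.
# The /home/<user>/Maildir layout follows the quoted script below.
maildir_for() {
    printf '/home/%s/Maildir\n' "$1"
}

maildir_for "$(current_user)"
```

Because id ignores the environment, the result stays the same whether USER, USERNAME, or neither was updated by su.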
Tom

Rubin Bennett wrote:

  Hello all...
I figure I've asked enough questions of this list that it's about time I
gave something back... You may not want it, but here it is anyway :)

I've written a bash script that will run sa-learn against the
administrator-specified False-Positive and False-Negative folders.

Run this script from cron, and have your users drag n' drop emails that
get misclassified by SA to the appropriate folders.  The script will act
in 2 ways:

1.) Run it as root, and it will parse the administrator specified
USERLIST and run the internally defined autoLearn() function as each
user.
2.) Run it as an ordinary user and it will only learn from that user's
email.

I wrote it this way so that I could have a wrapper around sa-learn that
would make sure that the directories exist, create them if they don't
using maildirmake++, and not try to learn from directories with no
messages in them.

This is written to work with Courier IMAP and Maildir; I have not tried
it with anything else.

Someday I may get around to rewriting it in php and using php-imap to do
the moving around etc, but as a dirty hack this works ok.  It also
doesn't need passwords etc. in config files...

I hope this benefits someone out there... if there's enough interest,
I'll put it on my website and do a proper CVS for it.

If anyone has ideas for making it better (or suck less), let me know. 
Patches are always welcome...
  
  

#!/bin/bash

# Copyright (c) 2004 by Rubin Bennett [EMAIL PROTECTED]
# All Rights reserved.

#This program is free software; you can redistribute it and/or
#modify it under the terms of the GNU General Public License
#as published by the Free Software Foundation; either version 2
#of the License, or (at your option) any later version.
#
#This program is distributed in the hope that it will be useful,
#but WITHOUT ANY WARRANTY; without even the implied warranty of
#MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#GNU General Public License for more details.
#
#You should have received a copy of the GNU General Public License
#along with this program; if not, write to the Free Software
#Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.


# Usage: IMAP users can move misclassified emails into the "False Negative"
# or "False Positive" folders, and this script will learn from them and put
# them where they belong.
# Spam will be moved to the designated Spam folder, and Ham will be moved to
# the user's Inbox.

# This script should be called by CRON or a similar scheduler.


# Requires:
#	 Maildir style email storage (i.e. Courier IMAP) and IMAP server

# Settings - tweak as necessary.
MAILDIR="/home/$USER/Maildir"
FALSE_NEG_FOLDER="Undetected Spam"
FALSE_POS_FOLDER="Not Spam"
SPAMFOLDER="Spam"

# List of users to run the autoLearn function as (space separated)...
USERLIST=""



autoLearn() {
	# Checks to see if the specified FALSE_NEG_FOLDER and FALSE_POS_FOLDER exist,
	# and creates them if necessary.
	[ -d "${MAILDIR}/.${FALSE_NEG_FOLDER}" ] || /usr/bin/maildirmake++ -f "${FALSE_NEG_FOLDER}" "${MAILDIR}"
	[ -d "${MAILDIR}/.${FALSE_POS_FOLDER}" ] || /usr/bin/maildirmake++ -f "${FALSE_POS_FOLDER}" "${MAILDIR}"
	# Learns from the designated Ham folder and then moves its contents to the Inbox.
	# Only regular files are counted, so the cur directory itself is not included.
	hamCount=`find "${MAILDIR}/.${FALSE_POS_FOLDER}/cur" -type f | wc -l`
	if [ $hamCount -gt 0 ]
	then
	  echo "Learning from $hamCount HAM's"
	  # Pass the directory itself: a "*" inside double quotes is not expanded by the shell.
	  sa-learn --ham "${MAILDIR}/.${FALSE_POS_FOLDER}/cur"
	  mv "${MAILDIR}/.${FALSE_POS_FOLDER}/cur/"* "${MAILDIR}/cur/"
	fi
	
	# Learns from the "Undetected Spam" folder and then moves its contents to Spam.
	spamCount=`find "${MAILDIR}/.${FALSE_NEG_FOLDER}/cur" -type f | wc -l`
	if [ $spamCount -gt 0 ]
	then
	  echo "Learning from $spamCount SPAM's"
	  sa-learn --spam "${MAILDIR}/.${FALSE_NEG_FOLDER}/cur"
	  mv "${MAILDIR}/.${FALSE_NEG_FOLDER}/cur/"* "${MAILDIR}/.${SPAMFOLDER}/cur/"
	fi
}

### End of function declaration ###
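# Check the effective uid rather than $USER, which su may leave unchanged.
if [ "$(id -u)" -eq 0 ]
then
  # A separate loop variable keeps USER itself from being overwritten.
  for learnUser in $USERLIST;
  do
	echo "learning for $learnUser"
	# Assumes this script is installed in each user's PATH as sa-autolearn.
	su - "$learnUser" -c sa-autolearn
  done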
else
  autoLearn
fi
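
A point of shell behaviour that matters for scripts like this one: a glob written inside double quotes is handed to the command as a literal string, while a glob written just outside the quotes is expanded by the shell. The sketch below demonstrates this in a throwaway directory; the file names are invented, and whether a given sa-learn version expands a literal pattern by itself is not something this example tests.

```bash
#!/bin/bash
# Demonstration only: uses a temporary directory, not a real Maildir.
demo_dir="$(mktemp -d)"
touch "${demo_dir}/msg one" "${demo_dir}/msg two"

# Glob inside the quotes: the shell passes one literal argument ending in "*".
set -- "${demo_dir}/*"
quoted_count=$#

# Glob outside the quotes: the shell expands it to one argument per file,
# and names containing spaces stay intact.
set -- "${demo_dir}/"*
unquoted_count=$#

echo "quoted: ${quoted_count} argument(s), unquoted: ${unquoted_count} argument(s)"
rm -rf "${demo_dir}"
```

With folder names such as "Undetected Spam", keeping the path quoted and the asterisk outside the quotes avoids both word splitting and the unexpanded pattern.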

  





Re: SA-Learn script

2004-10-01 Thread Thomas Bolioli




This is exactly the kind of starting point I needed to get in
gear and write something similar for my system. I am
using the standard UWash-based IMAP, among a few other differences, but the
important difference/addition is that I want to automatically train on my
users' email across all of their boxes, including the inbox, and to train on
junk that gets picked up but not auto-learned. This way things that
pass the spam test but do not get auto-trained will get picked up and
trained, and vice versa. Even if some things are falsely trained on
because the script ran before users manually classified their FP/FN
mail, sa-learn is smart enough to relearn them once they use the FP/FN
boxes, so this should work. When I get the script done I will post it
back for you to merge in with yours. 
Thanks,
Tom



Problem with Bayes and AutoLearning

2004-09-24 Thread Thomas Bolioli
I am having a problem with 2.63 not using bayes. (NB: the setup uses 
per-user data and is triggered through .forward, procmail and postfix, with 
no individual .sa and .procmail files.) I have trained each of three 
accounts with over 1000 ham and some 48K spam messages. SA is working 
and tagging spam based on all tests other than bayes. When I make changes to 
the global SA conf, those changes are acted upon, so I know that spamd 
is seeing my global conf (below). Also below is a sample header w/ 
report. Needless to say, the auto learn feature is not working either; 
that is how I knew something was going on. The machine is a standard 
Mandrake 10 setup with regards to SA.
Thanks in advance,
Tom

My Conf:
auto_whitelist_path /var/spool/spamassassin/auto-whitelist
auto_whitelist_file_mode   0666
use_bayes 1
bayes_path ~/.spammer
bayes_file_mode 0700
bayes_use_hapaxes 1
bayes_expiry_max_db_size 150
#bayes_learn_to_journal 1
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam 1
bayes_auto_learn_threshold_spam 6
rewrite_subject 0
report_safe 0
skip_rbl_checks 1
# How many hits before a message is considered spam.
required_hits   3.0
## Optional Score Increases
#score BAYES_99 4.300
#score BAYES_90 3.500
#score BAYES_80 3.000
Sample Header:
Return-Path: [EMAIL PROTECTED]
X-Original-To: [EMAIL PROTECTED]
Delivered-To: [EMAIL PROTECTED]
Received: from g66dc.g.pppool.de (g66dc.g.pppool.de [80.185.102.220])
   by smtp.terranovum.com (Postfix) with SMTP id 708503E6F9B
   for [EMAIL PROTECTED]; Fri, 24 Sep 2004 13:54:40 -0400 (EDT)
Original-Encoded-Information-Types: multipart/alternative
Language: English
Disclose-Recipients: No
Reply-To: Lillian Fitzpatrick [EMAIL PROTECTED]
From: Lillian Fitzpatrick [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: no more red light tickets!
Date: Fri, 24 Sep 2004 14:40:57 -0500
MIME-Version: 1.0
Content-Type: multipart/alternative;
   boundary=--58012207185158267337
Message-Id: [EMAIL PROTECTED]
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on 
nova.terranovum.com
X-Spam-Level: ***
X-Spam-Status: Yes, hits=7.3 required=3.0 
tests=CLICK_BELOW,FORGED_YAHOO_RCVD,
   HTML_50_60,HTML_FONTCOLOR_RED,HTML_FONT_INVISIBLE,HTML_IMAGE_ONLY_04,
   HTML_LINK_CLICK_HERE,HTML_MESSAGE,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,
   MSGID_FROM_MTA_SHORT autolearn=no version=2.63
X-Spam-Report:
   *  0.1 HTML_LINK_CLICK_HERE BODY: HTML link text says click here
   *  0.0 HTML_MESSAGE BODY: HTML included in message
   *  0.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
   *  0.4 HTML_FONT_INVISIBLE BODY: HTML font color is same as background
   *  0.2 HTML_50_60 BODY: Message is 50% to 60% HTML
   *  0.1 HTML_FONTCOLOR_RED BODY: HTML font color is red
   *  1.5 HTML_IMAGE_ONLY_04 BODY: HTML: images with 200-400 bytes of words
   *  3.3 MSGID_FROM_MTA_SHORT Message-Id was added by a relay
   *  0.5 FORGED_YAHOO_RCVD 'From' yahoo.com does not match 'Received' 
headers
   *  0.0 CLICK_BELOW Asks you to click below
   *  1.1 MIME_HTML_ONLY_MULTI Multipart message only has text/html 
MIME parts




Re: Problem with Bayes and AutoLearning

2004-09-24 Thread Thomas Bolioli
I do not believe that is an issue. It only puts the bayes databases at 
~/.spammer_toks and ~/.spammer_seen. sa-learn has not had a problem 
loading the databases; they have grown every time I have used it. I can't 
see why spamd would have a problem with it.
Tom

Matt Kettler wrote:
At 03:40 PM 9/24/2004, Thomas Bolioli wrote:
bayes_path ~/.spammer

This statement is invalid if a directory named .spammer exists in 
the user's home..

Please read the docs on bayes_path VERY carefully. Despite being named 
path it's really path, plus filename prefix.

Thus bayes_path should be something like ~/.spammer/bayes
However, why override it at all? It defaults to ~/.spamassassin/bayes
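
Put differently, the value of bayes_path acts as a prefix to which suffixes are appended. The sketch below only illustrates that naming, using the _toks and _seen files mentioned in this thread; the function name bayes_files is made up, and other files may be created as well.

```bash
#!/bin/bash
# Illustrative only: prints the database file names that a given
# bayes_path value leads to, based on the _toks and _seen names
# mentioned in this thread.
bayes_files() {
    local prefix="$1"
    printf '%s_toks\n' "$prefix"
    printf '%s_seen\n' "$prefix"
}

# A bare prefix puts the files directly in the home directory.
bayes_files "$HOME/.spammer"
# A directory plus prefix keeps them together in one place.
bayes_files "$HOME/.spamassassin/bayes"
```

This also shows why a directory named ~/.spammer would not be used as a container: the files would sit next to it, not inside it.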



Re: Problem with Bayes and AutoLearning

2004-09-24 Thread Thomas Bolioli
I changed the path just in case. It was that way by mistake anyhow. 
Here is the output of lint. (It is exactly the same as with the other 
paths, so I am sure that is not the issue.) Note that bayes works there, 
although not when run through procmail. I think your idea about users is 
on to something.
My .forward file is
|IFS=' '  exec /usr/bin/procmail || exit 75 #webmaster
Quotes and all. Is that correct?
Tom

[EMAIL PROTECTED] webmaster]$ spamassassin -D --lint
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/sbin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/X11R6/bin', which doesn't exist, dropping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/usr/local/sbin', keeping.
debug: Final PATH set to: 
/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/bin:/usr/local/sbin
debug: ignore: using a test message to lint rules
debug: using /usr/share/spamassassin for default rules dir
debug: using /etc/mail/spamassassin for site rules dir
debug: using /home/webmaster/.spamassassin for user state dir
debug: using /home/webmaster/.spamassassin/user_prefs for user prefs file
debug: bayes: 28490 tie-ing to DB file R/O 
/home/webmaster/.spamassassin/bayes_toks
debug: bayes: 28490 tie-ing to DB file R/O 
/home/webmaster/.spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: Score set 3 chosen.
debug: Initialising learner
debug: running header regexp tests; score so far=0
debug: running body-text per-line regexp tests; score so far=2.077
debug: bayes corpus size: nspam = 47336, nham = 1028
debug: uri tests: Done uriRE
debug: tokenize: header tokens for *F = U*ignore 
D*compiling.spamassassin.taint.org D*spamassassin.taint.org D*taint.org 
D*org
debug: tokenize: header tokens for *m =  1096056335 lint_rules 
debug: bayes token 'TextCat' = 0.0489090909090909
debug: bayes token 'somewhat' = 0.095669124722507
debug: bayes token 'H*F:D*org' = 0.122005426957751
debug: bayes: score = 0.0118746978798883
debug: bayes: 28490 untie-ing
debug: bayes: 28490 untie-ing db_toks
debug: bayes: 28490 untie-ing db_seen
debug: Razor2 is not available
debug: running raw-body-text per-line regexp tests; score so far=2.077
debug: running uri tests; score so far=2.077
debug: uri tests: Done uriRE
debug: running full-text regexp tests; score so far=2.077
debug: Razor2 is not available
debug: Current PATH is: 
/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/bin:/usr/local/sbin
debug: Pyzor is not available: pyzor not found
debug: DCCifd is not available: no r/w dccifd socket found.
debug: DCC is not available: no executable dccproc found.
debug: all '*From' addrs: [EMAIL PROTECTED]
debug: all '*To' addrs:
debug: is Net::DNS::Resolver available? no
debug: is DNS available? 0
debug: running meta tests; score so far=2.077
debug: is spam? score=0.553 required=3 
tests=BAYES_01,DATE_MISSING,NO_REAL_NAME

Matt Kettler wrote:
At 04:10 PM 9/24/2004, Thomas Bolioli wrote:
I do not believe that is an issue. It only puts the bayes databases 
at ~/.spammer_toks and ~/.spammer_seen. sa-learn has not had a 
problem loading the databases; they have grown every time I have used 
it. I can't see why spamd would have a problem with it.

Fair enough. Like I said, it's a syntax error if a directory named 
~/.spammer/ exists. However, if it doesn't exist, it's fine.

Are you sure spamc is being invoked as the proper user, and not as root?
spamd will fall back to nobody if it finds itself still running as 
root after setuiding to the client user. You could try copying a set 
of files into the path of nobody's home-dir and see if bayes starts 
running.
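
One way to test the wrong-user theory is to check, from inside whatever context invokes spamc, whether the current user can actually reach the Bayes files. This is a rough sketch: check_bayes_access is a hypothetical name, and the _toks/_seen suffixes are the ones named earlier in this thread.

```bash
#!/bin/bash
# Hypothetical check: can the current user read and write the Bayes
# database files under a given bayes_path prefix?
check_bayes_access() {
    local prefix="$1" suffix file status=0
    for suffix in toks seen; do
        file="${prefix}_${suffix}"
        if [ -r "$file" ] && [ -w "$file" ]; then
            echo "ok: $file"
        else
            echo "no access: $file"
            status=1
        fi
    done
    return "$status"
}

# Report the effective user only when the files cannot be reached.
check_bayes_access "$HOME/.spamassassin/bayes" || echo "running as: $(id -un)"
```

Running it once from a login shell and once from the delivery path would show whether both see the same files under the same user.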