Re: Shades-of-greylisting: custom rule creation?

2009-01-08 Thread Don Levey
McDonald, Dan wrote:
> On Wed, 2009-01-07 at 09:43 -0500, Don Levey wrote:
>> [I've not yet found in the archives something which addresses this
>> question; if I missed it I apologise]
>>
>> I am trying to set up a system whereby I can create lists of
>> addresses/servers of varying degrees of spamminess.  For example, I
>> might have three lists:
> [...]
>> How would I go about creating such rules?  If this has been covered
>> before or elsewhere, a pointer would be welcome.  Thank you!
> 
> Amavisd-new has "soft black/white listing" that is very similar to what
> you want to do.
> 
> 

I'm looking into it now, thanks!
 -Don


Shades-of-greylisting: custom rule creation?

2009-01-07 Thread Don Levey
[I've not yet found in the archives something which addresses this
question; if I missed it I apologise]

I am trying to set up a system whereby I can create lists of
addresses/servers of varying degrees of spamminess.  For example, I
might have three lists:

From server constantcontact.com: +2
From server roving.com: +4
To user nos...@example.com: +10

I find myself in the situation in which most, but not all, of the mail
from certain entities is spam.  Simply blacklisting them would generate
false positives, and I need to avoid that.  However, absent other
considerations (like a low Bayes score) I want the message to be tagged
if not rejected outright (I'm using spamass-milter) based upon its
score.  Being able to set up categories such as above would help in this.

Likewise, there are certain users on my system that get mostly spam, but
occasionally legitimate messages.  For example, I hope my 8 year old son
doesn't need Viagra, but he does want to read his email from Grandma.
Increasing the score of messages destined to his mailbox, while not
tripping the spam flag in and of itself, would contribute to such a
designation when combined with other indicators.

How would I go about creating such rules?  If this has been covered
before or elsewhere, a pointer would be welcome.  Thank you!

 -Don
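
One possible way to approach weighted lists like these is to express each entry as an ordinary SpamAssassin header rule with a small positive score, so that a match nudges the total without deciding the outcome by itself. The sketch below only generates candidate local.cf lines. The rule names (LOCAL_*) and the address someuser@example.com are hypothetical placeholders, and matching on the From and ToCc headers is an assumption about what "from server" and "to user" should mean here.

```python
import re


def escape_for_rule(text):
    """Backslash-escape every character that is not a letter, digit, _ or -."""
    return re.sub(r"[^A-Za-z0-9_-]", lambda m: "\\" + m.group(0), text)


def make_rule(name, header, pattern, score, description):
    """Return the local.cf lines for one weighted header rule."""
    regex = escape_for_rule(pattern)
    return [
        f"header   {name} {header} =~ /{regex}/i",
        f"describe {name} {description}",
        f"score    {name} {score:.1f}",
    ]


# Hypothetical entries mirroring the three example lists in the post.
ENTRIES = [
    ("LOCAL_FROM_CC", "From", "constantcontact.com", 2, "Sender often spammy"),
    ("LOCAL_FROM_ROVING", "From", "roving.com", 4, "Sender usually spammy"),
    ("LOCAL_TO_SPAMMY_USER", "ToCc", "someuser@example.com", 10,
     "Recipient receives mostly spam"),
]

for entry in ENTRIES:
    print("\n".join(make_rule(*entry)))
    print()
```

The generated lines would still need to be checked with "spamassassin --lint" before use, and the scores tuned against real mail.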


RE: Help with Stupid Viagra/Calis spams

2006-04-18 Thread Don Levey
Let's put it this way - here are the rules your message hit on my system:

Carl Chipman wrote:
> 
> Content analysis details:   (8.2 points, 5.0 required)
>
>  3.5 SUBJECT_DRUG_GAP_VIA   
> -4.9 BAYES_00   
>  0.4 URIBL_AB_SURBL 
>  1.5 URIBL_WS_SURBL 
>  3.2 URIBL_OB_SURBL 
>  4.3 URIBL_SC_SURBL 
>  0.2 DRUGS_ERECTILE 

 -Don



RE: Delirium...

2006-04-03 Thread Don Levey
Philip Prindeville wrote:

> 
> What if we had a TXT Record in the DNS for a domain that looked like:
> 
> @IN TXT   "XYZZY 123 456  (C) Copyright 2006 Redfish
> Solutions, LLC"
> 
> And then had hosts participating in this scheme generate outgoing
> mail as: 
> 
> X-Yes-Its-Really-Me: XYZZY 123 456 (C) Copyright 2006 Redfish
> Solutions, LLC"
> 
> and uses the presence of this copywritten key to match the appropriate
> string
> in the DNS as proof that the sender is who he says he is.
> 
> -Philip

Reminds me of Habeas...
 -Don


RE: Autolearn: works from command-line, not via milter

2006-01-16 Thread Don Levey
Don Levey wrote:
> Don Levey wrote:
>> Jim Maul wrote:
>>>
>>> Failed means it didnt work for some reason.  No means it simply
>>> didnt even try to autolearn (score wasnt high enough, spam/ham
>>> threshold not reached, etc.)  In short, failed points to a
>>> potential problem, whereas no doesnt.
>>>
>>> -Jim
>>
>> That I understood; I'm mentioning the "no" because that means the
>> autolearn is functioning in at least some cases.  The only
>> differences I'm seeing between the failure and functioning cases are
>> the actual spam scores.  Those with "autolearn=no" seem to score at
>> least -3.7 or higher, while the "autolearn=failed" show up as -4.8 or
>> -4.9.
>>  -Don
>
> Whoops, spoke too soon.  I see an "autolearn=no" with -4.8, and
> "autolearn=failed" with -2.1, so it's not a problem with a certain
> score cutoff, or (from what I can tell) specific rules hits.  I'm
> checking logs again...
>
> Thanks!
>  -Don

I think I may have it, though it's a little too soon to tell.
The permissions on the Bayes DB files were just fine, and owned by the SA
ID.  The permissions on the directory housing those files were OK - but the
owner was not.  I feel a bit stupid for not having checked this when I
looked at the rest, but it seems to be working now.
 -Don
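
A check along these lines might have caught the problem sooner: it looks at the owner of the directory itself as well as the files inside it. This is only a sketch; the path and account name in the usage comment are hypothetical, and it checks ownership only, not mode bits.

```python
import os


def ownership_problems(directory, expected_uid):
    """Return the directory and any entries in it not owned by expected_uid."""
    paths = [directory]
    paths += [os.path.join(directory, name) for name in sorted(os.listdir(directory))]
    return [path for path in paths if os.stat(path).st_uid != expected_uid]


# Hypothetical usage, assuming the Bayes files live in /path/to/bayes-dir
# and spamd runs as the account "spamassassin":
#
#   import pwd
#   uid = pwd.getpwnam("spamassassin").pw_uid
#   for path in ownership_problems("/path/to/bayes-dir", uid):
#       print("wrong owner:", path)
```

Because the directory is the first path checked, a wrong directory owner shows up even when every file inside is fine, which is the case described above.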


RE: Autolearn: works from command-line, not via milter

2006-01-16 Thread Don Levey
Don Levey wrote:
> Jim Maul wrote:
>>
>> Failed means it didnt work for some reason.  No means it simply didnt
>> even try to autolearn (score wasnt high enough, spam/ham threshold
>> not reached, etc.)  In short, failed points to a potential problem,
>> whereas no doesnt.
>>
>> -Jim
>
> That I understood; I'm mentioning the "no" because that means the
> autolearn is functioning in at least some cases.  The only
> differences I'm seeing between the failure and functioning cases are
> the actual spam scores.  Those with "autolearn=no" seem to score at
> least -3.7 or higher, while the "autolearn=failed" show up as -4.8 or
> -4.9.
>  -Don

Whoops, spoke too soon.  I see an "autolearn=no" with -4.8, and
"autolearn=failed" with -2.1, so it's not a problem with a certain score
cutoff, or (from what I can tell) specific rules hits.  I'm checking logs
again...

Thanks!
 -Don



RE: Autolearn: works from command-line, not via milter

2006-01-16 Thread Don Levey
Jim Maul wrote:
> Don Levey wrote:
>> Don Levey wrote:
>>
>>> Messages coming in and autoscanned via spamass-milter/spamd all fail
>>> autolearn.  To pick one example from this list (full headers
>>> available if it will help):
>>>
>>>
>>> X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00
>>>   autolearn=failed version=3.0.4
>>> X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
>>>   davinci.the-leveys.us
>>> Status: O
>>> X-UID: 7374
>>> Content-Length: 570
>>> X-Keywords:
>>>
>> ...
>>
>> As a followup: I've noticed that at least *some* messages are coming
>> in with "autolearn=no".  I've not yet found the difference.
>>  -Don
>>
>>
>>
>
> Failed means it didnt work for some reason.  No means it simply didnt
> even try to autolearn (score wasnt high enough, spam/ham threshold not
> reached, etc.)  In short, failed points to a potential problem,
> whereas no doesnt.
>
> -Jim

That I understood; I'm mentioning the "no" because that means the autolearn
is functioning in at least some cases.  The only differences I'm seeing
between the failure and functioning cases are the actual spam scores.  Those
with "autolearn=no" seem to score at least -3.7 or higher, while the
"autolearn=failed" show up as -4.8 or -4.9.
 -Don
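
To compare many messages rather than eyeballing a few, the autolearn value can be pulled out of each X-Spam-Status header and tallied. A minimal sketch, assuming the header value has already been extracted from the message (folded continuation lines are fine); the function names are made up for illustration.

```python
import re
from collections import Counter


def autolearn_status(header_value):
    """Return the autolearn= token from an X-Spam-Status value, or None."""
    unfolded = " ".join(header_value.split())  # undo header folding
    match = re.search(r"\bautolearn=(\w+)", unfolded)
    return match.group(1) if match else None


def tally(header_values):
    """Count how often each autolearn outcome appears."""
    return Counter(autolearn_status(value) for value in header_values)


sample = [
    "No, score=-4.9 required=5.0 tests=BAYES_00\n  autolearn=failed version=3.0.4",
    "No, score=-4.9 required=5.0 tests=BAYES_00 autolearn=ham version=3.0.4",
]
print(tally(sample))
```

Following Jim's distinction, only the "failed" count points at a potential problem; "no" just means the message fell between the thresholds.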


RE: Autolearn: works from command-line, not via milter

2006-01-16 Thread Don Levey
Don Levey wrote:

>
> Messages coming in and autoscanned via spamass-milter/spamd all fail
> autolearn.  To pick one example from this list (full headers
> available if it will help):
>
>
> X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00
>   autolearn=failed version=3.0.4
> X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
>   davinci.the-leveys.us
> Status: O
> X-UID: 7374
> Content-Length: 570
> X-Keywords:
>
...

As a followup: I've noticed that at least *some* messages are coming in with
"autolearn=no".  I've not yet found the difference.
 -Don



Autolearn: works from command-line, not via milter

2006-01-16 Thread Don Levey
I just moved/upgraded my home server this weekend, leading to less hair on
my head and more ulcers elsewhere. While I've worked my way through many
problems in the past few days, this one seems to be eluding me and googling,
archives, etc haven't yet helped me.  Here's the scoop:

Messages coming in and autoscanned via spamass-milter/spamd all fail
autolearn.  To pick one example from this list (full headers available if it
will help):


X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00 autolearn=failed
version=3.0.4
X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
davinci.the-leveys.us
Status: O
X-UID: 7374
Content-Length: 570
X-Keywords:


Running the same message through "spamassassin -D --mbox < msgfile" results
in the following:


X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on
davinci.the-leveys.us
X-Spam-Level:
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
version=3.0.4


So clearly two different things are happening here.  The full text of the
above didn't seem to have any errors, nor do I see any when running
"spamassassin --lint -D".  I am getting one interesting line in my maillog
file:


Jan 16 10:49:28 davinci sendmail[29241]: k0GFnSqe029241: SYSERR(sa-milt):
hash map "Alias0": unsafe map file /etc/aliases.db: Permission denied


and this seems to lead to:

Jan 16 10:49:29 davinci spamd[27845]: handle_user: unable to find user
'user'!
Jan 16 10:49:29 davinci spamd[27845]: processing message  for user:515.


I've deleted and recreated my alias database (/etc/aliases ->
/etc/aliases.db) but permissions are the same.  Interestingly, 'user' should
be an alias to an account which is uid 515, and in group gid 515, but I'm
not seeing the match here.

Any suggestions on what to check next?

Thanks in advance,
 -Don


RE: Spam, Block: Good to know my representative is spamming..

2005-11-09 Thread Don Levey
List Mail User wrote:
>> ...
>
>
> i.e. the "Reply-To:" line is valid - [EMAIL PROTECTED]
>
>   The Message-ID and leaking the RFC1918 IP address are just bad IT
> management, you can't blame her for that; But for political "spam"
> (assuming it wasn't personalized or signed up for), you could report
> it to the FTC and "Cc" her office.  And then there is sending it via
> a commercial service whose contact telephone number is 999 999 
> (i.e. listrak.com) is probably not very wise for a goverment
> representative (Their *real* telephone number would seem to be 717
> 627-6080).
>
>   Paul Shupak
>

Didn't Congress exempt itself from the I-CAN-SPAM laws anyway?  What would
reporting to the FTC do?  More effective would be to report to the local
paper(s), with a copy to her inbox.
 -Don



RE: [OTAnn] Feedback

2005-11-08 Thread Don Levey
Duncan Hill wrote:
> On Tuesday 08 Nov 2005 16:38, shenanigans wrote:
>> I was interested in getting feedback from current mail group users.
>> 
>> We have mirrored your mail list in a new application that provides a
>> more aggregated and safe environment which utilizes the power of
>> broadband. 
>> 
>> Roomity.com v 1.5 is a web 2.01 community webapp. Our newest version
>> adds broadcast video and social networking such as favorite authors
>> and an html editor.
> 
> This mail has hit several lists I'm on.  The full-disclosure list had
> a bit of a field day with the concept of a java app required to see
> the content - considering the security problems that might imply.
> 
> And I have to wonder what the 'power of broadband' has to do with
> mailing lists...

The better to serve ads with, my dear:
"*Your* clubs, no sign up to read, ad supported; try broadband internet. 
~~1131467917258~~"

Guaranteeing a "safe environment", presumably free from spam, is ironic when 
posted to an anti-spam list.  Any reason why this is *not* spam?
 -Don


RE: I am NOT a spammer

2005-07-14 Thread Don Levey
aecioneto wrote:
>> I think a few people have already mentioned it, but your IP is
>> listed > in SORBS:
>>
>> Remote host said: 550 Dynamic IP Addresses See:
>> http://www.sorbs.net/lookup.shtml?200.207.18.245
>>
>> Best bet is to get it removed.
>
> Great!
> I will start making money this way:
> 1. I start listing a whole bunch of IPs as dynamic.
> 2. When they complain about it, they have to pay me to clean them up.
>
That would be interesting.  According to SORBS, there is no charge for
removal:
http://www.us.sorbs.net/overview.shtml
"Note: Use of this service is currently free of charge. References to the
SORBS fine refers ONLY to the database of received spam. There is no charge
for removal from the proxy, vulnerablility, relay, zombie or DUHL
databases."

You're in the DUHL database.


>
> That's why I am aguing about such a great list setting up to use such
> a stupid blocking listing services.
>
The problem, as many others have told you, is the ISP.  The SORBS owners
have you down in *two* levels of block - apparently because dynamic IPs are
allocated in that area by both your ISP and their upstream.  SORBS (rightly,
IMO) decided that the cost of NOT listing that netblock is much higher.  My
personal experience, after receiving many thousands of spam attempts from
Brazilian (dynamic) IPs, and after sending hundreds of documented
complaints, is that the ISPs are uninterested in fixing their spam problem.


> So, here is my point: take my ip 200.207.18.245.
> I want someone to *prove* that it is a dsl static IP.
> No way a bad SORBS entry is enough.
>
> Regards.

The entity that needs to *prove* that you've got a static IP is the *owner*
of that IP - the ISP.  That is the entity that SORBS (or other reputable
blocking lists) will talk to about the problem.  Unfortunately, they seem to
be uninterested in doing anything above the bare minimum needed to maintain
IP connectivity.

Railing against the list, or the list owners, won't do anything except piss
people off.  The reason why the list (and many others) uses SORBS is because
*it works*.  It has been shown, over a period of years, to be reliable.
Demanding that the list stop using SORBS would not be an efficient use of
your time.  Demanding that your ISP act like a responsible entity would be
more useful.  In my opinion, here are the things they should do:

1) Segregate dynamic IPs into one netblock, static IPs into another.
2) Publish/make SORBS aware of those blocks - both static and dynamic.
3) Make sure that the truly dynamic block does not permit outbound port 25
access beyond their network.
4) As a followup to #3, this would require all dynamic IPs who want to run
their own mail server to smarthost their outgoing mail through the ISP, who
can then throttle based upon load, spamminess, etc.
5) Respond quickly, and assertively, to spam complaints.

Any guesses on how many they'll end up doing?  Don't all answer at once...
 -Don
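
For anyone wanting to check a listing themselves: DNS blocklists are conventionally queried by reversing the IPv4 octets and appending the list's zone. The sketch below builds that query name. The zone "dul.dnsbl.sorbs.net" is my recollection of the SORBS dynamic-range zone and should be treated as an assumption, as should the helper names.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the query name resolves (needs working DNS)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


print(dnsbl_query_name("200.207.18.245", "dul.dnsbl.sorbs.net"))
```

A name that resolves (conventionally to an address in 127.0.0.0/8) means the IP is listed; a lookup failure means it is not, or that the zone is unreachable, so a negative result is weaker evidence than a positive one.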


RE: it's getting worse again

2005-04-06 Thread Don Levey
Niek wrote:
> On 4/6/2005 8:29 PM +0100, Florin Andrei wrote:
>> I guess something has to change. "Then change it yourself" type of
>> advices will go straight to /dev/null, thank you, because as far as
>> SA is concerned, i'm just a user. I am merely pointing out the
>> problem.
>
> Users should complain at their systems administrators.
>
> Niek

Someone can be a sysadmin, and not be a programmer.  While the skill sets
overlap, they're not necessarily one and the same.  Perhaps he meant user as
in consumer?
 -Don


RE: it's getting worse again

2005-04-06 Thread Don Levey
Florin Andrei wrote:
> I'm using SA since... well, a long time ago, and one thing that i
> noticed was a pattern in the way its efficiency varies: it's pretty
> good soon after a new release, then it gets continuously worse; then
> a new release and all of a sudden it's good again, then it starts
> "decaying" again...
>
> Well, it's been a while since the last release, and it's already
> noticeably worse. I know this has been discussed before, i am aware of
> the VirusScannerTypeUpdates FAQ entry, but you know what, from an end-
> user's point of view, it does not matter. All that matters is that,
> despite brilliant technical discussions, the efficiency is going down
> and, if a new version is not released soon enough, the users start to
> complain. This is what's happening right now.
>
> I guess something has to change. "Then change it yourself" type of
> advices will go straight to /dev/null, thank you, because as far as SA
> is concerned, i'm just a user. I am merely pointing out the problem.

How do you mean "getting worse"?
Are you saying that it's suddenly letting through messages that it would
have stopped previously?  Or that the spammers and their obfuscation
techniques are changing and now getting around the rulesets you're using?  I
understand that you're not in a position to make the code changes yourself,
but those that are need the details of your problem in order to be able to
fix it - or even diagnose it.

Perhaps you simply need new rules, or to update the rulesets you're already
using?  I'm not a coder either, but I may start down that road anyway.  Then
again, now that I've finally upgraded to 3.0.2, I'm finding/tagging over 99%
of the spam I'm getting, and blocking outright almost 50% (with a block
threshold of 18, no less).

 -Don


RE: Update on Autolearn, SA/SA-milter ID problem, etc

2005-04-05 Thread Don Levey
Craig McLean wrote:
> Don, some thoughts inline..
>
> Don Levey wrote:

>>
>> From what I see now, this is because if root is running it then the
>> user shifts to 'nobody'. This is damn inconvenient. So, I've tried to
>> shift to using user 'spamassassin' by using the "-u spamassassin"
>> switch on both spamd and spamass-milter. When I do this, though, I
>> can't actually read the user_prefs file for user root. But why am I
>> even trying to open it for root, when spamassassin is the UID?
>
> Why not combine the user_prefs and the local.cf, and move the
> whitelist somewhere where 'spamassassin' user can read/write to it?
>
The latest in my quest to get SA to work properly...

I've made sure that the whitelist and Bayes DB can be written to and be read
by 'spamassassin'.  I've set the '-u spamassassin' flag for both the
/etc/sysconfig/spamassassin and /etc/sysconfig/spamass-milter startup files.
I've restarted spamd, spamass-milter, and sendmail.

My ps list shows that 'spamassassin' is running spamd, and 'root' is running
spamass-milter.  In my maillog file, I am getting errors:
* for 'named' accounts, spamd can't find the user_prefs file
* for 'aliased' accounts, spamd can't find the username.

I know that I can solve the latter by putting the '-x' flag on the
spamass-milter startup line.  Do I need to worry about the former?  That is,
am I causing any problems by running this way, or am I simply now set up so
that I can run user-specific rules in addition to the site-wide ones?

 -Don

p.s. In case it wasn't clear, SpamAssassin really rocks!  I don't want my
current frustration to get in the way of the appreciation and adulation the
developers deserve.


RE: Update on Autolearn, SA/SA-milter ID problem, etc

2005-04-04 Thread Don Levey
Don Levey wrote:
> Craig McLean wrote:
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
>>
>> Don Levey wrote:
>>
>> [snip]
>> [Spamassassin rejecting mail above a certain score]
>>> Not only that, but it seems to be happening now!  I vaguely remember
>>> seeing which config file would control this, but re-Googling for it
>>> doesn't turn anything up now.  Damn this memory!
>>
>> AHA! It came to me, it's the spamass-milter. There is a startup
>> option (-r n) where n is the score to reject at.
>> Also, check that it's not running with -m/-M, that would screw thing
>> up. In fact, it's probably worth checking the whole milter config
>> against the man page.
>>
>> Cheers!
>> Craig.
>
> Craig,
> I think that's it!  This was being set in the /etc/init.d startup
> script, -r 15.
> Also, -m was set, which (according to the man page) would disable
> subject/body rewriting.
> Of all the things I played around with, THAT one was from the stock
> files.
>
> Thanks for all your help;  I'll make sure that this works correctly
> (awaiting the next spam message now), take a snapshot, and then start
> playing around with the dedicated user ID.
>
>  -Don

That was indeed the problem.  I was just able to check, and my spam is now
being tagged.
Thanks again!
 -Don



RE: Update on Autolearn, SA/SA-milter ID problem, etc

2005-04-04 Thread Don Levey
Craig McLean wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Don Levey wrote:
>
> [snip]
> [Spamassassin rejecting mail above a certain score]
>> Not only that, but it seems to be happening now!  I vaguely remember
>> seeing which config file would control this, but re-Googling for it
>> doesn't turn anything up now.  Damn this memory!
>
> AHA! It came to me, it's the spamass-milter. There is a startup option
> (-r n) where n is the score to reject at.
> Also, check that it's not running with -m/-M, that would screw thing
> up. In fact, it's probably worth checking the whole milter config
> against the man page.
>
> Cheers!
> Craig.

Craig,
I think that's it!  This was being set in the /etc/init.d startup script,
-r 15.
Also, -m was set, which (according to the man page) would disable
subject/body rewriting.
Of all the things I played around with, THAT one was from the stock files.

Thanks for all your help;  I'll make sure that this works correctly
(awaiting the next spam message now), take a snapshot, and then start
playing around with the dedicated user ID.

 -Don
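
Since the surprise here came from flags buried in a stock startup script, a quick way to see what the milter is really running with is to pick apart its command line as shown by ps. A rough sketch follows; the set of flags assumed to take a value is deliberately small and is my assumption, not a complete reading of the man page.

```python
import shlex

# Assumption: only these flags take a value; everything else is a switch.
FLAGS_WITH_VALUE = {"-p", "-r", "-u"}


def parse_milter_flags(command_line):
    """Map each flag on a spamass-milter command line to its value (or True)."""
    tokens = shlex.split(command_line)[1:]  # drop the program name
    flags = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in FLAGS_WITH_VALUE and i + 1 < len(tokens):
            flags[token] = tokens[i + 1]
            i += 2
        else:
            flags[token] = True
            i += 1
    return flags


def notes(flags):
    """Point out the two settings discussed in this thread."""
    found = []
    if "-r" in flags:
        found.append(f"rejects at score {flags['-r']}")
    if "-m" in flags:
        found.append("-m set: subject/body rewriting disabled")
    return found


print(notes(parse_milter_flags(
    "spamass-milter -p /var/run/spamass.sock -f -m -r 15")))
```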


RE: Update on Autolearn, SA/SA-milter ID problem, etc

2005-04-04 Thread Don Levey
Craig McLean wrote:

>>
>>  * The spamd/spamass-milter processes should not run as root (user
>> 'spamassassin').
>
> I gather from your previous mail that you already run this as
> "spamassassin". Make sure it owns the bayes files defined by
> bayes_path. I created a subdirectory owned by the user and let SA get
> on with it.
>
I had tried running as 'spamassassin', but ran into difficulties.  In
particular, it kept giving errors that it couldn't open
/root/.spamassassin/user_prefs for writing, even when I made the file and
the directory wide-open (777).  Since I seem to recall seeing somewhere that
I should make changes to the user_prefs and not the local.cf (as that might
be updated and overwritten with upgrades), I had been using the user_prefs
instead.  I even went to the point of setting up a wide-open user_prefs file
in a wide open directory, and linking to that for all users, but that didn't
help (it still looked only for the one in the root home dir)

>>  * I want a single set of user preferences/bayes DB.
>>    While additional user preferences could in theory be OK,
>>    I want only one Bayes DB.
>
> OK, the prefs in /etc/mail/spamassassin/*.cf and the bayes BD in
> bayes_path then.
>
I think I'm there now; when I tried to use the -u flag on the startup
command for spamassassin and spamass-milter, I got checks to each individual
user.

>>  * As the above may mention, I want to use the Bayes DB for learning
>> and auto-learning.
>
> Should work fine as long as the user running spamd owns the
> directory/files used by bayes.
>
So far this seems to be working.

>>  * I want tagged spam to rewrite the subject.
>>  * I want to attach the original message to the report.
>
> looks like that's set up fine, judging by your local.cf
>
I'm getting header tags, but I'm not getting message rewriting/attachment,
or a subject rewrite.

>>  * I want to use RBLs for things not covered otherwise in sendmail
>>    (i.e. for URLs in the messages)
>
> Make sure you have the perl Net::DNS stuff installed. Check with
> 'spamassassin -D --lint, look for:
> debug: is Net::DNS::Resolver available? yes
>
I *think* this is set up correctly; I'm not currently getting any errors
that I can see.  That line is indeed present.

>
>>  * Eventually, I may drop egregious spam examples,
>>    but I'm not sure I want to do that yet.
>
> Well, it can be done if you choose to.
>
Not only that, but it seems to be happening now!  I vaguely remember seeing
which config file would control this, but re-Googling for it doesn't turn
anything up now.  Damn this memory!

>> What seems to happen is that I can get some subset of these things,
>> but not all at once. Additionally, while I often think I've got
>> things working correctly, they appear to change randomly from working
>> to non-working.
>
> Can you be more specific? What's not working? Any error messages in
> messages/maillog/&c.
>
At this particular moment, the big problem is the subject/message rewriting.
But then I'm still running as root (or, apparently, 'nobody') and I'm not
sure this is the best thing to do.

>> The last point, on dropping spam, seems to be happening anyway. From
>> what I can tell, anything with a score greater than 15 is being
>> rejected automatically. This is seriously reducing my spam load.
>
> That may well be a function of how SA/sendmail are configured on
> Fedora?
>
It could be - but that wasn't happening as of Friday.  I was seeing scores
into the 20s come through - but tagged/rewritten.

>> As I mentioned last week, I was getting "autolearn=failed" when
>> BAYES_00 was the only rule that hit. If I got ANY other rule that
>> also hit, autolearn did not fail. At least part of the problem there
>> had to do with creating the lock file for the Bayes DB; Even though I
>> thought I was running as root, and root owned the directory in
>> question (/etc/mail/spamassassin) I needed to open the permissions in
>> order for things to work correctly.
>
> I'd imagine that spamd runs as root only for long enough to create the
> priv'd socket it needs, and then drops privs. I have everything in
> /var/bayesdb/bayes_* and /var/bayesdb is 755 owned by 'spam' user
> (which runs the milter/spamd). /etc/mail/spamassassin is 755 owned by
> root. No problems..
>
I've tried to move things off to a new directory /SA-shared.  The Bayes DB
is there now.  but I'm still back to running as root, to avoid the
user_prefs errors mentioned above.

>>
>> From what I see now, this is because if root is running it then the
>> user shifts to 'nobody'. This is damn inconvenient. So, I've tried to
>> shift to using user 'spamassassin' by using the "-u spamassassin"
>> switch on both spamd and spamass-milter. When I do this, though, I
>> can't actually read the user_prefs file for user root. But why am I
>> even trying to open it for root, when spamassassin is the UID?
>
> Why not combine the user_prefs and

Update on Autolearn, SA/SA-milter ID problem, etc

2005-04-04 Thread Don Levey
If the definition of insanity is doing the same thing multiple times and
expecting a different result, what is it when you're doing the same thing
multiple times, expecting the same result, and you get DIFFERENT results?
That's what seems to be happening to me.  I've been trying to search the
archives for any helpful information, but I'm having difficulty in
extracting anything that might be of use.

For some reason, I can't seem to get all the features on SpamAssassin to
work at the same time. Allow me to elaborate - here's what I've got:

* OS: Fedora Core 2
* MTA: Sendmail 8.12.11-4.6
* Spamassassin: 3.0.2-1.1.fc2.rf
* Spamass-milter: 0.3.0-1.1.fc2.rf

Here's what I want to do with them:

* The spamd/spamass-milter processes should not run as root (user
'spamassassin').
* I want a single set of user preferences/bayes DB.
  While additional user preferences could in theory be OK,
  I want only one Bayes DB.
* As the above may mention, I want to use the Bayes DB for learning
  and auto-learning.
* I want tagged spam to rewrite the subject.
* I want to attach the original message to the report.
* I want to use RBLs for things not covered otherwise in sendmail
  (i.e. for URLs in the messages)
* I want to use Razor/Pyzor
* Eventually, I may drop egregious spam examples,
  but I'm not sure I want to do that yet.

What seems to happen is that I can get some subset of these things, but not
all at once. Additionally, while I often think I've got things working
correctly, they appear to change randomly from working to non-working. The
last point, on dropping spam, seems to be happening anyway. From what I can
tell, anything with a score greater than 15 is being rejected automatically.
This is seriously reducing my spam load.

As I mentioned last week, I was getting "autolearn=failed" when BAYES_00 was
the only rule that hit. If I got ANY other rule that also hit, autolearn did
not fail. At least part of the problem there had to do with creating the
lock file for the Bayes DB; Even though I thought I was running as root, and
root owned the directory in question (/etc/mail/spamassassin) I needed to
open the permissions in order for things to work correctly.

From what I see now, this is because if root is running it then the user
shifts to 'nobody'. This is damn inconvenient. So, I've tried to shift to
using user 'spamassassin' by using the "-u spamassassin" switch on both
spamd and spamass-milter. When I do this, though, I can't actually read the
user_prefs file for user root. But why am I even trying to open it for root,
when spamassassin is the UID?

The biggest problem right now is that for some reason message rewriting has
stopped for spam messages.  The header is tagged correctly, but the message
is never rewritten.  From my local.cf file (below), it looks like this
should be happening.  I don't know of any change I made which could account
for this, and indeed this seemed to happen overnight, when I didn't do
anything.

[local.cf]
required_score          5
rewrite_header Subject  *** SPAM: _SCORE_ points ***
#subject_tag            [SPAM?]
report_safe             1
#use_terse_report       0
use_bayes               1
#bayes_path             /etc/mail/spamassassin/bayes_db
bayes_path              /SA-shared/bayes_db
bayes_file_mode         0666
bayes_auto_learn        1
skip_rbl_checks         0
use_razor2              1
use_dcc                 1
use_pyzor               1
trusted_networks        192.168/16 127/8
ok_languages            en he ru yi
ok_locales              en ru

[user_prefs for root]
# How many hits before a mail is considered spam.
required_score          5

# Whitelist and blacklist addresses are now file-glob-style patterns, so
# "[EMAIL PROTECTED]", "[EMAIL PROTECTED]", or "*.domain.net" will all work.
# whitelist_from        [EMAIL PROTECTED]
auto_whitelist_path     /etc/mail/spamassassin/auto-whitelist
auto_whitelist_file_mode 0666


bayes_auto_learn        1


score BIZ_TLD               4.5
score RCVD_IN_SORBS_DUL     0.1
score RCVD_IN_SORBS_WEB     0.5
score SUBJECT_DRUG_GAP_C    3.5
score SUBJECT_DRUG_GAP_L    3.5
score SUBJECT_DRUG_GAP_VIA  3.5
score VIA_GAP_GRA           3.5
score FORGED_YAHOO_RCVD     1.5
score GAPPY_SUBJECT         2.5
score HTML_IMAGE_ONLY_04    3.5
score   BAYES_00 0 0 -4.901 -4.900
score   BAYES_05 0 0 -0.925 -2.599
score   BAYES_20 0 0 -0.730 -1.951
score   BAYES_40 0 0 -0.276 -1.096
score   BAYES_50 0 0  1.567  0.001
score   BAYES_60 0 0  3.515  1.592
score   BAYES_80 0 0  3.608  2.087
score   BAYES_95 0 0  3.514  3.514
score   BAYES_99 0 0  4.070  5.400
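
For readers puzzled by the four numbers on each BAYES line: as I understand the SpamAssassin configuration docs, they are per-scoreset values, in the order (no Bayes, no network tests), (no Bayes, network tests), (Bayes, no network tests), (Bayes and network tests), which is why the first two are 0 for Bayes rules. The sketch below picks the value that would apply for a given setup; the function names are made up for illustration and the ordering should be confirmed against the documentation for the installed version.

```python
def parse_score_line(line):
    """Split a 'score NAME v [v v v]' line into (name, [four floats])."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "score":
        raise ValueError(f"not a score line: {line!r}")
    values = [float(value) for value in parts[2:]]
    if len(values) == 1:
        values = values * 4  # a single value applies to every scoreset
    if len(values) != 4:
        raise ValueError(f"expected 1 or 4 values: {line!r}")
    return parts[1], values


def active_score(values, use_bayes, use_net):
    """Pick the value for the scoreset implied by the Bayes/network settings."""
    index = (2 if use_bayes else 0) + (1 if use_net else 0)
    return values[index]


name, values = parse_score_line("score   BAYES_00 0 0 -4.901 -4.900")
print(name, active_score(values, use_bayes=True, use_net=True))
```

With both Bayes and network tests enabled, as in the local.cf above, the last column is the one in effect.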

I don't want to clog up the bandwidth with too many files in-line that may
not be of use, so I've got:

http://www.eruditer.org:6080/spamassassin/local.cf
http://www.eruditer.org:6080/spamassassin/root-user_prefs
http://www.eruditer.o

[Possibly OT]: User ID running spamass-milter?

2005-04-01 Thread Don Levey
I'm *almost* done setting this up (again), and am trying to sort out the
user ID problem.  I'd previously been running everything as root, and it all
worked.  I'm uncomfortable doing this, though, and so am trying to run
everything as user spamassassin.  I've got spamd running properly, I think,
but for some reason spamass-milter is looking in
~root/.spamassassin/user_prefs instead of the file in its own home
directory:

Apr  1 12:27:13 davinci spamd[22997]: Creating default_prefs
[/root/.spamassassin/user_prefs]
Apr  1 12:27:13 davinci spamd[22997]: Cannot write to
/root/.spamassassin/user_prefs: Permission denied
Apr  1 12:27:13 davinci spamd[22997]: Couldn't create readable default_prefs
for [/root/.spamassassin/user_prefs]

and yet in the /etc/sysconfig/spamass-milter file I've got:

SM_EXTRA_FLAGS="-u spamassassin"

I figure there must be another configuration file somewhere, as I see the
following in ps:

root     23469     1  0 11:47 ?        00:00:00 spamass-milter -p
/var/run/spamass.sock -f -m -r 15

My invocation from sendmail.cf is:
Xclmilter, S=local:/var/run/clamav/clamav-milter.sock, F=, T=C:1m;S:4m;R:4m

(in sendmail.mc it is:
INPUT_MAIL_FILTER(`clmilter', `S=local:/var/run/clamav/clamav-milter.sock,
F=, T=C:1m;S:4m;R:4m')dnl)


I was under the impression that the -u flag in SM_EXTRA_FLAGS would be
picked up, but apparently not.  Where have I gone wrong?

Thanks again,
 -Don


RE: Autolearn=failed when BAYES_00 is only rule hit

2005-04-01 Thread Don Levey
Don Levey wrote:
> Please forgive me if this is in the archives; I'm having trouble
> finding it.
>
> I've just finished training my Bayes DB using sa-learn (perversely,
> when I was trying to collect 200 spam messages, the spammers decided
> to stop sending to me).  Now that the DB is usable, it's interesting
> that while most ham messages produce at least one small rule hit and
> a negative Bayes score that results in "Autolearn=no", when BAYES_00
> is the ONLY rule that hits I get "Autolearn=failed".
>
> Two quick questions:
> 1) What should I do about this, and
> 2) Should I worry, or just ignore it?
>
> TIA,
>  -Don

I may have found at least part of the problem, at least as far as the
"autolearn=no" portion of the question.  Running a message through
"spamassassin -D --mbox < msgfile" gives me the following last few lines:

debug: running body-text per-line regexp tests; score so far=8.886
debug: running uri tests; score so far=8.886
debug: running raw-body-text per-line regexp tests; score so far=8.886
debug: running full-text regexp tests; score so far=8.886
debug: auto-learn: currently using scoreset 3, recomputing score based on
scoreset 1.
debug: auto-learn: message score: 8.886, computed score for autolearn: 7.223
debug: auto-learn? ham=0.1, spam=12, body-points=3.1, head-points=3.64,
learned-points=-1.096
debug: auto-learn? no: inside auto-learn thresholds, not considered ham or
spam
debug: is spam? score=8.886 required=5
debug:
tests=BAYES_40,DATE_IN_FUTURE_03_06,FORGED_YAHOO_RCVD,MIME_HEADER_CTYPE_ONLY
,NO_OBLIGATION,SUBJ_LIFE_INSURANCE,URIBL_OB_SURBL,URIBL_WS_SURBL
debug:
subtests=__BAT_BOUNDARY,__CT,__CTYPE_HAS_BOUNDARY,__HAS_MSGID,__HAS_SUBJECT,
__MSGID_OK_DIGITS,__MSGID_OK_HEX,__MSGID_OK_HOST,__RCVD_IN_NJABL,__RCVD_IN_SORBS,
__RFC_IGNORANT_ENVFROM,__SANE_MSGID


So somewhere I've got set that in order to autolearn as spam, I must have a
score of 12, and to learn as ham the score must be less than 0.1.  This
particular message scored 11.9.
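
To make the threshold comparison in the debug output concrete, here is a
minimal Python sketch.  It assumes only what the log lines show: the value
compared is the "computed score for autolearn" (not the overall message
score), with ham=0.1 and spam=12 as the two thresholds.  The function name
autolearn_decision is hypothetical, and the real logic may apply further
conditions that this sketch leaves out.

```python
def autolearn_decision(autolearn_score, ham_threshold=0.1, spam_threshold=12.0):
    """Classify a recomputed autolearn score against the two thresholds.

    Returns "ham", "spam", or "no" (inside the thresholds, nothing learned).
    """
    if autolearn_score < ham_threshold:
        return "ham"
    if autolearn_score > spam_threshold:
        return "spam"
    return "no"


if __name__ == "__main__":
    # Hypothetical usage: 7.223 is the recomputed score from the debug run above.
    print(autolearn_decision(7.223))
```

A recomputed score of 7.223 sits between the two thresholds, which matches
the "inside auto-learn thresholds" line in the debug output.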

The next step was to try a message that had a score greater than 12.  I saw
that on the example I chose, I also got "autolearn=failed" in the header.
Running the same debug command line, I got:

debug: running body-text per-line regexp tests; score so far=15.837
debug: running uri tests; score so far=15.837
debug: running raw-body-text per-line regexp tests; score so far=15.837
debug: running full-text regexp tests; score so far=15.837
debug: auto-learn: currently using scoreset 3, recomputing score based on
scoreset 1.
debug: auto-learn: message score: 15.837, computed score for autolearn:
13.387
debug: auto-learn? ham=0.1, spam=12, body-points=11.404, head-points=5.843,
learned-points=0.001
debug: auto-learn? yes, spam (13.387 > 12)
debug: Learning Spam

debug: bayes: 20664 untie-ing
debug: bayes: 20664 untie-ing db_toks
debug: bayes: 20664 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 20664 unlink /etc/mail/spamassassin/bayes_db.lock
debug: is spam? score=15.837 required=5
debug:
tests=BAYES_50,FORGED_YAHOO_RCVD,MIME_HEADER_CTYPE_ONLY,RCVD_IN_BL_SPAMCOP_N
ET,RCVD_IN_XBL,URIBL_OB_SURBL,URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL
debug:
subtests=__BAT_BOUNDARY,__CT,__CTYPE_HAS_BOUNDARY,__HAS_MSGID,__HAS_SUBJECT,
__MSGID_OK_HOST,__RCVD_IN_SBL_XBL,__RFC_IGNORANT_ENVFROM,__SANE_MSGID

As should be clear here, it says that the message WAS autolearned.  And I
see that in the message headers generated from this run, I did get
"autolearn=spam".  I am doing this as the same user as is running spamd
(platform is Fedora, where the spamassassin "service" run is spamd).

I had been hoping to get debug messages from the above, but everything was
fine.  Checking in my maillog, however, hit a bit of paydirt:

Apr  1 09:40:01 davinci spamd[9864]: connection from davinci.example.com
[127.0.0.1] at port 41609
Apr  1 09:40:01 davinci spamd[9864]: info: setuid to root succeeded
Apr  1 09:40:01 davinci spamd[9864]: Still running as root: user not
specified with -u, not found, or set to root.  Fall back to nobody.
Apr  1 09:40:01 davinci spamd[9864]: processing message
<[EMAIL PROTECTED]> for root:99.
Apr  1 09:40:01 davinci spamd[9864]: bayes expire_old_tokens: lock: 9864
cannot create tmp lockfile
/etc/mail/spamassassin/bayes_db.lock.davinci.example.com.9864 for
/etc/mail/spamassassin/bayes_db.lock: Permission denied
Apr  1 09:40:01 davinci spamd[9864]: cannot write to
/etc/mail/spamassassin/bayes_db_journal, Bayes db update ignored: Permission
denied
Apr  1 09:40:07 davinci spamd[9864]: clean message (-4.9/5.0) for root:99 in
6.1 seconds, 3079 bytes.
Apr  1 09:40:07 davinci spamd[9864]: result: . -4 - BAYES_00
scantime=6.1,size=3079,mid=<[EMAIL PROTECTED]>,
bayes=0,autolearn=failed


Note that I am getting a permissions error creating the lock file.  This
seems to be because the permi

Autolearn=failed when BAYES_00 is only rule hit

2005-03-31 Thread Don Levey
Please forgive me if this is in the archives; I'm having trouble finding it.

I've just finished training my Bayes DB using sa-learn (perversely, when I
was trying to collect 200 spam messages, the spammers decided to stop
sending to me).  Now that the DB is usable, it's interesting that while most
ham messages produce at least one small rule hit and a negative Bayes score
that results in "Autolearn=no", when BAYES_00 is the ONLY rule that hits I
get "Autolearn=failed".

Two quick questions:
1) What should I do about this, and
2) Should I worry, or just ignore it?

TIA,
 -Don


RE: SPEWS still sucks

2005-01-27 Thread Don Levey
Daryl C. W. O'Shea wrote:
> Don Levey wrote:
>> An informal check does show that the IPs are indeed listed.  As many
>> of them should be - there are many people using cable modems and DSL
>> who are listed in dynablocks because they are supposed to be using
>> their ISP's mail server. But in a situation where they do that, if
>> the ISP records the originating IP the message still gets flagged.
>> 
>> This is not strictly a list-based problem, either.  If a listed IP
>> appears *anywhere* in the header, it seems to still get flagged. 
>> But short of forbidding anyone in a dynablock from ever sending
>> email to me, I'm trying to find another answer.  Simply not using
>>  the lists (SORBS, Spamcop, etc) seems... a bit much to me. -Don
> 
> You've got a broken trust path.  SpamAssassin, for valid reason, can
> not automatically configure the trust path when the SpamAssassin
> machine is NATed.
> 
> Add the appropriate  trusted_networks  lines to your local.cf.  See
> man Mail::SpamAssassin::Conf for more info on trusted_networks.
> 
> 
> Daryl

Ah, excellent - thanks!
 -Don
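
As a rough illustration of the trust-path idea Daryl describes: relays inside
the trusted networks are skipped, and the first relay outside them is the one
that actually handed the message to trusted infrastructure.  The Python below
is a simplified sketch, not SpamAssassin's implementation; the function name
first_untrusted_relay and the example addresses (192.168.1.1, 203.0.113.25)
are assumptions made up for the example.

```python
import ipaddress


def first_untrusted_relay(relay_ips, trusted_networks):
    """Return the first relay IP (newest hop first) outside the trusted networks.

    relay_ips: IP strings taken from the Received headers, most recent first.
    trusted_networks: CIDR strings for hosts trusted to record headers honestly.
    Returns None when every hop is trusted.
    """
    nets = [ipaddress.ip_network(n) for n in trusted_networks]
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None


if __name__ == "__main__":
    # Hypothetical chain: local NAT gateway, the ISP's MTA, the sender's PC.
    hops = ["192.168.1.1", "203.0.113.25", "68.85.198.87"]
    print(first_untrusted_relay(hops, ["192.168.1.0/24"]))
```

With the NAT gateway listed as trusted, the hop singled out is the ISP's MTA
rather than the sender's dynamic address further down the chain.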


RE: SPEWS still sucks

2005-01-27 Thread Don Levey
martin smith wrote:
>> -----Original Message-----
>> Don Levey wrote:
>
>>
>> It was pointed out to me that SURBL lists only check URLs - I
>> apologise for that.  I *am* getting the problem described
>> above with hits on Spamcop and SORBS.  Additionally,
>> apparently even the mere text mention of a .biz address
>> triggers that flag - even though it talks about a URL.  For
>> example, on one mailing list there is a poster who posts from
>> a .biz address.  Any thread to which he posts is automatically
>> contaminated, because his address is included in the text of
>> the message - even though these are NOT URLs.
>>
>
> Just a thought, but have you manually checked these URLs against the
> SURBL list?  There have been cases reported of false positives by
> spamassassin, when in fact the SURBL doesn't have them listed.
> I think a bugzilla was opened on this.
>
> Martin


An informal check does show that the IPs are indeed listed.  As many of them
should be - there are many people using cable modems and DSL who are listed
in dynablocks because they are supposed to be using their ISP's mail server.
But in a situation where they do that, if the ISP records the originating IP
the message still gets flagged.

This is not strictly a list-based problem, either.  If a listed IP appears
*anywhere* in the header, it seems to still get flagged.  But short of
forbidding anyone in a dynablock from ever sending email to me, I'm trying
to find another answer.  Simply not using the lists (SORBS, Spamcop, etc)
seems... a bit much to me.
 -Don



RE: Whitelisting Groups/Lists

2005-01-27 Thread Don Levey
Rob McEwen wrote:
> Jdow said:
>>> "I have found, in general, that whitelisting mailing lists
>>> is not a very good idea" ... "I also find spams appear
>>> on unmoderated Yahoo Groups." ... "a blanket white list of
>>> the sort you propose would likely turn me white with anger"...
>
> Thanks for the reply... but that is why I said in my original post:
>
>>> "but without whitelisting other real spam"
>
> Also, I'm more worried about SURBL (& other URI checker) hits on
> these than rules hits.
>
> Still, do you find such spam coming from those lists which are 100%
> opt-in? Is Yahoo 100% opt-in?
>
I am a member of quite a few Yahoo groups.  They all seem to be opt-in only,
but the problem is that even with the moderated groups the larger ones
occasionally admit spammers.  Thus I do get real spam from some of these
groups.  Likewise, messages are tagged as spam which aren't, due to their
originating IP.

 -Don




RE: SPEWS still sucks

2005-01-27 Thread Don Levey
Don Levey wrote:
> Rick Macdougall wrote:
>> Daniel Quinlan wrote:
>>> Raymond Dijkxhoorn <[EMAIL PROTECTED]> writes:
>>>
>>>
>>>> Oh well, lists.surbl.org also. At some point they hopefully
>>>> understand that list will be completely useless, and indeed insane for
>>>> people to actually use it. Sadly, people still do.
>>>
>>>
>>> Whatever your unstated reasons are, I beg to differ.  Weekly
>>> mass-check results for SURBL:
>>
>> Perhaps he means spews lists lists.surbl.org.  I can't see anyone
>> having issues with any of the SURBL RBL's.
>>
> I must not have things set up correctly then.
> I get many MANY false positives from the SURBL lists, in the case
> where the server that actually sent me the message records the IP
> from which they received it.
>
> For example, [EMAIL PROTECTED] sends me email.  It goes from his PC to
> the MTA of fubar.isp, and from there to my server.  Fubar.isp records
> the PC's IP address in the headers, and passes the message; on my
> server, Spamassassin sees that the original IP is listed, and tags
> it.  Never mind that it came to me via a reputable server, the
> original IP is "bad".
>
> How, then, do I fix this so that the lists are more useful: so that
> they check the most recent hop, and not (necessarily) all hops in the
>  chain? -Don

It was pointed out to me that SURBL lists only check URLs - I apologise for
that.  I *am* getting the problem described above with hits on Spamcop and
SORBS.  Additionally, apparently even the mere text mention of a .biz
address triggers that flag - even though it talks about a URL.  For example,
on one mailing list there is a poster who posts from a .biz address.  Any
thread to which he posts is automatically contaminated, because his address
is included in the text of the message - even though these are NOT URLs.

 -Don



RE: SPEWS still sucks

2005-01-27 Thread Don Levey
Rick Macdougall wrote:
> Daniel Quinlan wrote:
>> Raymond Dijkxhoorn <[EMAIL PROTECTED]> writes:
>>
>>
>>> Oh well, lists.surbl.org also. At some point they hopefully
>>> understand that list will be completely useless, and indeed insane for
>>> people to actually use it. Sadly, people still do.
>>
>>
>> Whatever your unstated reasons are, I beg to differ.  Weekly
>> mass-check results for SURBL:
>
> Perhaps he means spews lists lists.surbl.org.  I can't see anyone
> having issues with any of the SURBL RBL's.
>
I must not have things set up correctly then.
I get many MANY false positives from the SURBL lists, in the case where the
server that actually sent me the message records the IP from which they
received it.

For example, [EMAIL PROTECTED] sends me email.  It goes from his PC to the MTA
of fubar.isp, and from there to my server.  Fubar.isp records the PC's IP
address in the headers, and passes the message; on my server, Spamassassin
sees that the original IP is listed, and tags it.  Never mind that it came
to me via a reputable server, the original IP is "bad".

How, then, do I fix this so that the lists are more useful: so that they
check the most recent hop, and not (necessarily) all hops in the chain?
 -Don



RE: extreme measures, postmaster.rfci & comcast.net

2005-01-21 Thread Don Levey
Matt Kettler wrote:
>
> Also, in future mails please cite only abuse caused by properly
> relayed mail from comcast's MTA's, not stuff directly sent from
> clients that would be easily cleaned up by using a dynablock type
> list. (ie: the spamhaus reference you included is all client nodes,
> take a look)
>
> http://www.spamhaus.org/sbl/listings.lasso?isp=comcast.net

Their SBL does NOT list all Comcast dynablocks.  I use their SBL, and have
had to manually block large ranges of Comcast space because Spamhaus doesn't
pick them up.  For example, 68.85.198.87.  They're listed in their XBL, but
NOT their SBL.

Now that I know about their XBL, I can start using it (I think they set that
up after I had configured the main parts of my mail server).

 -Don
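
For reference, a DNS blocklist lookup is formed by reversing the octets of
the IPv4 address and appending the list's zone, so the same address can be
checked against the SBL and the XBL separately.  The Python below only builds
the query name and performs no DNS lookup; the function name dnsbl_query_name
is made up for the example, and the zone names sbl.spamhaus.org and
xbl.spamhaus.org are assumed here rather than taken from this thread.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name queried to check an IPv4 address against a blocklist."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # Hypothetical usage with the address mentioned above.
    for zone in ("sbl.spamhaus.org", "xbl.spamhaus.org"):
        print(dnsbl_query_name("68.85.198.87", zone))
```

Whether a given name resolves depends on the list's contents at the time of
the query, so the sketch makes no claim about the result.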