Re: Newbie Questions.

2007-04-23 Thread Rajkumar S

On 4/23/07, Grant Peel <[EMAIL PROTECTED]> wrote:

5. Dare I ask if someone wants to summarize the Razor installation process?


http://wiki.apache.org/spamassassin/RazorSiteWide

This works for me.

raj


RE: Newbie Questions.

2007-04-23 Thread Bowie Bailey
Grant Peel wrote:
> 
> 1. sa-update. After wrestling with a number of required modules, I
> ran sa-update with the --checkonly option, but it did not return any
> messages. I then ran it using only the switch --updatedir. It created
> a couple of what appear to be accounting files, and a new
> directory 'updates_spamassassin_org', which appears to contain the new
> rulesets. Are these rulesets simply meant to replace the ones in my
> site wide conf? (/usr/local/share/spamassassin). If so, is there a
> way to tell the command line to simply put them in there instead of
> creating the new directory and forcing me to copy them over?

SA will automatically use the rules from the new directory created by
sa-update.  You don't have to do anything.
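For example, an update cycle might look like this (a hedged sketch: the
restart command depends on how spamd was installed; this assumes the
FreeBSD-style rc script path that comes up later in this thread):

   sa-update                                  # fetch any new rules
   /usr/local/etc/rc.d/sa-spamd.sh restart    # spamd only reads rules at startup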

> 2. Someone suggested I enable (or check to see if it is enabled) DNS
> tests. What config file/option can I look for to see if this is
> indeed enabled?  

If you are using spamd, check that the startup does NOT use the '-L' or
'--local' options.

> 3. A while back, I turned off bayes database due to the huge files it
> creates in the users directories. Is there a way to manage the size
> of these files?  

Bayes should manage these files itself.  If they are getting huge, the
expiration may not be running properly.  Check the list archives for
instructions on disabling the auto-expiration and running it manually
from a cron job.
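As a sketch, that manual approach could look like this (assuming site-wide
defaults; the option names are from the standard Bayes configuration):

   # local.cf: turn off the automatic expiry runs
   bayes_auto_expire 0

   # crontab: expire old tokens once a night instead
   0 3 * * * sa-learn --force-expire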

> 4. On the SpamAssassin site, under optimization, it recommends that
> people not use "sa-blacklist" and "sa-blacklist-uri" for performance
> reasons. Where might I check to see if they are enabled? How would I
> turn them off?

If you have them, they will be in your local rules directory
(/etc/mail/spamassassin for me).

> 5. Dare I ask if someone wants to summarize the Razor installation
> process? (I keep going in circles on this). As far as I can tell, I have
> Razor installed, but do not know how to integrate it with SA.

Two basic steps.

1) Install Razor (and the required Perl modules) per the instructions on
   razor.sourceforge.net.
2) Enable Razor in SA by uncommenting the loadplugin line for it in
   v310.pre and adding a "use_razor2 1" line in your local.cf file (both
   found in your local rules directory).

After this, run SA on a test message and look for the Razor messages to
see if it is working.

spamassassin -D razor2 < test.msg

-- 
Bowie


Re: Newbie Questions.

2007-04-23 Thread Grant Peel

Hi Bowie,

Thanks for taking the time to answer!

Please read comments below:

- Original Message - 
From: "Bowie Bailey" <[EMAIL PROTECTED]>

To: 
Sent: Monday, April 23, 2007 1:14 PM
Subject: RE: Newbie Questions.



Grant Peel wrote:


1. sa-update. After wrestling with a number of required modules, I
ran sa-update with the --checkonly option, but it did not return any
messages. I then ran it using only the switch --updatedir. It created
a couple of what appear to be accounting files, and a new
directory 'updates_spamassassin_org', which appears to contain the new
rulesets. Are these rulesets simply meant to replace the ones in my
site wide conf? (/usr/local/share/spamassassin). If so, is there a
way to tell the command line to simply put them in there instead of
creating the new directory and forcing me to copy them over?


SA will automatically use the rules from the new directory created by
sa-update.  You don't have to do anything.



I suppose I would need to restart spamd for it to see the changed files?
Just curious, how does spamd know to read from the new directory
(/usr/local/share/spamassassin/updates_spamassassin_org)? Is it hardcoded
to look for this dir?




2. Someone suggested I enable (or check to see if it is enabled) DNS
tests. What config file/option can I look for to see if this is
indeed enabled?


If you are using spamd, check that the startup does NOT use the '-L' or
'--local' options.



Looking at the startup script (/usr/local/etc/rc.d/sa-spamd.sh), I see no
mention of the flags (-L or --local). Also, I see nothing of them in the
output of ps ax, so I suppose it is already set to use DNS. Correct?




3. A while back, I turned off bayes database due to the huge files it
creates in the users directories. Is there a way to manage the size
of these files?


Bayes should manage these files itself.  If they are getting huge, the
expiration may not be running properly.  Check the list archives for
instructions on disabling the auto-expiration and running it manually
from a cron job.


4. On the SpamAssassin site, under optimization, it recommends that
people not use "sa-blacklist" and "sa-blacklist-uri" for performance
reasons. Where might I check to see if they are enabled? How would I
turn them off?


If you have them, they will be in your local rules directory
(/etc/mail/spamassassin for me).



My local.cf file is very small; all I have added is the use_bayes no line
and 1 custom rule, so I suppose the "sa-blacklist" and "sa-blacklist-uri"
are not being used.



5. Dare I ask if someone wants to summarize the Razor installation
process? (I keep going in circles on this). As far as I can tell, I have
Razor installed, but do not know how to integrate it with SA.


Two basic steps.

1) Install Razor (and the required Perl modules) per the instructions on
  razor.sourceforge.net.
2) Enable Razor in SA by uncommenting the loadplugin line for it in
  v310.pre and adding a "use_razor2 1" line in your local.cf file (both
  found in your local rules directory).

After this, run SA on a test message and look for the Razor messages to
see if it is working.

   spamassassin -D razor2 < test.msg


I am using SpamAssassin 3.1.8, so do I even need to do anything with
v310.pre? (I am thinking it is a relic config at this point, no?)


Thanks again,

-Grant




--
Bowie






RE: Newbie Questions.

2007-04-23 Thread Bowie Bailey
Grant Peel wrote:
> From: "Bowie Bailey" <[EMAIL PROTECTED]>
> 
> > Grant Peel wrote:
> > > 
> > > 1. sa-update. After wrestling with a number of required modules, I
> > > ran sa-update with the --checkonly option, but it did not return
> > > any messages. I then ran it using only the switch --updatedir. It
> > > created a couple of what appear to be accounting files, and a new
> > > directory 'updates_spamassassin_org', which appears to contain the
> > > new rulesets. Are these rulesets simply meant to replace the ones
> > > in my site wide conf? (/usr/local/share/spamassassin). If so, is
> > > there a way to tell the command line to simply put them in there
> > > instead of creating the new directory and forcing me to copy them
> > > over?
> > 
> > SA will automatically use the rules from the new directory created
> > by sa-update.  You don't have to do anything.
> > 
> 
> I suppose I would need to restart spamd for it to see the changed
> files? Just curious, how does spamd know to read from the new
> directory (/usr/local/share/spamassassin/updates_spamassassin_org)?
> Is it hardcoded to look for this dir?

Yes, you need to restart spamd anytime you make a change to the rules or
other configuration files.

The location of the updated rules directory is hardcoded.  If the
directory exists, SA uses those rules instead of the original set.

> > > 2. Someone suggested I enable (or check to see if it is enabled)
> > > DNS tests. What config file/option can I look for to see if this
> > > is indeed enabled?
> > 
> > If you are using spamd, check that the startup does NOT use the
> > '-L' or '--local' options.
> > 
> 
> Looking at the startup script (/usr/local/etc/rc.d/sa-spamd.sh), I
> see no mention of the flags (-L or --local). Also, I see nothing of
> them in the output of ps ax, so I suppose it is already set to use
> DNS. Correct?

As long as your Net::DNS module is up to date, you should be fine.  To
be sure, you can run a test and look for the DNS debug entries.

spamassassin -D dns < test.msg

Look for a couple of lines that look like this:

dbg: dns: is Net::DNS::Resolver available? yes
dbg: dns: Net::DNS version: 0.59

If you see that, you should be fine.

> > > 4. On the SpamAssassin site, under optimization, it recommends
> > > that people not use "sa-blacklist" and "sa-blacklist-uri" for
> > > performance reasons. Where might I check to see if they are
> > > enabled? How would I turn them off?
> > 
> > If you have them, they will be in your local rules directory
> > (/etc/mail/spamassassin for me).
> > 
> 
> My local.cf file is very small; all I have added is the use_bayes no
> line and 1 custom rule, so I suppose the "sa-blacklist" and
> "sa-blacklist-uri" are not being used.

Right.  If you had them, they would be separate files in the directory.

> > > 5. Dare I ask if someone wants to summarize the Razor installation
> > > process? (I keep going in circles on this). As far as I can tell,
> > > I have Razor installed, but do not know how to integrate it with SA.
> > 
> > Two basic steps.
> > 
> > 1) Install Razor (and the required Perl modules) per the
> >    instructions on razor.sourceforge.net.
> > 2) Enable Razor in SA by uncommenting the loadplugin line for it in
> >    v310.pre and adding a "use_razor2 1" line in your local.cf file
> >    (both found in your local rules directory).
> > 
> > After this, run SA on a test message and look for the Razor
> > messages to see if it is working. 
> > 
> >spamassassin -D razor2 < test.msg
> 
> I am using SpamAssassin 3.1.8, so do I even need to do anything with
> v310.pre? (I am thinking it is a relic config at this point, no?)

There are a few different .pre files, but they are all still used.  The
.pre files contain information about the plugins that are used by SA.
The .cf files contain information about rules and settings.  All of the
.pre files and all of the .cf files in your local rules directory are
read and used.

So... Yes, you still need to edit this file to enable the plugin.
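Concretely, the two edits would look roughly like this (the directory is
the /etc/mail/spamassassin example from earlier in the thread; yours may
differ):

   # v310.pre -- uncomment this line to load the plugin
   loadplugin Mail::SpamAssassin::Plugin::Razor2

   # local.cf -- enable the Razor2 checks
   use_razor2 1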

-- 
Bowie


Re: Newbie Questions.

2007-04-23 Thread Grant Peel

Bowie, Thanks yet again.

Please see below.
- Original Message - 
From: "Bowie Bailey" <[EMAIL PROTECTED]>

To: 
Sent: Monday, April 23, 2007 5:16 PM
Subject: RE: Newbie Questions.



Grant Peel wrote:

From: "Bowie Bailey" <[EMAIL PROTECTED]>

> Grant Peel wrote:
> >
> > 1. sa-update. After wrestling with a number of required modules, I
> > ran sa-update with the --checkonly option, but it did not return
> > any messages. I then ran it using only the switch --updatedir. It
> > created a couple of what appear to be accounting files, and a new
> > directory 'updates_spamassassin_org', which appears to contain the
> > new rulesets. Are these rulesets simply meant to replace the ones
> > in my site wide conf? (/usr/local/share/spamassassin). If so, is
> > there a way to tell the command line to simply put them in there
> > instead of creating the new directory and forcing me to copy them
> > over?
>
> SA will automatically use the rules from the new directory created
> by sa-update.  You don't have to do anything.
>

I suppose I would need to restart spamd for it to see the changed
files? Just curious, how does spamd know to read from the new
directory (/usr/local/share/spamassassin/updates_spamassassin_org)?
Is it hardcoded to look for this dir?


Yes, you need to restart spamd anytime you make a change to the rules or
other configuration files.

The location of the updated rules directory is hardcoded.  If the
directory exists, SA uses those rules instead of the original set.


Spamd restarted, and looks like we are using new rules.


> > 2. Someone suggested I enable (or check to see if it is enabled)
> > DNS tests. What config file/option can I look for to see if this
> > is indeed enabled?
>
> If you are using spamd, check that the startup does NOT use the
> '-L' or '--local' options.
>

Looking at the startup script (/usr/local/etc/rc.d/sa-spamd.sh), I
see no mention of the flags (-L or --local). Also, I see nothing of
them in the output of ps ax, so I suppose it is already set to use
DNS. Correct?


As long as your Net::DNS module is up to date, you should be fine.  To
be sure, you can run a test and look for the DNS debug entries.

   spamassassin -D dns < test.msg

Look for a couple of lines that look like this:

   dbg: dns: is Net::DNS::Resolver available? yes
   dbg: dns: Net::DNS version: 0.59

If you see that, you should be fine.


Yes, that line, as well as a number of others, shows up... all look like
good debug messages :-)





> > 4. On the SpamAssassin site, under optimization, it recommends
> > that people not use "sa-blacklist" and "sa-blacklist-uri" for
> > performance reasons. Where might I check to see if they are
> > enabled? How would I turn them off?
>
> If you have them, they will be in your local rules directory
> (/etc/mail/spamassassin for me).
>

My local.cf file is very small; all I have added is the use_bayes no
line and 1 custom rule, so I suppose the "sa-blacklist" and
"sa-blacklist-uri" are not being used.


Right.  If you had them, they would be separate files in the directory.


Nope, they are not there!




> > 5. Dare I ask if someone wants to summarize the Razor installation
> > process? (I keep going in circles on this). As far as I can tell,
> > I have Razor installed, but do not know how to integrate it with SA.
>
> Two basic steps.
>
> 1) Install Razor (and the required Perl modules) per the
>    instructions on razor.sourceforge.net.
> 2) Enable Razor in SA by uncommenting the loadplugin line for it in
>    v310.pre and adding a "use_razor2 1" line in your local.cf file
>    (both found in your local rules directory).
>
> After this, run SA on a test message and look for the Razor
> messages to see if it is working.
>
>spamassassin -D razor2 < test.msg

I am using SpamAssassin 3.1.8, so do I even need to do anything with
v310.pre? (I am thinking it is a relic config at this point, no?)


There are a few different .pre files, but they are all still used.  The
.pre files contain information about the plugins that are used by SA.
The .cf files contain information about rules and settings.  All of the
.pre files and all of the .cf files in your local rules directory are
read and used.

So... Yes, you still need to edit this file to enable the plugin.


OK, so I only edited v310.pre and local.cf; v312.pre and init.pre were left
untouched.


Thanks again all seems well.

I have not turned on bayes, is it critical (to catching spam)?



--
Bowie






RE: Newbie Questions.

2007-04-24 Thread Bowie Bailey
Grant Peel wrote:
> 
> I have not turned on bayes, is it critical (to catching spam)?

Bayes is not critical, but it can be very useful.  For best results, I
suggest you do this:

Manually train the Bayes db with hand-sorted ham and spam at least until
you get to the 200-ham/200-spam limit.  After that, keep an eye on your
incoming mail and retrain any messages that are mis-classified.

Manual training works like this:

sa-learn --ham /directory/with/nonspam
sa-learn --spam /directory/with/spam

By default Bayes will also auto-learn incoming messages as either ham or
spam based on certain criteria.  Some people suggest adjusting the
criteria to further prevent mis-training, but I have not had any
problems with the default settings.  However, on some of my accounts, I
will disable the autolearning and manually sort and learn on all of my
incoming mail each day.
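If you go the manual-only route, the relevant local.cf settings are along
these lines (the thresholds shown are the defaults, given only for
reference):

   bayes_auto_learn 0      # disable automatic training entirely
   # or keep it on, but widen the thresholds to reduce mis-training:
   # bayes_auto_learn_threshold_nonspam 0.1
   # bayes_auto_learn_threshold_spam    12.0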

-- 
Bowie


Re: Newbie Questions.

2007-04-24 Thread Matthias Haegele

Bowie Bailey schrieb:

Grant Peel wrote:

I have not turned on bayes, is it critical (to catching spam)?


Bayes is not critical, but it can be very useful.  For best results, I
suggest you do this:


ACK. It can kick spam over the threshold when it is not quite reached by
other rules; a well-trained Bayes is essential, I think. (And it produces
no false positives when the hit is BAYES_100; in my experience that rule
was always right.)


Manually train the Bayes db with hand-sorted ham and spam at least until
you get to the 200-ham/200-spam limit.  After that, keep an eye on your
incoming mail and retrain any messages that are mis-classified.

Manual training works like this:

sa-learn --ham /directory/with/nonspam
sa-learn --spam /directory/with/spam


You should run sa-learn with the proper user account e.g.:


 sudo -u amavis -H sa-learn --spam /path/to/spam-messages/




By default Bayes will also auto-learn incoming messages as either ham or
spam based on certain criteria.  Some people suggest adjusting the
criteria to further prevent mis-training, but I have not had any
problems with the default settings.  However, on some of my accounts, I
will disable the autolearning and manually sort and learn on all of my
incoming mail each day.


Auto-learning is not failure-proof, I think,
especially on less restrictive mailing lists ...


--
Greetings
MH


Dont send mail to: [EMAIL PROTECTED]
--



RE: Newbie Questions.

2007-04-24 Thread Bowie Bailey
Matthias Haegele wrote:
> Bowie Bailey schrieb:
> > Grant Peel wrote:
> > > I have not turned on bayes, is it critical (to catching spam)?
> > 
> > Bayes is not critical, but it can be very useful.  For best
> > results, I suggest you do this:
> 
> ACK. It can kick spam over the threshold when it is not quite reached
> by other rules; a well-trained Bayes is essential, I think. (And it
> produces no false positives when the hit is BAYES_100; it was always
> right.)

Very useful, yes.  Critical, no.

> > Manually train the Bayes db with hand-sorted ham and spam at least
> > until you get to the 200-ham/200-spam limit.  After that, keep an
> > eye on your incoming mail and retrain any messages that are
> > mis-classified. 
> > 
> > Manual training works like this:
> > 
> > sa-learn --ham /directory/with/nonspam
> > sa-learn --spam /directory/with/spam
> 
> You should run sa-learn with the proper user account e.g.:
> 
> >  sudo -u amavis -H sa-learn --spam /path/to/spam-messages/

Thanks, I forgot to mention this piece of (critical) information.
Training the wrong database is, unfortunately, a common problem.
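A quick way to check which database you are training (assuming the amavis
user from the example above; substitute whatever user SA actually runs as)
is to dump the Bayes counters as that user and confirm that nspam/nham
grow after a training run:

   sudo -u amavis -H sa-learn --dump magic | grep -E 'nspam|nham'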

> > By default Bayes will also auto-learn incoming messages as either
> > ham or spam based on certain criteria.  Some people suggest
> > adjusting the criteria to further prevent mis-training, but I have
> > not had any problems with the default settings.  However, on some
> > of my accounts, I will disable the autolearning and manually sort
> > and learn on all of my incoming mail each day.
> 
> Auto-learning is not failure-proof, I think,
> especially on less restrictive mailing lists ...

It's not failure proof, but most of my accounts rely on autotraining
with the default settings.  I have had one instance where I had to
recreate a corrupted database, but this was on an account that had no
manual training done at all.

-- 
Bowie


Re: newbie questions: sought, sa-learn, rule weights

2015-10-18 Thread Reindl Harald



Am 18.10.2015 um 06:35 schrieb frede...@ofb.net:

I'm concerned that the BAYES_* rules aren't showing up in my spam
headers


You are almost certainly training the wrong Bayes database instead of the
one belonging to the user SA is running as.



and would like to know if there's a good way to look at the
tokens in the database


There is no way at all; the tokens are stored as stripped hashes.


When I do "sa-learn --dump data", I see a file
with lines like this:

0.987  1  0 1436496897  0315e1da7f
0.016  0  1 1410284743  0320ba06ef
0.987  1  0 1393199297  0329ec4e6e
0.003  0  5 1268403253  03541effbc
0.008  0  2 1398222936  038d6e997d
0.016  0  1 1429567309  041cabf4ef
0.016  0  1 1431638107  041d441c1b

Is that normal?


yes


How do I get at the actual tokens?


you don't


How do I see how it scores a test message, just the Bayesian part?


You see BAYES_00 through BAYES_999 in the mail headers.


I find that I get a lot
of spam with exactly the same lines in the body of the message, and
the Bayesian classifier doesn't seem to register it.


As said above: you are training the wrong Bayes database.


Here's the output of sa-learn --dump magic:

0.000  0  3  0  non-token data: bayes db version
0.000  0  15466  0  non-token data: nspam
0.000  0  30317  0  non-token data: nham
0.000  01733267  0  non-token data: ntokens
0.000  0 1098575745  0  non-token data: oldest atime
0.000  0 1441160002  0  non-token data: newest atime
0.000  0  0  0  non-token data: last journal sync atime
0.000  0 1441160455  0  non-token data: last expiry atime
0.000  0  0  0  non-token data: last expire atime delta
0.000  0  0  0  non-token data: last expire reduction 
count


FROM WHAT USER?


I couldn't find a sample output on your Wiki, with which to compare
this; I'm worried about the 0.000 lines and other zeroes.


they are normal


I'm also thinking that I should employ some kind of sender address
whitelisting using e.g. TxRep. Most of my spam is stuff that I'm
receiving for the first time from a particular sender, and there are a
lot of strings that I can say for sure I'd never find in a Subject
line of a message from a friend who is emailing me for the first time:
"ATTN", "stock tip"... All of the mail I send is Bcc'ed to myself, is
there a way to get Spamassassin to notice when this comes in and
automatically whitelist the recipients for me?


No need to do so, and you certainly don't want it done automatically; you
*think* you want it. Blind whitelisting is easy to trick with forged
senders; whitelist_auth is based on DKIM/SPF presence.


But typically you don't need much whitelisting, unless you are a hosting
provider and care about your load (combining whitelist_auth and
shortcircuit).




signature.asc
Description: OpenPGP digital signature


Re: newbie questions: sought, sa-learn, rule weights

2015-10-18 Thread John Hardin

On Sat, 17 Oct 2015, frede...@ofb.net wrote:


I'm getting a lot of spam, perhaps 25 messages/day, and about half of
it gets through Spamassassin. I'm trying to figure out how to fix the
situation.


Care to post the rule hits for some of the FNs? That should be in their
headers. That might let us provide more specific advice, for instance: are
you hitting URIBL_BLOCKED?



I tried using the "sought" ruleset following instructions from
http://taint.org/2007/08/15/004348a.html, but didn't see much
difference.


Sadly that's gone stale and may not help much with current spam. The last 
time I saw an update was March 2014.



I'm concerned that the BAYES_* rules aren't showing up in my spam
headers


The two most common causes for that are insufficient tokens learned and
learning under the wrong user.



Here's the output of sa-learn --dump magic:

0.000  0  15466  0  non-token data: nspam
0.000  0  30317  0  non-token data: nham


You have plenty of tokens, so it's likely you're training Bayes as a 
different user than SA is running under, and you don't have a site-wide 
user-independent Bayes configured.
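For reference, a site-wide Bayes is configured in local.cf along these
lines (the path is only an example; it must be writable by the user SA
runs as):

   bayes_path /var/lib/spamassassin/bayes/bayes   # shared db path prefix
   bayes_file_mode 0770                           # group-writable shared files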



Relatedly, if I create rules for e.g. ATTN, "stock tip",


Funny you should mention that particular one. I just noticed it had popped 
up to the top of the masscheck corpora hits, and I've pushed a scored 
rule for it. Hopefully that will start getting points tomorrow.



--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  I'll have that son of a bitch eating out of dumpsters in less than
  two years.   -- MS CEO Steve Ballmer, on RedHat CEO Matt Szulik
---


Re: newbie questions: sought, sa-learn, rule weights

2015-10-19 Thread frederik
Hi Reindl,

Thanks for your reply.

I replied separately to John about my Bayes setup - you were right,
wrong user.

Thanks for the advice about whitelisting being unnecessary. I hope
that getting the Bayesian part working will make my setup effective
without this.

Thanks,

Frederick

On Sun, Oct 18, 2015 at 11:29:16AM +0200, Reindl Harald wrote:
> 
> 
> Am 18.10.2015 um 06:35 schrieb frede...@ofb.net:
> >I'm concerned that the BAYES_* rules aren't showing up in my spam
> >headers
> 
> you are almost certainly training the wrong Bayes database instead of
> the one belonging to the user SA is running as
> 
> >and would like to know if there's a good way to look at the
> >tokens in the database
> 
> there is no way at all; the tokens are stored as stripped hashes
> 
> >When I do "sa-learn --dump data", I see a file
> >with lines like this:
> >
> >0.987  1  0 1436496897  0315e1da7f
> >0.016  0  1 1410284743  0320ba06ef
> >0.987  1  0 1393199297  0329ec4e6e
> >0.003  0  5 1268403253  03541effbc
> >0.008  0  2 1398222936  038d6e997d
> >0.016  0  1 1429567309  041cabf4ef
> >0.016  0  1 1431638107  041d441c1b
> >
> >Is that normal?
> 
> yes
> 
> >How do I get at the actual tokens?
> 
> you don't
> 
> >How do I see how it scores a test message, just the Bayesian part?
> 
> you see BAYES_00 through BAYES_999 in the mail headers
> 
> >I find that I get a lot
> >of spam with exactly the same lines in the body of the message, and
> >the Bayesian classifier doesn't seem to register it.
> 
> as said above: you are training the wrong Bayes database
> 
> >Here's the output of sa-learn --dump magic:
> >
> >0.000  0  3  0  non-token data: bayes db version
> >0.000  0  15466  0  non-token data: nspam
> >0.000  0  30317  0  non-token data: nham
> >0.000  01733267  0  non-token data: ntokens
> >0.000  0 1098575745  0  non-token data: oldest atime
> >0.000  0 1441160002  0  non-token data: newest atime
> >0.000  0  0  0  non-token data: last journal sync 
> >atime
> >0.000  0 1441160455  0  non-token data: last expiry atime
> >0.000  0  0  0  non-token data: last expire atime 
> >delta
> >0.000  0  0  0  non-token data: last expire 
> >reduction count
> 
> FROM WHAT USER?
> 
> >I couldn't find a sample output on your Wiki, with which to compare
> >this; I'm worried about the 0.000 lines and other zeroes.
> 
> they are normal
> 
> >I'm also thinking that I should employ some kind of sender address
> >whitelisting using e.g. TxRep. Most of my spam is stuff that I'm
> >receiving for the first time from a particular sender, and there are a
> >lot of strings that I can say for sure I'd never find in a Subject
> >line of a message from a friend who is emailing me for the first time:
> >"ATTN", "stock tip"... All of the mail I send is Bcc'ed to myself, is
> >there a way to get Spamassassin to notice when this comes in and
> >automatically whitelist the recipients for me?
> 
> no need to do so, and you certainly don't want it done automatically; you
> *think* you want it. Blind whitelisting is easy to trick with forged
> senders; whitelist_auth is based on DKIM/SPF presence.
> 
> but typically you don't need much whitelisting, unless you are a hosting
> provider and care about your load (combining whitelist_auth and shortcircuit)
> 




Re: newbie questions: sought, sa-learn, rule weights

2015-10-19 Thread RW
On Mon, 19 Oct 2015 10:57:43 -0700
frede...@ofb.net wrote:


> I guess I need to use "spamc -L" rather than "sa-learn"? I tried
> "spamc -L" but it seems rather slow, about two messages per second,
> only slightly faster when the messages have already been seen. Is
> "sa-learn" faster than "spamc -L"? It seems to do closer to 8 message
> per second, although they were all "seen" messages.
> 
> Perhaps I should just run spamd as my user, rather than user spamd?
> It's a single-user system... Or if it would be easier to point the
> global spamd to ~/.spamassassin/, but that seems messy...

run sa-learn as the user spamd using su
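For example (assuming the daemon user is named spamd; using su without -m
ensures $HOME points at spamd's home directory, so the right Bayes db is
trained):

   su spamd -c 'sa-learn --spam /path/to/spam-messages/'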


Re: Newbie Questions: Different Results for the same message

2008-12-03 Thread Karsten Bräckelmann
On Wed, 2008-12-03 at 02:00 -0800, Björn K wrote:
> Hello,
> 
> I am relatively new to SpamAssassin and have some problems with email which
> seems to get completely different scores when I check them manually than
> when the automatic check upon reception by the Exim mail server is
> performed.
> 
> Before we used our own spam filter, the mail was put into an IMAP folder
> to be read by an external mail service (GMX), filtered, and forwarded back
> to another mail box. That system is still partly in use. When a mail is

Despite mentioning IMAP folders -- I assume this involves forwarding to
another SMTP server or polling by GMX? If so, SA likely cannot detect all
of this properly and thus tests some of these "internal" forwarding relays
against blacklists, instead of the external relay that actually handed the
mail over. As a result, quite a lot of DNSBLs will not trigger and your SA
performs less effectively than it could.
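A sketch of that tweak in local.cf (the network blocks are placeholders
for your own relays):

   trusted_networks  192.168.0.0/24 10.0.0.0/8   # relays that won't forge headers
   internal_networks 192.168.0.0/24              # your own MX/internal hosts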

You can fix this by tweaking trusted_networks and internal_networks. But
that wasn't your question. :)

> transferred like this I can see the spam score being evaluated twice. For
> example there was a mail containing only a link to dagwizhua -dot- com,
> which is a bad address. It received 6.8 on first run, 3.6 on the second run
> only for a few additional headers added by the external mail service.

This difference might actually be due to the trust path outlined above.
If GMX does polling, they could have correctly tested the external
handing over relay against blacklists.

Your local run doesn't show any such hits.

> However, when I copied the mail into a text file and used spamc to send it
> the /same/ spamd process I got this result:
> [EMAIL PROTECTED]:~$ LANG=C spamc -lR < spam-mail.txt | recode latin1..utf8
> 12.9/5.0

For some better evaluation, we'd need the full X-Spam headers, both as
inserted by your local SA on the first run *and* the manual second run.
Don't have that, so here's a guess.

> Pkte Regelname  Beschreibung
>  -- --
>  0.6 NO_REAL_NAME   Kein vollständiger Name in Absendeadresse
>  1.8 INVALID_DATE   Datumskopfzeile nicht standardkonform zu RFC 2822
>  0.0 UNPARSEABLE_RELAY  Informational: message has unparseable relay lines
>  1.3 RCVD_IN_BL_SPAMCOP_NET RBL: Transportiert via Rechner in Liste von
> www.spamcop.net
>[Blocked - see ]

This is about 3.6 (assuming some rounding), the score your first run
ended up with.

>  3.3 URIBL_AB_SURBL Enthält URL in AB-Liste (www.surbl.org)
> [URIs: dagwizhua -dot- com]
>  2.6 URIBL_OB_SURBL Enthält URL in OB-Liste (www.surbl.org)
> [URIs: dagwizhua -dot- com]
>  3.6 URIBL_SC_SURBL Enthält URL in SC-Liste  (www.surbl.org)
> [URIs: dagwizhua -dot- com]

These are moving targets. It is entirely possible that the URI
blacklists hadn't caught up when you initially scanned the mail -- and
thus they didn't hit on the first run, but only later.

> -0.2 AWLAWL: From: address is in the auto white-list

Computed based on the sender/IP-block history.

> How can the results be so very different on the same spam process? Why would
> a few additional headers make a difference if the Bayes does not seem to add
> anything to the mail and there is no particular rule for those headers? And
> why does a manual scan produce a completely different result if the service
> that creates the actual results is the same process?

See above. It's likely not about the headers, but timing -- that URI
simply hasn't been on the blacklists before.

The difference to the GMX score probably is due to the trust path. Plus
the SA version used and thus the scores per rule. Don't remember
off-hand which SA version GMX uses, but I do see you're running an old
version, aren't you? The scores (and rules, mind you) don't match a
recent SA 3.2.x.

  guenther


-- 



Re: Newbie Questions: Different Results for the same message

2008-12-03 Thread Kai Schaetzl
Björn K wrote on Wed, 3 Dec 2008 02:00:32 -0800 (PST):

> How can the results be so very different on the same spam process?

Too many whys ;-) Comparing overall scores doesn't provide any insight.
You want to compare the rules that hit; then you'll see what is different
(and most of the differences should then be self-explanatory).

In general the time a message gets scanned does make a difference if it 
comes to any network/distributed tests - e.g. the message parameters may 
not be known as spam by the various RBLs at the time of the first scan. 
And if you use different systems (what you seem to do, one internal, one 
external), then there's a great chance, that they are configured 
different, of course.
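
Comparing the rules that hit can be mechanized. A small sketch (the two
header strings below are made-up examples; the parsing assumes the usual
"tests=..." list in an X-Spam-Status header) that extracts the rule names
from two scans of the same message and prints what only the second scan hit:

```python
import re

def hit_rules(x_spam_status):
    """Extract the set of rule names from an X-Spam-Status header value."""
    m = re.search(r"tests=([A-Z0-9_,\s]+)", x_spam_status)
    if not m:
        return set()
    return {r.strip() for r in m.group(1).split(",") if r.strip()}

# Made-up example headers for two scans of the same message
first  = "Yes, score=3.6 tests=NO_REAL_NAME,INVALID_DATE,RCVD_IN_BL_SPAMCOP_NET"
second = ("Yes, score=12.9 tests=NO_REAL_NAME,INVALID_DATE,"
          "RCVD_IN_BL_SPAMCOP_NET,URIBL_AB_SURBL,URIBL_OB_SURBL,URIBL_SC_SURBL")

# Rules unique to the second scan are the ones to explain -- here the
# SURBL hits, i.e. network tests that only fired later
only_second = hit_rules(second) - hit_rules(first)
print(sorted(only_second))
# → ['URIBL_AB_SURBL', 'URIBL_OB_SURBL', 'URIBL_SC_SURBL']
```

The score difference then follows directly from the per-rule scores of
exactly those rules.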

Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com





Re: Newbie Questions: Different Results for the same message

2008-12-03 Thread Björn K

Thank you, that should help. I don't really want to print the whole headers
here (so as not to give away too many internals about which companies' mail
I handle, in which way, and what problems I have with it).

It's SpamAssassin 3.1.7 from the Debian (Etch) repository (Debian
revision 2).


Karsten Bräckelmann-2 wrote:
> 
> On Wed, 2008-12-03 at 02:00 -0800, Björn K wrote:
>> Hello,
>> 
>> I am relatively new to SpamAssassin and have some problems with email
>> which seems to get completely different scores when I check it manually
>> than when the automatic check upon reception by the Exim mail server is
>> performed.
>> 
>> Before we used our own spam filter, the mail was put into an IMAP folder
>> to be read by an external mail service (GMX), filtered, and forwarded
>> back to another mailbox. That system is still partly in use. When a mail is
> 
> Despite mentioning IMAP folders -- I assume this involves forwarding to
> another SMTP server or polling by GMX? If so, SA likely cannot detect all
> of this properly and thus tests some of these "internal" forwarding relays
> against blacklists, instead of the external relay that actually handed the
> mail over. As a result, quite a lot of DNSBLs will not trigger and your SA
> performs less effectively than it could.
> 
> You can fix this by tweaking trusted_networks and internal_networks. But
> that wasn't your question. :)
> 
>> transferred like this I can see the spam score being evaluated twice. For
>> example, there was a mail containing only a link to dagwizhua -dot- com,
>> which is a bad address. It received 6.8 on the first run and 3.6 on the
>> second run, differing only in a few additional headers added by the
>> external mail service.
> 
> This difference might actually be due to the trust path outlined above.
> If GMX does polling, they could have correctly tested the external
> handing over relay against blacklists.
> 
> Your local run doesn't show any such hits.
> 
>> However, when I copied the mail into a text file and used spamc to send
>> it to the /same/ spamd process, I got this result:
>> [EMAIL PROTECTED]:~$ LANG=C spamc -lR < spam-mail.txt | recode latin1..utf8
>> 12.9/5.0
> 
> For some better evaluation, we'd need the full X-Spam headers, both as
> inserted by your local SA on the first run *and* the manual second run.
> Don't have that, so here's a guess.
> 
>> Points Rule name          Description
>> ------ ------------------ -----------------------------------------------
>>  0.6 NO_REAL_NAME       No full name in the sender address
>>  1.8 INVALID_DATE       Date header not compliant with RFC 2822
>>  0.0 UNPARSEABLE_RELAY  Informational: message has unparseable relay lines
>>  1.3 RCVD_IN_BL_SPAMCOP_NET RBL: Relayed via a host listed by
>>      www.spamcop.net
>>      [Blocked - see ]
> 
> This is about 3.6 (assuming some rounding), the score your first run
> ended up with.
> 
>>  3.3 URIBL_AB_SURBL Contains URL in the AB list (www.surbl.org)
>> [URIs: dagwizhua -dot- com]
>>  2.6 URIBL_OB_SURBL Contains URL in the OB list (www.surbl.org)
>> [URIs: dagwizhua -dot- com]
>>  3.6 URIBL_SC_SURBL Contains URL in the SC list (www.surbl.org)
>> [URIs: dagwizhua -dot- com]
> 
> These are moving targets. It is entirely possible that the URI
> blacklists hadn't caught up yet when you initially scanned the mail --
> and thus they didn't hit on the first run, but only later.
> 
>> -0.2 AWL  AWL: From: address is in the auto white-list
> 
> Computed based on the sender/IP-block history.
> 
>> How can the results be so very different on the same spam process? Why
>> would a few additional headers make a difference if the Bayes does not
>> seem to add anything to the mail and there is no particular rule for
>> those headers? And why does a manual scan produce a completely different
>> result if the service that creates the actual results is the same process?
> 
> See above. It's likely not about the headers, but timing -- that URI
> simply hasn't been on the blacklists before.
> 
> The difference to the GMX score probably is due to the trust path. Plus
> the SA version used and thus the scores per rule. Don't remember
> off-hand which SA version GMX uses, but I do see you're running an old
> version, aren't you? The scores (and rules, mind you) don't match a
> recent SA 3.2.x.
> 
>   guenther
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Newbie-Questions%3A-Different-Results-for-the-same-message-tp20809927p20811311.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: Newbie Questions: Different Results for the same message

2008-12-03 Thread Karsten Bräckelmann
On Wed, 2008-12-03 at 03:36 -0800, Björn K wrote:
> Thank you, that should help. I don't really want to print the whole headers
> here (so as not to give away too many internals about which companies' mail
> I handle, in which way, and what problems I have with it).

Forwarding mail for companies (smells like business to me) through GMX?
This seems to be your problem. But see below.

> It's a spamassassin 3.1.7 out of the Debian (Etch) repository (debian
> revision 2).
> 
> 
> Karsten Bräckelmann-2 wrote:

[ Cursing enumeration, there is only *one* such human, snipping an
utterly needless full quote including the sig. *sigh*  Nabble still
doesn't get it right. ]


> > For some better evaluation, we'd need the full X-Spam headers, both as
   ^^
> > inserted by your local SA on the first run *and* the manual second run.
> > Don't have that, so here's a guess.

I was asking for the SpamAssassin headers -- not all headers. *shrug*

