date:20090802

Re: SA-learn (spamassassin)

2009-08-02 Thread monolit


Good morning. After Bayes learning, the output of sa-learn --dump magic
shows nspam/nham increased by 1. I tried the command several times. I wrote
a mail with Subject: viagra; body: viagra and sent it from my first account
to my second account (score 0.4). Then I ran sa-learn --spam on this mail. I
wrote the same mail again and sent it from the first account to the second.
This mail got a higher score, 2.4. I ran sa-learn --spam on this mail too. I
wrote the same mail once more and sent it again (from the first account to
the second). The score was higher again, 3.4. I repeated this several more
times, but the score did not grow any further...
That was my small experiment with Bayes scoring.

My spamd process runs as root, and I ran sa-learn as root. BUT there is a
database in the /root directory and the same database in the
/home/spamfilter directory. spamfilter is the user that is set in master.cf.
In the spamassassin configuration (local.cf) I have an entry for the bayes
database, and its path is /home/spamfilter... When I ran sa-learn as root, I
checked the modification time of the databases. The database under the
spamfilter user is updated correctly (the one under root is not updated).

I know it is strange and confusing ... to use two users for this. I would
like everything to run under one user, but I don't know how to start spamd
as spamfilter. I am not sure whether that would be right... maybe spamd
should run as root.
Here is my modification to master.cf (postfix). This modification is
recommended by the spamassassin web pages.

smtp  inet  n   -   n   -   -   smtpd
 -o content_filter=spamfilter:dummy


# Interfaces to non-Postfix software. Be sure to examine the manual
# pages of the non-Postfix software to find out what options it wants.
# 
spamfilter unix -   n   n   -   -   pipe
 flags=Rq user=spamfilter argv=/usr/local/bin/spamfilter -f ${sender} -- ${recipient}

Thank you for explaining how Bayes works and for the time you devoted to
me.
-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24786173.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: SA-learn (spamassassin)

2009-08-02 Thread Matt Kettler

monolit wrote:
> Question is logical. When SA learnt new spam/ham so SA have to write new info
> to the database and I think that database have to increase size. If you have
> for example *.doc file and you modify it. You add several words - *.doc will
> be bigger(increase his size).
>   
The database doesn't need to grow in size.

A Berkeley DB file can contain free space. This is done to avoid
constantly shrinking and growing the file on disk. Deleted elements are
merely marked as free space for later use.

Therefore, data can be added to a Berkeley DB file, without an increase
in file size.

Re: SA-learn (spamassassin)

2009-08-02 Thread Karsten Bräckelmann

On Sun, 2009-08-02 at 14:43 -0700, an anonymous Nabble user wrote:
> To by Karsten Bräckelmann-2: I want to apologize for my approach - I use
> Ubuntu and other forums because I am hopeless because my homework was
> install configure and run antispam(spamassassin, ClamAV, Clamsmtp,razor,
> postfix). Now I am under pressure because tomorrow I have to deliver my
> solution to my chief... I must explain to him how it works and so on. 

Good luck with that.

Utterly fucked-up quoting, err, dumping of previous posts intermixed
with comments, fixicated.

> > the number of spam exceeding the bayes_min_spam_num value does not activate
> > Bayes *learn*ing. It means that Bayes will classify mail -- based on what it
> > learned before.

> > it keeps track of *tokens*, and the number they have been seen in ham
> > or spam.

> Your  explanation is confusing for me, because you
> claim value of min_spam_num  means that Bayes will classify mail -- based on
> what it learned before My min_spam_num value is 1. I get the first mail.
> Subject: viagra; body: viagra. I use sa - learn -spam for this mail. I get
> new mail: Subject: viagra; body: viagra. What will do Bayes according to
> you? Keep in mind your words 

Bayes will check the tokens against its database. Based on the number of
occurrences of each token in ham and spam, Bayes will return whether the
mail appears spammy or hammy (based on what it learned before), and its
confidence of that assessment.

This classification (ham or spam) and confidence will be scored by SA.

Keep in mind there are a LOT more tokens in a message than merely the
words in the Subject and Body. This DOES have a severe impact on your
results, if your "test spam" is a self-generated message with the word
Viagra as Subject and Body. Nope, this is not a proper test environment.

> > The bayes_min_(ham|spam)_num values ONLY control, how many messages
> > Bayes needs to have learned, before it should start classifying mail.

> => my Bayes can classifying mail(because min_spam_num value is 1 => the
> condition is accomplish). What now? Will be my new mail mark like spam?
> Or will get any higher score...?

It will be classified (by Bayes) based on the tokens in the message and
the previously learned statistics. Bayes does NOT only mark spam. It
also can report a message to look like ham.

Anyway, I asked you before to provide sa-learn --dump magic output. You
didn't. Given the intro, I seriously wonder if the user you are training
Bayes as and the user scanning mail are the same anyway.

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: SA-learn (spamassassin)

2009-08-02 Thread monolit


To Benny Pedersen: I understand your explanation about the growth of the
spamassassin database. Your example with md5 is clear. OK, thank you very
much!

To Karsten Bräckelmann-2: I want to apologize for my approach. I use
Ubuntu and other forums because I am desperate: my homework was to
install, configure and run an antispam setup (spamassassin, ClamAV,
Clamsmtp, razor, postfix). Now I am under pressure because tomorrow I have
to deliver my solution to my boss... I must explain to him how it works and
so on.

the number of spam exceeding the bayes_min_spam_num value does not activate
Bayes *learn*ing. It means that Bayes will classify mail -- based on what it
learned before. it keeps track of *tokens*, and the number they have been
seen in ham or spam. Your explanation is confusing to me, because you
claim that the value of min_spam_num means that Bayes will classify mail --
based on what it learned before. My min_spam_num value is 1. I get the first
mail, Subject: viagra; body: viagra. I use sa-learn --spam on this mail. I
get a new mail: Subject: viagra; body: viagra. What will Bayes do, according
to you? Keep in mind your words:
The bayes_min_(ham|spam)_num values ONLY control, how many messages
Bayes needs to have learned, before it should start classifying mail.  => my
Bayes can classify mail (because the min_spam_num value is 1 => the
condition is fulfilled). What now? Will my new mail be marked as spam? Or
will it get a higher score...?


And again, 1 is not a sane number. - I am trying to explain to you that this
is only homework. Why the number 1? Because I want to see with my own eyes
how Bayes works. I don't have time to find many real spam messages (I know
the number must be bigger, about 1000 - that's OK, I knew that).
-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24782439.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: Razor, spamassassin - network test

2009-08-02 Thread Karsten Bräckelmann

I'm starting to seriously wonder, what your homework actually is about.

On Sun, 2009-08-02 at 13:05 -0700, an anonymous Nabble user wrote:
> Your command works! I found in spamassassin -D razor2  < sample.msg  2>&1 |
> less  message the following:
> check[9444]: [ 6] a=c&e=4&ep4=7542-10&s=4uO_brp3_KWEDuqMYXBVHI-4-FwA
> But I dont know how to recognize that is a signature(hash) of the mail. In

This is a question for the Razor community, don't you think?

(Hint: The Razor community is also not hosted at some Ubuntu help forum.
Where you previously posted these two threads, and then dumped a copy of
the forum-mangled text to the SA forum at Nabble.)

> the old version it was clearly marked for example:
> debug: Signature: 48e74b8496877ba45072b201b41eebed7038186b.

This hash is hexadecimal encoded. Unlike the values above. A
cryptographic hash does not necessarily need to be encoded in hex.
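To illustrate the point about encodings, here is a minimal Python sketch that prints one and the same digest in two textual forms. The choice of SHA-1 and URL-safe base64 is an assumption made for illustration only (the 40 hex digits in the old debug line merely have the same length as a hex-encoded 20-byte digest); nothing here claims to reproduce what Razor actually computes or how it encodes it.

```python
import base64
import hashlib

# Hypothetical input; any byte string works for the demonstration.
digest = hashlib.sha1(b"example message body").digest()  # 20 raw bytes

# The same 20 bytes, written out in two different encodings.
hex_form = digest.hex()
b64_form = base64.urlsafe_b64encode(digest).decode("ascii")

print(len(digest), len(hex_form), len(b64_form))
print(hex_form)
print(b64_form)
```

Both strings decode back to the identical bytes, so a value that does not look like hex can still be a hash.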

> My second question is: When I send mail for example from XP a) station to XP
> b) station so spamassassin write to header of mail x-spam-status and so on.
> According to I recognise that mail was checked by using SA rules,
> bayes(autolearn), but how can I recognize that the mail was really checked
> by Razor? In mail header isnt any info and in razor.log is too any
> info(about checking the mail)

If Razor is enabled in SA, SA will do the test. The rule gets hit (and
added to the Status header) only, if it is recognized as spam by Razor.

You probably would be able to define more rules, with an informational
score of 0.001, using a much wider range possibly covering all cases.
See 25_razor2.cf for the current rule.

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: SA-learn (spamassassin)

2009-08-02 Thread Benny Pedersen


On Sun, 2 Aug 2009 13:20:41 -0700 (PDT), monolit wrote:
> Question is logical.

so is google :)

> When SA learnt new spam/ham so SA have to write new info
> to the database and I think that database have to increase size.

no, my bayes db is around 150M, but all my mail is in webmail at 800M so
where is the rest in bayes ? :)

> If you have for example *.doc file and you modify it. You add several
> words - *.doc will be bigger(increase his size).

if you use bayes on mysql and dump the data, then you see that it does not
just add new words, it also counts how often each word is seen in spam vs
ham, and all these words are not just words as we write them here, they are
encoded to signatures that don't use that much room in the db

one example is you can try to md5sum your email address, it will be the same
length every time no matter how many chars your email address has
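The md5 experiment suggested above can be tried with a few lines of Python. The two addresses are made-up examples, and md5 is used only because it is the hash named in the post; this is not a statement about which hash the Bayes store uses internally.

```python
import hashlib

# Two hypothetical addresses of very different length.
addresses = ["a@example.com", "a.much.longer.mailbox.name@mail.example.org"]

digests = []
for address in addresses:
    digest = hashlib.md5(address.encode("utf-8")).hexdigest()
    digests.append(digest)
    # Input length varies, digest length does not.
    print(len(address), len(digest), digest)
```

Every digest comes out as 32 hex characters, which is why storing hashed tokens costs the same space for short and long words.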

-- 
Benny Pedersen

Re: SA-learn (spamassassin)

2009-08-02 Thread Karsten Bräckelmann

On Sun, 2009-08-02 at 11:53 -0700, an anonymous Nabble user wrote:
> I have theory ...I know you will think thats bad but I tried explain how I
> understand SA documentation. When I set the "bayes_min_spam_num 1" so it
> means that Bayes learn system will be activate. And now for example: I got

As I just settled with RW in this very thread, the number of spam
exceeding the bayes_min_spam_num value does not activate Bayes *learn*ing.
It means that Bayes will classify mail -- based on what it learned before.

Learning, whether manual or automatic, always is available if use_bayes
and bayes_auto_learn are enabled.

The bayes_min_(ham|spam)_num values ONLY control, how many messages
Bayes needs to have learned, before it should start classifying mail.
And again, 1 is not a sane number.

> mail. I use sa-learn --spam --file mail. SA save the mail(or some signature
> to the database). And when I got the same mail again so Bayes looks to the
> database a he says: a the same mail like in my database which is marked
> like spam, and he mark the mail like spam. According to me is it logical.

No. *sigh*  I did explain this earlier today. This is NOT how Bayes
works. Bayes does NOT keep signatures of entire messages. Instead, it
keeps track of *tokens*, and the number they have been seen in ham or
spam. Think of tokens as words.

Please do read up on Bayes. And please stop re-iterating this false
assumption.

Given you repeating some "signature of a message", and your other thread
regarding Razor (which does actually calculate some signatures for a
message) -- I have a feeling you are confusing Bayes with Razor. They
are entirely unrelated and do not use the same mechanisms.

> What is strange when I use SA-LEARN so database dont expand the size, but
> the time of modification is the same when I sa-learn started.

It is a database. It is not a flat text file. There is nothing strange
about updating values in a database, and not seeing it inflate
proportional to your input data.

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: SA-learn (spamassassin)

2009-08-02 Thread monolit


The question is logical. When SA learns new spam/ham, SA has to write new
information to the database, so I think the database has to grow in size.
For example, if you have a *.doc file and you modify it by adding several
words, the *.doc file will get bigger (its size increases).
-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24781719.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: Razor, spamassassin - network test

2009-08-02 Thread monolit


Your command works! In the output of spamassassin -D razor2  < sample.msg
2>&1 | less I found the following:
check[9444]: [ 6] a=c&e=4&ep4=7542-10&s=4uO_brp3_KWEDuqMYXBVHI-4-FwA
But I don't know how to recognize whether that is a signature (hash) of the
mail. In the old version it was clearly marked, for example:
debug: Signature: 48e74b8496877ba45072b201b41eebed7038186b.

My second question is: when I send mail, for example from XP station a) to
XP station b), spamassassin writes X-Spam-Status and so on to the mail
header. From that I can tell that the mail was checked using SA rules and
bayes (autolearn), but how can I tell that the mail was really checked by
Razor? There isn't any info in the mail header, and there isn't any info in
razor.log either (about checking the mail).
-- 
View this message in context: 
http://www.nabble.com/Razor%2C-spamassassin---network-test-tp24773506p24781568.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: SA-learn (spamassassin)

2009-08-02 Thread me


On Sun, 2 Aug 2009 11:53:53 -0700 (PDT), monolit wrote:

> What is strange when I use SA-LEARN so database dont expand the size, but
> the time of modification is the same when I sa-learn started.

question is ?

Re: SA-learn (spamassassin)

2009-08-02 Thread monolit


FROM SA WWW
bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain
number of ham (non-spam) and spam have been learned. The default is 200 of
each ham and spam, but you can tune these up or down with these two
settings. 

I have a theory ... I know you will think it is bad, but I am trying to
explain how I understand the SA documentation. When I set
"bayes_min_spam_num 1", it means that the Bayes learning system will be
activated. And now, for example: I get a mail. I run sa-learn --spam --file
mail. SA saves the mail (or some signature of it) to the database. And when
I get the same mail again, Bayes looks in the database and says: ah, the
same mail as the one in my database that is marked as spam, and it marks
the mail as spam. To me that is logical.
What is strange is that when I use sa-learn, the database does not grow in
size, but its modification time matches the time I ran sa-learn.
-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24780842.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: Razor, spamassassin - network test

2009-08-02 Thread Karsten Bräckelmann

On Sun, 2009-08-02 at 11:17 -0700, monolit wrote:
> I understand that I must read whole output(message(TOP message)). But the
> output this command is very fast and it stop at the end. I dont catch TOP of
> message. I tried "| more" switch but it didint help. I tried redirecting
> output to the file but it doesnt work. The file was empty:( I dont know how
> can I read the TOP of output message.

You mean, your terminal does not have a scroll-back buffer? You can't
simply go back a few pages?

Well, then try redirecting STDERR, instead of STDOUT only. That's where
the debugging messages are.

  spamassassin -D razor2  < sample.msg  2>&1 | less
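Why the file stayed empty can be reproduced without SpamAssassin at all. The sketch below starts a small child process (a stand-in invented for this example, not the spamassassin script) that writes one line to STDOUT and one to STDERR, then captures it twice: once with STDOUT only, once with STDERR merged in, which is what 2>&1 does in the shell.

```python
import subprocess
import sys

# Stand-in child process: one line on STDOUT, one "debug" line on STDERR.
child = [sys.executable, "-c",
         "import sys; print('result'); print('debug line', file=sys.stderr)"]

# Capturing STDOUT only: the debug line never reaches the capture.
stdout_only = subprocess.run(child, stdout=subprocess.PIPE,
                             stderr=subprocess.DEVNULL, text=True).stdout

# Merging STDERR into STDOUT, the equivalent of 2>&1.
merged = subprocess.run(child, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, text=True).stdout

print(repr(stdout_only))
print("debug line" in merged)
```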


> Edit your spamd start-up script, or start-up options file (depending on
> which OS you're running, these may be different). There should be a -L or
> --local switch in that file. Remove it to enable network tests.
> 
> I cant find the file with this switch - I use CentOS distro. 

This  (a) applies to spamd only, not running the 'spamassassin' script
as you do right now, and  (b) only in the case network-tests have
explicitly been disabled in the daemon start-up script.


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: Razor, spamassassin - network test

2009-08-02 Thread monolit


I understand that I must read the whole output (the TOP of the message). But
the output of this command scrolls by very fast and stops at the end, so I
don't catch the TOP of the message. I tried piping to "| more" but it didn't
help. I tried redirecting the output to a file but that doesn't work: the
file was empty :( I don't know how I can read the TOP of the output.

The last thing from the spamassassin web site is:

Edit your spamd start-up script, or start-up options file (depending on
which OS you're running, these may be different). There should be a -L or
--local switch in that file. Remove it to enable network tests.

I can't find the file with this switch - I use the CentOS distro.
 
-- 
View this message in context: 
http://www.nabble.com/Razor%2C-spamassassin---network-test-tp24773506p24780477.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: SA-learn (spamassassin)

2009-08-02 Thread Karsten Bräckelmann

On Sun, 2009-08-02 at 18:31 +0100, RW wrote:
> > > AFAIK it doesn't affect autoleaning at all, bayes_min_spam_num &
> > > bayes_min_ham_num control when scoring starts.
> > 
> > Well, it *does* nonetheless. *shrug*

> If you read back you'll see that that's consistent with what I wrote and
> the opposite of what you wrote.

Nah, I did set the thresholds to 1. :)

> I said that the limits don't effect autolearning, just scoring
> (activation).

Damn. My test-case was non-conclusive, I failed to crosscheck. :/

You are correct, auto-learning is not affected by these thresholds. SA
does bootstrap Bayes training, even if nspam/nham still is below the
limits. Sorry, my bad.

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: Razor, spamassassin - network test

2009-08-02 Thread Karsten Bräckelmann

Getting kind of a headache, trying to wrap my head around this confusing
mess. Anyway, here's my shot at this.

On Sun, 2009-08-02 at 03:31 -0700, an anonymous Nabble user wrote:
> > > When I use spamassassin -t -D razor2 < /tmp/spam
> > > so I dont get the hash and so on but content analysis
> > > details...bayes clasification and so on. I expected message like 

The -D razor2 option limits debugging to Razor. No Bayes "and so on"
debugging.

I believe you're ONLY looking at the end. Which, due to the -t option,
indeed does show an additional Content Analysis at the end. The Razor
debugging however is at the TOP. Have a careful look at ALL the output,
not only the end.

> debug: Razor is available
> debug: Razor Agents 1.20, protocol version 2.
> debug: Read server list from /home/jgb/.razor.lst
> debug: 72636 seconds before closest server discovery
> debug: Closest server is 209.204.62.150
> debug: Connecting to 209.204.62.150...
> debug: Connection established
> debug: Signature: 48e74b8496877ba45072b201b41eebed7038186b
> debug: Server version: 1.11, protocol version 2
> debug: Server response: Negative 48e74b8496877ba45072b201b41eebed7038186b
> debug: Message 1 NOT found in the catalogue

This is a straight copy from the wiki [1], explaining how to test Razor
is working. However, it's an *old* snippet. Do run the command and have
a look at the Razor debug output at the top.

It will be different, cause this snippet is really, really old. Note the
version and protocol. But it will get you all the debugging output.

> I dont have any idea howto do razor works. This command(spamassassin -t -D
> razor2 < /tmp/spam) is without --lint and its recommended by spamassassin
> www pages.so  I am begginer in this field and therefore I need accurate
> advise. 

That command is correct.

[1] http://wiki.apache.org/spamassassin/RazorHowToTell

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: SA-learn (spamassassin)

2009-08-02 Thread RW

On Sun, 02 Aug 2009 17:15:52 +0200
Karsten Bräckelmann  wrote:

> On Sun, 2009-08-02 at 02:00 +0100, RW wrote:
> > On Sun, 02 Aug 2009 01:42:21 +0200 Karsten Bräckelmann wrote:

> > > It's a counter-measure against bad learning, to force at least
> > > some MINIMAL manual training, before auto-learning kicks in. 

> > AFAIK it doesn't affect autoleaning at all, bayes_min_spam_num &
> > bayes_min_ham_num control when scoring starts.
> 
> Well, it *does* nonetheless. *shrug*
> 
> As per the docs, that threshold controls when Bayes activates. Nothing
> more, nothing less. Want to see for yourself?
>..
> X-Spam-Status: Yes, score=17.3 required=8.0
> tests=EMPTY_MESSAGE,MISSING_DATE,
> MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,
> NO_RELAYS,TVD_SPACE_RATIO autolearn=spam version=3.2.5
> 

If you read back you'll see that that's consistent with what I wrote and
the opposite of what you wrote.

I said that the limits don't affect autolearning, just scoring
(activation).

Whatever you think you wrote, what you actually wrote was:

 " to force at least some MINIMAL manual training, before
   auto-learning kicks in"

There's no ambiguity there, the use of the word "force" implies that
manual training is a prerequisite to auto-learning.

Re: SA-learn (spamassassin)

2009-08-02 Thread Karsten Bräckelmann

On Sun, 2009-08-02 at 04:36 -0700, an anonymous Nabble user wrote:
> I changed the value on "1"(I use this for testing and my self-learning its
> my homework). According to me - spam bayes learning was activated. When I
> use sa-learning so bayes learn that the mail is spam. And bayes learn the
> signatures...
> 
> Therefore is for me strange when I send the same mail again so bayes dont
> mark this mail like spam? I dont understand this. I realize all conditions -
> sa-learn  --spam --file  mail. "bayes_min_spam_num 1". The date the databaze
> was too changed(but the size stay the same). nspam was increased... I really
> dont understand what use is SA-LEARN! I have feel that the bayes dont work
> correctly- bayes ignore sa-learn. I am perhaps silly but I dont understand
> how it works:(( I am interesred how tell to bayes THIS MAIL IS SPAM(by using
> sa-learn), WHEN THIS SAME MAIL COME AGAIN SO YOU HAVE TO MARK LIKE SPAM! I
> know that bayes find similar element between mail and according to decide.
> But when I mark mail like spam a next mail have 100% similarity so bayes
> HAVE TO mark it like SPAM. It is logical acording to me.

Nope.  This is wrong. Bayes does not know the concept of a message, or
them being equal. It knows tokens.

Consider the following. You have 100 ham messages that contain the word
'foo' somewhere in the body, and you learn these messages as ham. You
then learn a spam message that contains the word 'foo' as its only Bayes
token. (Won't happen in reality, this is a stripped down example. ;)

So Bayes, a statistical analyzer, knows that 'foo' is a rather hammy
token with 100 sightings, and only rarely observed in spam. A single
time.

If you then ask Bayes for its opinion about the very same, just learned
spam message containing 'foo' as its only Bayes token, it will tell you
that it's *ham* with a very high confidence.
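The 'foo' example can be played through with a toy counter. This is a deliberately simplified sketch that looks at raw sighting counts only; the function names are made up, and it is not SpamAssassin's actual formula, which also takes the total number of learned ham and spam messages into account and combines many tokens per message, so real numbers will differ.

```python
from collections import Counter

ham_sightings = Counter()
spam_sightings = Counter()

def learn(tokens, is_spam):
    """Record one message: each distinct token counts once."""
    target = spam_sightings if is_spam else ham_sightings
    target.update(set(tokens))

def spamminess(token):
    """Share of sightings that were in spam; 0.5 means no opinion."""
    spam = spam_sightings[token]
    ham = ham_sightings[token]
    if spam + ham == 0:
        return 0.5  # token never seen before
    return spam / (spam + ham)

# 100 ham messages containing 'foo', then a single spam with 'foo'.
for _ in range(100):
    learn(["foo"], is_spam=False)
learn(["foo"], is_spam=True)

print(round(spamminess("foo"), 4))
```

Even though the last message learned was spam, the token's history is dominated by ham, so re-checking that same message yields a low value. There is no per-message memory anywhere in the structure.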

Please show us the output of 'sa-learn --dump magic'.

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: SA-learn (spamassassin)

2009-08-02 Thread Karsten Bräckelmann

On Sun, 2009-08-02 at 02:00 +0100, RW wrote:
> On Sun, 02 Aug 2009 01:42:21 +0200 Karsten Bräckelmann wrote:

> > > when I learn bayes by hand (sa-learn --spam --file mail) that this
> > > mail is spam? I have explicit set in local.cf bayes_min_spam_num 1.
> > > This means that for bayes is sufficient one mail for
> > > learning(according to me). But it dosesnt work.

> > Do NOT do that.
> > 
> > Unless you *really* understand the implications. Which you don't.
> > It's a default for a reason.
> > 
> > It's a counter-measure against bad learning, to force at least some
> > MINIMAL manual training, before auto-learning kicks in. You just side-
> > stepped that.
> 
> AFAIK it doesn't affect autoleaning at all, bayes_min_spam_num &
> bayes_min_ham_num control when scoring starts.

Well, it *does* nonetheless. *shrug*

As per the docs, that threshold controls when Bayes activates. Nothing
more, nothing less. Want to see for yourself?


$ echo | spamassassin --cf='score EMPTY_MESSAGE 6' --cf='score MISSING_DATE 6'

X-Spam-Status: Yes, score=17.3 required=8.0 tests=EMPTY_MESSAGE,MISSING_DATE,
  MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_HEADERS_MESSAGE,NO_RECEIVED,
  NO_RELAYS,TVD_SPACE_RATIO autolearn=spam version=3.2.5

$ sa-learn --dump magic
0.000  0  3  0  non-token data: bayes db version
0.000  0  2  0  non-token data: nspam
0.000  0  1  0  non-token data: nham
0.000  0 20  0  non-token data: ntokens
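For comparing the counters against the bayes_min_ham_num / bayes_min_spam_num limits, the dump above can be read with a short Python sketch. It assumes, based only on the sample shown here, that the third column carries the value and that the name follows "non-token data: "; parse_magic is a hypothetical helper, not part of SpamAssassin.

```python
sample = """\
0.000  0  3  0  non-token data: bayes db version
0.000  0  2  0  non-token data: nspam
0.000  0  1  0  non-token data: nham
0.000  0 20  0  non-token data: ntokens
"""

def parse_magic(text):
    """Map each 'non-token data' name to its integer value."""
    prefix = "non-token data: "
    values = {}
    for line in text.splitlines():
        fields = line.split(None, 4)  # four columns, then the label
        if len(fields) == 5 and fields[4].startswith(prefix):
            # Assumption: the third column holds the value.
            values[fields[4][len(prefix):]] = int(fields[2])
    return values

magic = parse_magic(sample)
print(magic["nspam"], magic["nham"], magic["ntokens"])
```

With nspam at 2 and nham at 1, this database is far below the default limit of 200 each, yet the header above still shows autolearn=spam.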


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: Network Tests / Rule Files Directories

2009-08-02 Thread Karsten Bräckelmann

On Sat, 2009-08-01 at 18:15 -0700, Stefan Malte Schumacher wrote:
> > Evidence that it's not working? Show us some SA headers. In this case, a
> > spam sample that triggered DCC, cause the Report header does show the
> > rule's score.

Hmm, I wasn't clear enough. :)  I meant an identified spam, where the
Report header is added. It isn't with that sample. Anyway...

> Here is an example with Razor2, but I guess the underlying problem is the
> same. 
> 
> http://www.pagan.mynetcologne.de/example-email

X-Spam-Status: No, score=2.2 required=5.0 tests=AWL,HTML_IMAGE_RATIO_04,
  HTML_MESSAGE,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E4_51_100,RAZOR2_CHECK,
  UNPARSEABLE_RELAY autolearn=no version=3.2.5

> As you can see, the message only gets a score of 2.2. In the beginning I
> believed that I made some embarrassing mistake with the rules concerning the
> network checks, but if you say these are okay the problem most likely lies
> somewhere else. 

AWL. Obviously, it counters the custom scores, based on the sender's
history. And it seems, the scores have been really low in the past.

  spamassassin -t < sample

What does that say at the bottom of the output, for this sample?


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}

Re: SA-learn (spamassassin)

2009-08-02 Thread RW

On Sun, 2 Aug 2009 04:36:34 -0700 (PDT)
monolit wrote:

> 

> I changed the value on "1"(I use this for testing and my
> self-learning its my homework). According to me - spam bayes learning
> was activated. When I use sa-learning so bayes learn that the mail is
> spam. And bayes learn the signatures...
> 
> Therefore is for me strange when I send the same mail again so bayes
> dont mark this mail like spam? I dont understand this.

What you said before was that you corrected autolearn=ham to spam with
sa-learn, another similar spam then also had autolearn=ham.

Autolearning is not based on the bayes result; it's based on other
SpamAssassin rules. However, it won't autolearn in the opposite
direction to a strong bayes result, which is why it's a good idea to
manually train first.

My guess is that you've fed it your one spam, but haven't
fed it enough ham to satisfy  bayes_min_ham_num, so there is no bayes
result and nothing to stop autolearning in the wrong direction.

It really is pretty useless to speculate about what it's doing when
you are misusing it like this. If you just want to play with it, then
feed it 10 hams and 10 spams and set the limits to 10. It won't be very
accurate, but it should behave sensibly.

Re: SA-learn (spamassassin)

2009-08-02 Thread monolit


I read the spamassassin docs... I found the following:
sa-learn
--spam
Learn the input message(s) as spam. If you have previously learnt any of
the messages as ham, SpamAssassin will forget them first, then re-learn them
as spam. Alternatively, if you have previously learnt them as spam, it'll
skip them this time around. If the messages have already been filtered
through SpamAssassin, the learner will ignore any modifications SpamAssassin
may have made. 

...and the following 

bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain
number of ham (non-spam) and spam have been learned. The default is 200 of
each ham and spam, but you can tune these up or down with these two
settings. 

I changed the value to "1" (I use this for testing and for teaching myself;
it's my homework). As I see it, spam Bayes learning was activated. When I
use sa-learn, Bayes learns that the mail is spam. And Bayes learns the
signatures...

So it is strange to me that when I send the same mail again, Bayes doesn't
mark this mail as spam. I don't understand this. I fulfilled all the
conditions: sa-learn  --spam --file  mail, "bayes_min_spam_num 1". The date
of the database also changed (but the size stayed the same). nspam
increased... I really don't understand what use sa-learn is! I have the
feeling that Bayes doesn't work correctly: Bayes ignores sa-learn. Perhaps I
am silly, but I don't understand how it works :(( I am interested in how to
tell Bayes THIS MAIL IS SPAM (by using sa-learn), SO WHEN THIS SAME MAIL
COMES AGAIN YOU HAVE TO MARK IT AS SPAM! I know that Bayes finds similar
elements between mails and decides accordingly. But when I mark a mail as
spam and the next mail has 100% similarity, Bayes HAS TO mark it as SPAM.
To me that is logical.

-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24777034.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.

Re: blacklisting a forger

2009-08-02 Thread mouss

Terry Carmen wrote:
>> On Sat, 1 Aug 2009 19:33:40 -0400
>> "Terry Carmen"  wrote:
>>
>>> The backscatter would not have been received, since the sender is on
>>> a number of RBLs.
>> It's the IP address of the botnet PC that's on the RBLs, the backscatter
>> doesn't come from there, it comes from the recipients of the spam.
>>
>> See:  http://en.wikipedia.org/wiki/Backscatter_(e-mail)
> 
> Regardless of whether or not the message was backscatter, The sending system
> (triband-mum-59.184.51.13.mtnl.net.in) is blacklisted,
> 

- bot at triband-* sent junk to silly.server.example.
- silly.server.example didn't reject it. instead it bounced it to OP
- the bounce includes infos about which host sent the original junk to
silly.server.example, and this is triband-*

so for OP, this is backscatter, and RBL/DNSBL is of no help.

Re: Razor, spamassassin - network test

2009-08-02 Thread monolit


I am really sorry, it was a mistake - I was very tired yesterday.

> Back on-list.  I'm not a personal help-line.

When I use spamassassin -t -D razor2 < /tmp/spam I don't get the hash and
so on, only the content analysis details... Bayes classification and so on.
I expected messages like:

debug: Razor is available
debug: Razor Agents 1.20, protocol version 2.
debug: Read server list from /home/jgb/.razor.lst
debug: 72636 seconds before closest server discovery
debug: Closest server is 209.204.62.150
debug: Connecting to 209.204.62.150...
debug: Connection established
debug: Signature: 48e74b8496877ba45072b201b41eebed7038186b
debug: Server version: 1.11, protocol version 2
debug: Server response: Negative 48e74b8496877ba45072b201b41eebed7038186b
debug: Message 1 NOT found in the catalogue

I have no idea how Razor works. This command (spamassassin -t -D
razor2 < /tmp/spam) is without --lint and is recommended by the spamassassin
web pages. I am a beginner in this field and therefore I need precise
advice.
Thanks for your help


-- 
View this message in context: 
http://www.nabble.com/Razor%2C-spamassassin---network-test-tp24773506p24776602.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
