Bayes always reject.

2023-12-12 Thread Pierluigi Frullani
Hello all,
I'm facing a strange problem.
I've been feeding the Bayes DB for a while and would now like to put it to
use, but every message gets BAYES_99 and a very high spam score.
I would like to understand why and troubleshoot the problem, but I can't
find a way.
The SpamAssassin version is:
root@puma:~# spamassassin --version
SpamAssassin version 3.4.6
  running on Perl version 5.22.2
This is the output of sa-learn --dump magic:
root@puma:~# sa-learn --dump magic
0.000  0  3  0  non-token data: bayes db version
0.000  0  130610  0  non-token data: nspam
0.000  0  316040  0  non-token data: nham
0.000  0  136493  0  non-token data: ntokens
0.000  0  1695915149  0  non-token data: oldest atime
0.000  0  1702447561  0  non-token data: newest atime
0.000  0  1702449197  0  non-token data: last journal sync atime
0.000  0  1701476495  0  non-token data: last expiry atime
0.000  0  5529600  0  non-token data: last expire atime delta
0.000  0  34998  0  non-token data: last expire reduction count
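
To put the counters above in perspective, here is a minimal Python sketch (not part of the original post) that converts the epoch atimes to UTC dates and computes the share of spam among the learned messages. The numbers are copied from the dump; the dictionary and helper names are purely illustrative.

```python
from datetime import datetime, timezone

# Values copied from the `sa-learn --dump magic` output above.
magic = {
    "nspam": 130610,
    "nham": 316040,
    "ntokens": 136493,
    "oldest atime": 1695915149,
    "newest atime": 1702447561,
}


def to_utc(epoch: int) -> str:
    """Render a Unix timestamp as an ISO 8601 UTC string."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


oldest = to_utc(magic["oldest atime"])
newest = to_utc(magic["newest atime"])

# Time between the oldest and newest token access, in days.
span_days = (magic["newest atime"] - magic["oldest atime"]) / 86400

# Fraction of learned messages that were learned as spam.
spam_share = magic["nspam"] / (magic["nspam"] + magic["nham"])

print(f"token atimes span {oldest} .. {newest} ({span_days:.1f} days)")
print(f"spam share of learned messages: {spam_share:.1%}")
```

Both counters are well above the 200 spam / 200 ham that SpamAssassin requires by default before it uses Bayes at all, so the database is at least large enough to be consulted.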
And this is the output of spamassassin -D --lint, filtered for Bayes lines:
root@puma:~# spamassassin -D --lint  2>&1 | grep -i bay
Dec 13 07:39:07.885 [26545] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
Dec 13 07:39:08.005 [26545] dbg: config: fixed relative path: /var/lib/spamassassin/3.004006/updates_spamassassin_org/23_bayes.cf
Dec 13 07:39:08.005 [26545] dbg: config: using "/var/lib/spamassassin/3.004006/updates_spamassassin_org/23_bayes.cf" for included file
Dec 13 07:39:08.005 [26545] dbg: config: read file /var/lib/spamassassin/3.004006/updates_spamassassin_org/23_bayes.cf
Dec 13 07:39:08.047 [26545] dbg: config: fixed relative path: /var/lib/spamassassin/3.004006/updates_spamassassin_org/60_bayes_stopwords.cf
Dec 13 07:39:08.047 [26545] dbg: config: using "/var/lib/spamassassin/3.004006/updates_spamassassin_org/60_bayes_stopwords.cf" for included file
Dec 13 07:39:08.047 [26545] dbg: config: read file /var/lib/spamassassin/3.004006/updates_spamassassin_org/60_bayes_stopwords.cf
Dec 13 07:39:08.292 [26545] dbg: shortcircuit: adding BAYES_99 using abbreviation spam
Dec 13 07:39:08.292 [26545] dbg: shortcircuit: adding BAYES_00 using abbreviation ham
Dec 13 07:39:08.586 [26545] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x5cca570) implements 'learner_new', priority 0
Dec 13 07:39:08.586 [26545] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x5cca570), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Dec 13 07:39:08.594 [26545] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x6a51bb0)
Dec 13 07:39:08.594 [26545] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x5cca570) implements 'learner_is_scan_available', priority 0
Dec 13 07:39:08.595 [26545] dbg: bayes: tie-ing to DB file R/O /var/spamassasin/bayes_toks
Dec 13 07:39:08.595 [26545] dbg: bayes: tie-ing to DB file R/O /var/spamassasin/bayes_seen
Dec 13 07:39:08.595 [26545] dbg: bayes: found bayes db version 3
Dec 13 07:39:08.595 [26545] dbg: bayes: DB journal sync: last sync: 1702449197
Dec 13 07:39:08.621 [26545] dbg: bayes: DB journal sync: last sync: 1702449197
Dec 13 07:39:08.621 [26545] dbg: bayes: corpus size: nspam = 130610, nham = 316040
Dec 13 07:39:08.622 [26545] dbg: bayes: tokenized body: 120 tokens
Dec 13 07:39:08.622 [26545] dbg: bayes: tokenized uri: 0 tokens
Dec 13 07:39:08.622 [26545] dbg: bayes: tokenized invisible: 0 tokens
Dec 13 07:39:08.623 [26545] dbg: bayes: tokenized header: 14 tokens
Dec 13 07:39:08.623 [26545] dbg: bayes: score = 0.976034467829266
Dec 13 07:39:08.624 [26545] dbg: bayes: DB expiry: tokens in DB: 136493, Expiry max size: 15, Oldest atime: 1695915149, Newest atime: 1702447561, Last expire: 1701476495, Current time: 1702449548
Dec 13 07:39:08.624 [26545] dbg: bayes: DB journal sync: last sync: 1702449197
Dec 13 07:39:08.624 [26545] dbg: bayes: untie-ing
Dec 13 07:39:08.624 [26545] dbg: check: tagrun - tag BAYESTCHAMMY is now ready, value: 0
Dec 13 07:39:08.624 [26545] dbg: check: tagrun - tag BAYESTCSPAMMY is now ready, value: 2
Dec 13 07:39:08.624 [26545] dbg: check: tagrun - tag BAYESTCLEARNED is now ready, value: 4
Dec 13 07:39:08.624 [26545] dbg: check: tagrun - tag BAYESTC is now ready, value: 20
Dec 13 07:39:08.628 [26545] dbg: rules: ran eval rule BAYES_95 ==> got hit (1)
Dec 13 07:39:08.863 [26545] dbg: check: tests=BAYES_95,MISSING_DATE,MISSING_HEADERS,NO_RECEIVED,NO_RELAYS,T_SCC_BODY_TEXT_LINE
Dec 13 07:39:08.864 [26545] dbg: timing: total 1004 ms - init: 738 (73.5%), parse: 0.85 (0.1%), extract_message_metadata: 1.10 (0.1%), get_uri_detail_list: 3.9 (0.4%), tests_pri_-2000: 4.3 (0.4%), compile_gen: 85 (8.5%), compile_eval: 13 (1.3%), tests_pri_-1000: 3.6 (0.4%), tests_pri_-950: 2.8 (0.3%), tests_pri_-900: 4.2 (0.4%), tests_pri_-100: 7 (0.7%), check_bayes: 3.9 (0.4%), b_tokenize: 2.1 
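
The log above reports a Bayes score of 0.976034467829266 and a hit on BAYES_95. The following Python sketch (not from the original post) shows how a probability could map to a BAYES_xx rule. The bucket boundaries are an assumption, written from memory of the stock 23_bayes.cf, as is the choice of an exclusive lower and inclusive upper bound; check the file on your own system before relying on it. The stock rules may also define an overlapping BAYES_999, which is left out here. Keep in mind that --lint scores a built-in dummy message, so this particular value says little about real mail.

```python
# Assumed probability ranges for the stock BAYES_xx rules (from memory of
# 23_bayes.cf, not verified against the poster's installation).
BUCKETS = [
    ("BAYES_00", 0.00, 0.01),
    ("BAYES_05", 0.01, 0.05),
    ("BAYES_20", 0.05, 0.20),
    ("BAYES_40", 0.20, 0.40),
    ("BAYES_50", 0.40, 0.60),
    ("BAYES_60", 0.60, 0.80),
    ("BAYES_80", 0.80, 0.95),
    ("BAYES_95", 0.95, 0.99),
    ("BAYES_99", 0.99, 1.00),
]


def bayes_rule(score: float) -> str:
    """Return the name of the bucket that a Bayes probability falls into."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("Bayes probability must be within [0, 1]")
    for name, low, high in BUCKETS:
        # Assumed semantics: lower bound exclusive (except 0), upper inclusive.
        if (low == 0.0 or score > low) and score <= high:
            return name
    raise AssertionError("unreachable: the buckets cover [0, 1]")


# Score taken from the debug log above.
print(bayes_rule(0.976034467829266))
```

Under these assumed boundaries the logged score lands in the same bucket that the log reports as hit.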

Re: some problem with spam

2023-12-12 Thread natan

Hi,
Thanks, I will try this ruleset.

On 12.12.2023 at 14:59, Jimmy wrote:

These rules should match:

rawbody __DOUBLE_HTML   /<\/a>\s*/
uri     __LONG_LINK_URL /https?:\/\/.{50,128}\.[a-z]{2,}\/\.[a-z]{2,}\//i






Re: some problem with spam

2023-12-12 Thread Jimmy
These rules should match:

rawbody __DOUBLE_HTML   /<\/a>\s*/
uri     __LONG_LINK_URL /https?:\/\/.{50,128}\.[a-z]{2,}\/\.[a-z]{2,}\//i
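
As a quick sanity check of the __LONG_LINK_URL pattern above, here is a minimal Python sketch (not part of the original message). It translates the Perl regex to Python's `re` syntax and tries it on two hypothetical URLs that were made up for illustration; they are not taken from the spam samples in this thread.

```python
import re

# Python translation of the __LONG_LINK_URL pattern quoted above
# (the trailing /i flag becomes re.IGNORECASE).
LONG_LINK_URL = re.compile(
    r"https?://.{50,128}\.[a-z]{2,}/\.[a-z]{2,}/",
    re.IGNORECASE,
)

# Hypothetical URLs, invented for this example only.
long_url = "https://" + "a" * 60 + ".example.com/.abc/index.html"
short_url = "https://example.com/.abc/index.html"

long_hit = bool(LONG_LINK_URL.search(long_url))
short_hit = bool(LONG_LINK_URL.search(short_url))

print("long URL matches:", long_hit)
print("short URL matches:", short_hit)
```

Note that the pattern needs at least 50 characters after the scheme and a path segment that starts with a dot, so ordinary short links are left alone.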



On Tue, Dec 12, 2023 at 8:44 PM natan  wrote:

> Hi,
> Thanks, but the link is random too, for example:
>
> https://paste.debian.net/1300874/
>
>
> On 12.12.2023 at 12:21, Jimmy wrote:
>
>
> uri __ADB_CPN_LINK /\.campaign\.adobe\.com\/r\/\?/
> rawbody __IMG_SRC_CID   /
> meta ADB_CPN_ABUSE __ADB_CPN_LINK && __IMG_SRC_CID
> describe ADB_CPN_ABUSE Possible malware link
> score ADB_CPN_ABUSE 2.5000
>
> Creating a rule for "CONFIDENTIALITY NOTICE" is ineffective because it can
> cause false positives. Since I can't see all the headers, consider creating
> rules based on specific headers, or other rules that match these messages.
> Add those rules to the meta rule and raise the overall score accordingly.
>
> Jimmy
>
>
> On Tue, Dec 12, 2023 at 5:53 PM natan  wrote:
>


Re: some problem with spam

2023-12-12 Thread natan

Hi,
Thanks, but the link is random too, for example:

https://paste.debian.net/1300874/


On 12.12.2023 at 12:21, Jimmy wrote:


uri     __ADB_CPN_LINK /\.campaign\.adobe\.com\/r\/\?/
rawbody __IMG_SRC_CID   /
meta ADB_CPN_ABUSE __ADB_CPN_LINK && __IMG_SRC_CID
describe ADB_CPN_ABUSE Possible malware link
score ADB_CPN_ABUSE 2.5000

Creating a rule for "CONFIDENTIALITY NOTICE" is ineffective because it can
cause false positives. Since I can't see all the headers, consider creating
rules based on specific headers, or other rules that match these messages.
Add those rules to the meta rule and raise the overall score accordingly.


Jimmy
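
To illustrate how the ADB_CPN_ABUSE meta rule suggested in this thread combines its sub-rules, here is a minimal Python sketch (not part of the original message). Only the __ADB_CPN_LINK pattern is translated, because the __IMG_SRC_CID pattern is truncated in the archive; its result is therefore passed in as a plain boolean. The function name and the sample URLs are hypothetical.

```python
import re

# Python translation of the __ADB_CPN_LINK uri pattern from the thread.
ADB_CPN_LINK = re.compile(r"\.campaign\.adobe\.com/r/\?")

# Taken from "score ADB_CPN_ABUSE 2.5000" in the thread.
ADB_CPN_ABUSE_SCORE = 2.5


def adb_cpn_abuse(uris, img_src_cid_hit):
    """Return the score the meta rule would add for one message.

    uris: the links found in the message.
    img_src_cid_hit: whether __IMG_SRC_CID matched; supplied by the caller
    because that pattern did not survive in the archive.
    """
    link_hit = any(ADB_CPN_LINK.search(uri) for uri in uris)
    # A meta rule with "&&" fires only when every sub-rule fired.
    return ADB_CPN_ABUSE_SCORE if (link_hit and img_src_cid_hit) else 0.0


# Hypothetical links, invented for this example only.
print(adb_cpn_abuse(["https://t.info.campaign.adobe.com/r/?id=h1"], True))
print(adb_cpn_abuse(["https://t.info.campaign.adobe.com/r/?id=h1"], False))
print(adb_cpn_abuse(["https://example.com/r/?id=h1"], True))
```

Adding further sub-rules, for example on specific headers, would extend the `and` chain in the same way and make the combined hit more specific before the score is raised.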




Re: some problem with spam

2023-12-12 Thread Jimmy
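uri __ADB_CPN_LINK /\.campaign\.adobe\.com\/r\/\?/
rawbody __IMG_SRC_CID   /
meta ADB_CPN_ABUSE __ADB_CPN_LINK && __IMG_SRC_CID
describe ADB_CPN_ABUSE Possible malware link
score ADB_CPN_ABUSE 2.5000

Creating a rule for "CONFIDENTIALITY NOTICE" is ineffective because it can
cause false positives. Since I can't see all the headers, consider creating
rules based on specific headers, or other rules that match these messages.
Add those rules to the meta rule and raise the overall score accordingly.

Jimmy

On Tue, Dec 12, 2023 at 5:53 PM natan wrote: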

> Hi,
> I am running SpamAssassin version 3.4.6
>
> and I am trying to solve two problems.
>
> 1) I put spam .eml files in a directory and train SA like this:
> sa-learn --spam /root/spamik/
>
> /root/spamik/ contains 4 e-mails.
> This works great, but after 7 days I have to train again, as if SA forgot
> what it had learned.
>
> 2) I have a problem with one type of spam, for example:
> https://paste.debian.net/1300865/
> because:
> contents - random
> from - random
> IP - random
>
> The structure is only roughly similar each time: base64 + HTML and PNG.
> All of the messages were signed with DKIM.
>
> I had to work around it in the following way, but it is not a real solution:
>
> rawbody  EMAIL_20231207    /(necessary to delete the message completely|email message and any attachments are intended|automatically archived by Mimecast|sender and take the steps necessary)/i
> describe EMAIL_20231207    Spam fake IQ password
> score    EMAIL_20231207    2
>
> rawbody  EMAIL_20231207_1   /FONT\-FAMILY\:Arial/
> score    EMAIL_20231207_1   0.1
> rawbody  EMAIL_20231207_2   /BORDER-LEFT\:0\;MARGIN\:0\;PADDING-RIGHT\:0\;BACKGROUND\-COLOR\:white\;font\-stretch\:inherit/
> meta     EMAIL_20231207_ALL EMAIL_20231207_1 && EMAIL_20231207_2 && KAM_HTML_FONT_INVALID && MIME_HTML_ONLY
> score    EMAIL_20231207_ALL 2
>
> Any ideas?


some problem with spam

2023-12-12 Thread natan

Hi,
I am running SpamAssassin version 3.4.6

and I am trying to solve two problems.

1) I put spam .eml files in a directory and train SA like this:
sa-learn --spam /root/spamik/

/root/spamik/ contains 4 e-mails.
This works great, but after 7 days I have to train again, as if SA forgot
what it had learned.

2) I have a problem with one type of spam, for example:
https://paste.debian.net/1300865/
because:
contents - random
from - random
IP - random

The structure is only roughly similar each time: base64 + HTML and PNG.
All of the messages were signed with DKIM.

I had to work around it in the following way, but it is not a real solution:

rawbody  EMAIL_20231207    /(necessary to delete the message completely|email message and any attachments are intended|automatically archived by Mimecast|sender and take the steps necessary)/i
describe EMAIL_20231207    Spam fake IQ password
score    EMAIL_20231207    2

rawbody  EMAIL_20231207_1   /FONT\-FAMILY\:Arial/
score    EMAIL_20231207_1   0.1
rawbody  EMAIL_20231207_2   /BORDER-LEFT\:0\;MARGIN\:0\;PADDING-RIGHT\:0\;BACKGROUND\-COLOR\:white\;font\-stretch\:inherit/
meta     EMAIL_20231207_ALL EMAIL_20231207_1 && EMAIL_20231207_2 && KAM_HTML_FONT_INVALID && MIME_HTML_ONLY
score    EMAIL_20231207_ALL 2

Any ideas?

--