Bayes always rejects.
Hello all,

I'm facing a strange problem. I've been feeding the Bayes DB for a while and now I would like to put it to use, but all messages get BAYES_99 and a very high spam score. I would like to understand why and troubleshoot this problem, but I can't find a way.

The SpamAssassin version is:

root@puma:~# spamassassin --version
SpamAssassin version 3.4.6
  running on Perl version 5.22.2

This is the output of sa-learn --dump magic:

root@puma:~# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0     130610          0  non-token data: nspam
0.000          0     316040          0  non-token data: nham
0.000          0     136493          0  non-token data: ntokens
0.000          0 1695915149          0  non-token data: oldest atime
0.000          0 1702447561          0  non-token data: newest atime
0.000          0 1702449197          0  non-token data: last journal sync atime
0.000          0 1701476495          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0      34998          0  non-token data: last expire reduction count

And this is the output of spamassassin --lint -D:

root@puma:~# spamassassin -D --lint 2>&1 | grep -i bay
Dec 13 07:39:07.885 [26545] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
Dec 13 07:39:08.005 [26545] dbg: config: fixed relative path: /var/lib/spamassassin/3.004006/updates_spamassassin_org/23_bayes.cf
Dec 13 07:39:08.005 [26545] dbg: config: using "/var/lib/spamassassin/3.004006/updates_spamassassin_org/23_bayes.cf" for included file
Dec 13 07:39:08.005 [26545] dbg: config: read file /var/lib/spamassassin/3.004006/updates_spamassassin_org/23_bayes.cf
Dec 13 07:39:08.047 [26545] dbg: config: fixed relative path: /var/lib/spamassassin/3.004006/updates_spamassassin_org/60_bayes_stopwords.cf
Dec 13 07:39:08.047 [26545] dbg: config: using "/var/lib/spamassassin/3.004006/updates_spamassassin_org/60_bayes_stopwords.cf" for included file
Dec 13 07:39:08.047 [26545] dbg: config: read file /var/lib/spamassassin/3.004006/updates_spamassassin_org/60_bayes_stopwords.cf
Dec 13 07:39:08.292 [26545] dbg: shortcircuit: adding BAYES_99 using abbreviation spam
Dec 13 07:39:08.292 [26545] dbg: shortcircuit: adding BAYES_00 using abbreviation ham
Dec 13 07:39:08.586 [26545] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x5cca570) implements 'learner_new', priority 0
Dec 13 07:39:08.586 [26545] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x5cca570), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Dec 13 07:39:08.594 [26545] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x6a51bb0)
Dec 13 07:39:08.594 [26545] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x5cca570) implements 'learner_is_scan_available', priority 0
Dec 13 07:39:08.595 [26545] dbg: bayes: tie-ing to DB file R/O /var/spamassasin/bayes_toks
Dec 13 07:39:08.595 [26545] dbg: bayes: tie-ing to DB file R/O /var/spamassasin/bayes_seen
Dec 13 07:39:08.595 [26545] dbg: bayes: found bayes db version 3
Dec 13 07:39:08.595 [26545] dbg: bayes: DB journal sync: last sync: 1702449197
Dec 13 07:39:08.621 [26545] dbg: bayes: DB journal sync: last sync: 1702449197
Dec 13 07:39:08.621 [26545] dbg: bayes: corpus size: nspam = 130610, nham = 316040
Dec 13 07:39:08.622 [26545] dbg: bayes: tokenized body: 120 tokens
Dec 13 07:39:08.622 [26545] dbg: bayes: tokenized uri: 0 tokens
Dec 13 07:39:08.622 [26545] dbg: bayes: tokenized invisible: 0 tokens
Dec 13 07:39:08.623 [26545] dbg: bayes: tokenized header: 14 tokens
Dec 13 07:39:08.623 [26545] dbg: bayes: score = 0.976034467829266
Dec 13 07:39:08.624 [26545] dbg: bayes: DB expiry: tokens in DB: 136493, Expiry max size: 15, Oldest atime: 1695915149, Newest atime: 1702447561, Last expire: 1701476495, Current time: 1702449548
Dec 13 07:39:08.624 [26545] dbg: bayes: DB journal sync: last sync: 1702449197
Dec 13 07:39:08.624 [26545] dbg: bayes: untie-ing
Dec 13 07:39:08.624 [26545] dbg: check: tagrun - tag BAYESTCHAMMY is now ready, value: 0
Dec 13 07:39:08.624 [26545] dbg: check: tagrun - tag BAYESTCSPAMMY is now ready, value: 2
Dec 13 07:39:08.624 [26545] dbg: check: tagrun - tag BAYESTCLEARNED is now ready, value: 4
Dec 13 07:39:08.624 [26545] dbg: check: tagrun - tag BAYESTC is now ready, value: 20
Dec 13 07:39:08.628 [26545] dbg: rules: ran eval rule BAYES_95 ==> got hit (1)
Dec 13 07:39:08.863 [26545] dbg: check: tests=BAYES_95,MISSING_DATE,MISSING_HEADERS,NO_RECEIVED,NO_RELAYS,T_SCC_BODY_TEXT_LINE
Dec 13 07:39:08.864 [26545] dbg: timing: total 1004 ms - init: 738 (73.5%), parse: 0.85 (0.1%), extract_message_metadata: 1.10 (0.1%), get_uri_detail_list: 3.9 (0.4%), tests_pri_-2000: 4.3 (0.4%), compile_gen: 85 (8.5%), compile_eval: 13 (1.3%), tests_pri_-1000: 3.6 (0.4%), tests_pri_-950: 2.8 (0.3%), tests_pri_-900: 4.2 (0.4%), tests_pri_-100: 7 (0.7%), check_bayes: 3.9 (0.4%), b_tokenize: 2.1
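
The atime values in the sa-learn --dump magic output above are Unix epoch seconds, which are hard to read at a glance. The following is a minimal Python sketch that converts them to UTC dates and works out a few derived figures. The numbers are copied from the output above; the helper names (to_utc, seconds_to_days, spam_fraction) are made up for this sketch, and reading "last expire atime delta" as a number of seconds is an assumption (it is at least consistent with the value being an exact multiple of 86400).

```python
from datetime import datetime, timezone

# Values copied from the `sa-learn --dump magic` output in the post above.
MAGIC = {
    "nspam": 130610,
    "nham": 316040,
    "ntokens": 136493,
    "oldest atime": 1695915149,
    "newest atime": 1702447561,
    "last journal sync atime": 1702449197,
    "last expiry atime": 1701476495,
    "last expire atime delta": 5529600,
}


def to_utc(epoch_seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def seconds_to_days(seconds):
    """Express a duration given in seconds as days (assumes the input is seconds)."""
    return seconds / 86400


def spam_fraction(nspam, nham):
    """Share of learned messages that were learned as spam."""
    return nspam / (nspam + nham)


if __name__ == "__main__":
    for key in ("oldest atime", "newest atime",
                "last journal sync atime", "last expiry atime"):
        print(f"{key:24s} {to_utc(MAGIC[key]).isoformat()}")

    span = MAGIC["newest atime"] - MAGIC["oldest atime"]
    print(f"token age span:          {seconds_to_days(span):.1f} days")
    print(f"last expire atime delta: "
          f"{seconds_to_days(MAGIC['last expire atime delta']):.1f} days")
    print(f"spam share of corpus:    "
          f"{spam_fraction(MAGIC['nspam'], MAGIC['nham']):.1%}")
```

This only restates the posted numbers in a more readable form; it does not by itself explain the high Bayes score.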
Re: some problem with spam
Hi,

Thanks, I will try this ruleset.

On 12.12.2023 at 14:59, Jimmy wrote:
> These rules should match:
>
> rawbody __DOUBLE_HTML /<\/a>\s*/
> uri __LONG_LINK_URL /https?:\/\/.{50,128}\.[a-z]{2,}\/\.[a-z]{2,}\//i
Re: some problem with spam
These rules should match:

rawbody __DOUBLE_HTML /<\/a>\s*/
uri __LONG_LINK_URL /https?:\/\/.{50,128}\.[a-z]{2,}\/\.[a-z]{2,}\//i

On Tue, Dec 12, 2023 at 8:44 PM natan wrote:
> Hi,
> Thanks, but the link is random too, like:
>
> https://paste.debian.net/1300874/
>
> On 12.12.2023 at 12:21, Jimmy wrote:
>
> uri __ADB_CPN_LINK /\.campaign\.adobe\.com\/r\/\?/
> rawbody __IMG_SRC_CID /
> meta ADB_CPN_ABUSE __ADB_CPN_LINK && __IMG_SRC_CID
> describe ADB_CPN_ABUSE Possible malware link
> score ADB_CPN_ABUSE 2.5000
>
> Establishing a rule for "CONFIDENTIALITY NOTICE" is ineffective; it can
> cause false positives. Since I don't have visibility into all headers,
> consider creating rules based on specific headers, or other rules that
> match these messages. Append these rules to the meta-rule and boost the
> overall score accordingly.
>
> Jimmy
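
To get a feel for what the __LONG_LINK_URL pattern above requires, it can be tried outside SpamAssassin. This is a rough Python sketch, not a substitute for testing with spamassassin itself: the pattern is a hand translation of the Perl regex (the /i flag becomes re.IGNORECASE), the function name hits_long_link_url is invented here, and all sample URLs are hypothetical. SpamAssassin applies uri rules to the URIs it extracts from a message, which this sketch does not reproduce.

```python
import re

# Hand translation of the `uri __LONG_LINK_URL` pattern from the message above.
# Perl's trailing /i becomes re.IGNORECASE; "\/" is simply "/" in Python.
LONG_LINK_URL = re.compile(
    r"https?://.{50,128}\.[a-z]{2,}/\.[a-z]{2,}/",
    re.IGNORECASE,
)


def hits_long_link_url(url):
    """Return True if the translated pattern matches anywhere in the URL."""
    return LONG_LINK_URL.search(url) is not None


# Hypothetical URLs, invented for this sketch (not taken from the pastes).
LONG_WITH_DOT_PATH = "https://" + "a" * 60 + ".example/.ab/c"
SHORT_WITH_DOT_PATH = "https://example.com/.ab/"
LONG_WITHOUT_DOT_PATH = "https://" + "a" * 60 + ".example/path/"

if __name__ == "__main__":
    for url in (LONG_WITH_DOT_PATH, SHORT_WITH_DOT_PATH, LONG_WITHOUT_DOT_PATH):
        print(hits_long_link_url(url), url)
```

As written, the pattern needs at least 50 characters after the scheme and then a path segment that itself starts with a dot, so a long URL alone is not enough to trigger it.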
Re: some problem with spam
Hi,

Thanks, but the link is random too, like:

https://paste.debian.net/1300874/

On 12.12.2023 at 12:21, Jimmy wrote:
> uri __ADB_CPN_LINK /\.campaign\.adobe\.com\/r\/\?/
> rawbody __IMG_SRC_CID /
> meta ADB_CPN_ABUSE __ADB_CPN_LINK && __IMG_SRC_CID
> describe ADB_CPN_ABUSE Possible malware link
> score ADB_CPN_ABUSE 2.5000
>
> Establishing a rule for "CONFIDENTIALITY NOTICE" is ineffective; it can
> cause false positives. Since I don't have visibility into all headers,
> consider creating rules based on specific headers, or other rules that
> match these messages. Append these rules to the meta-rule and boost the
> overall score accordingly.
>
> Jimmy
Re: some problem with spam
uri __ADB_CPN_LINK /\.campaign\.adobe\.com\/r\/\?/
rawbody __IMG_SRC_CID /
meta ADB_CPN_ABUSE __ADB_CPN_LINK && __IMG_SRC_CID
describe ADB_CPN_ABUSE Possible malware link
score ADB_CPN_ABUSE 2.5000

Establishing a rule for "CONFIDENTIALITY NOTICE" is ineffective; it can cause false positives. Since I don't have visibility into all headers, consider creating rules based on specific headers, or other rules that match these messages. Append these rules to the meta-rule and boost the overall score accordingly.

Jimmy

On Tue, Dec 12, 2023 at 5:53 PM natan wrote:
> Hi,
>
> I have SpamAssassin version 3.4.6 and I am trying to resolve two problems.
>
> 1) I put spam .eml files in a directory and train SA like this:
>
> sa-learn --spam /root/spamik/
>
> There are 4 e-mails in /root/spamik/. It works great, but after 7 days I
> must train again, as if SA forgot what it had learned.
>
> 2) I have a problem with one type of spam, like this:
> https://paste.debian.net/1300865/
> because:
> contents - random
> from - random
> IP - random
>
> The structure is only somewhat similar: base64 + HTML and PNG.
> All were signed with DKIM.
>
> I had to work around it in the following way, but it is not a solution:
>
> rawbody EMAIL_20231207 /(necessary to delete the message completely|email message and any attachments are intended|automatically archived by Mimecast|sender and take the steps necessary)/i
> describe EMAIL_20231207 Spam fake IQ password
> score EMAIL_20231207 2
>
> rawbody EMAIL_20231207_1 /FONT\-FAMILY\:Arial/
> score EMAIL_20231207_1 0.1
> rawbody EMAIL_20231207_2 /BORDER-LEFT\:0\;MARGIN\:0\;PADDING-RIGHT\:0\;BACKGROUND\-COLOR\:white\;font\-stretch\:inherit/
> meta EMAIL_20231207_ALL IQ_EMAIL_20231207_1 && IQ_EMAIL_20231207_2 && KAM_HTML_FONT_INVALID && MIME_HTML_ONLY
> score EMAIL_20231207_ALL 2
>
> Any ideas?
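
One way to gauge how broad the EMAIL_20231207 rule quoted above is: run its alternation against a few lines of text by hand. The sketch below is only an approximation in Python. The pattern is copied from the rule, but the function name matched_phrases and both sample texts are hypothetical, and it ignores how SpamAssassin decodes and splits the body before applying rawbody rules.

```python
import re

# Alternation copied from the `rawbody EMAIL_20231207` rule quoted above;
# Perl's /i flag becomes re.IGNORECASE.
DISCLAIMER = re.compile(
    r"(necessary to delete the message completely"
    r"|email message and any attachments are intended"
    r"|automatically archived by Mimecast"
    r"|sender and take the steps necessary)",
    re.IGNORECASE,
)


def matched_phrases(text):
    """Return the disclaimer phrases found in the text, lowercased."""
    return [m.group(1).lower() for m in DISCLAIMER.finditer(text)]


# Hypothetical message bodies, invented for this sketch.
ORDINARY_MAIL_WITH_FOOTER = (
    "Hi team, the minutes are attached. "
    "This email message and any attachments are intended only for the named recipient."
)
ORDINARY_MAIL_NO_FOOTER = "Hi team, the minutes are attached. See you on Monday."

if __name__ == "__main__":
    print(matched_phrases(ORDINARY_MAIL_WITH_FOOTER))
    print(matched_phrases(ORDINARY_MAIL_NO_FOOTER))
```

Any mail carrying one of these footer phrases would hit the rule in this simplified model, regardless of what else it contains, so the rule probably works better as one input to a meta rule than with a score of its own.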
some problem with spam
Hi,

I have SpamAssassin version 3.4.6 and I am trying to resolve two problems.

1) I put spam .eml files in a directory and train SA like this:

sa-learn --spam /root/spamik/

There are 4 e-mails in /root/spamik/. It works great, but after 7 days I must train again, as if SA forgot what it had learned.

2) I have a problem with one type of spam, like this:
https://paste.debian.net/1300865/
because:
contents - random
from - random
IP - random

The structure is only somewhat similar: base64 + HTML and PNG.
All were signed with DKIM.

I had to work around it in the following way, but it is not a solution:

rawbody EMAIL_20231207 /(necessary to delete the message completely|email message and any attachments are intended|automatically archived by Mimecast|sender and take the steps necessary)/i
describe EMAIL_20231207 Spam fake IQ password
score EMAIL_20231207 2

rawbody EMAIL_20231207_1 /FONT\-FAMILY\:Arial/
score EMAIL_20231207_1 0.1
rawbody EMAIL_20231207_2 /BORDER-LEFT\:0\;MARGIN\:0\;PADDING-RIGHT\:0\;BACKGROUND\-COLOR\:white\;font\-stretch\:inherit/
meta EMAIL_20231207_ALL IQ_EMAIL_20231207_1 && IQ_EMAIL_20231207_2 && KAM_HTML_FONT_INVALID && MIME_HTML_ONLY
score EMAIL_20231207_ALL 2

Any ideas?
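
A meta rule can only fire if every sub-rule it names exists, so it may be worth checking the names in the snippet above against each other. The following is a small Python sketch that does this for a pasted fragment. The long regex bodies are shortened to /.../ because only rule names matter here, the helper names are invented for this sketch, and the list of rule types it recognises is deliberately minimal.

```python
import re

# Rule lines from the post above, with long patterns shortened to /.../
# (only the rule names are relevant to this check).
SNIPPET = r"""
rawbody EMAIL_20231207 /.../i
describe EMAIL_20231207 Spam fake IQ password
score EMAIL_20231207 2
rawbody EMAIL_20231207_1 /FONT\-FAMILY\:Arial/
score EMAIL_20231207_1 0.1
rawbody EMAIL_20231207_2 /.../
meta EMAIL_20231207_ALL IQ_EMAIL_20231207_1 && IQ_EMAIL_20231207_2 && KAM_HTML_FONT_INVALID && MIME_HTML_ONLY
score EMAIL_20231207_ALL 2
"""

# Keywords treated as "this line defines a rule" (a minimal list for this sketch).
RULE_TYPES = {"header", "body", "rawbody", "uri", "full", "meta"}


def defined_rules(text):
    """Names of rules defined in the fragment itself."""
    names = set()
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] in RULE_TYPES:
            names.add(parts[1])
    return names


def meta_dependencies(text):
    """Map each meta rule name to the rule names used in its expression."""
    deps = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "meta":
            deps[parts[1]] = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", parts[2]))
    return deps


def undefined_locally(text):
    """For each meta rule, list the dependencies not defined in the fragment."""
    known = defined_rules(text)
    return {name: sorted(d - known) for name, d in meta_dependencies(text).items()}


if __name__ == "__main__":
    for meta, missing in undefined_locally(SNIPPET).items():
        print(meta, "->", missing)
```

KAM_HTML_FONT_INVALID and MIME_HTML_ONLY show up as missing only because they are defined outside the pasted fragment. The two IQ_-prefixed names, however, do not match the EMAIL_20231207_1 and EMAIL_20231207_2 rules defined next to them; whether that is a paste artifact or a real mismatch in the running config is not clear from the post.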