Re: why does that mail not get any bayes-classification
On 12.06.2016 at 00:27, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 9:31 AM:
>> headers don't help when you have a "spamd: result" log-line with a ton
>
> Ah, finally I understand what you are trying to do! You analyze the spamd result log lines, and they currently have two deficiencies:
>
> 1) They do not distinguish between Bayes failing and Bayes simply not finding significant tokens in the message;
> 2) They don't provide a way of matching the result log line to the actual message when the message does not contain an mid.
>
> Your proposed solution of a BAYES_NOTOK rule would solve the first one. The second is a bit trickier. Really there ought to be a way to configure custom output in the spamd result log line, or to have a rule that can include some information in addition to its name and score in its report

both would require small changes in SA itself

the second is not really trickier: when the code is able to put the MID in the log line, then it also has the information about sender/rcpt and just doesn't put it into the logs
Re: why does that mail not get any bayes-classification
Reindl Harald wrote on 12/06/16 9:31 AM:
> headers don't help when you have a "spamd: result" log-line with a ton

Ah, finally I understand what you are trying to do! You analyze the spamd result log lines, and they currently have two deficiencies:

1) They do not distinguish between Bayes failing and Bayes simply not finding significant tokens in the message;
2) They don't provide a way of matching the result log line to the actual message when the message does not contain an mid.

Your proposed solution of a BAYES_NOTOK rule would solve the first one. The second is a bit trickier. Really there ought to be a way to configure custom output in the spamd result log line, or to have a rule that can include some information in addition to its name and score in its report.

Sidney
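The two deficiencies can be illustrated with a small log-scanning sketch. Everything about the input is an assumption: the two sample lines are made up, and their layout (verdict, score, comma-separated rule list, then key=value fields with an optional mid=) is only the shape such "spamd: result" lines commonly take, so the patterns may need adjusting for a real log.

```python
import re

# Hypothetical sample lines; the field layout is an assumption, not a copy
# of a real spamd log.
SAMPLE_LOG = """\
spamd: result: Y 28 - BAYES_99,BAYES_999,RAZOR2_CHECK scantime=1.2,size=4096,mid=<abc@example.org>,autolearn=no
spamd: result: . 1 - HTML_MESSAGE,SPF_SOFTFAIL scantime=0.4,size=2048,autolearn=no
"""

RESULT_RE = re.compile(
    r"spamd: result: (?P<verdict>\S) (?P<score>-?\d+(?:\.\d+)?) - "
    r"(?P<rules>\S*) (?P<fields>.*)$"
)
MID_RE = re.compile(r"(?:^|,)mid=(<[^>]*>)")
BAYES_RE = re.compile(r"^BAYES_\d+$")


def parse_result_line(line):
    """Split one result line into verdict, score, rules and message-id."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    rules = [r for r in m.group("rules").split(",") if r]
    mid = MID_RE.search(m.group("fields"))
    return {
        "spam": m.group("verdict") == "Y",
        "score": float(m.group("score")),
        "rules": rules,
        # Deficiency 1: False here does not say *why* Bayes is missing.
        "has_bayes_rule": any(BAYES_RE.match(r) for r in rules),
        # Deficiency 2: None here leaves nothing to grep the MTA log for.
        "mid": mid.group(1) if mid else None,
    }


parsed = [parse_result_line(line) for line in SAMPLE_LOG.splitlines()]
```

The second sample line shows both problems at once: no BAYES_NN rule and no mid, so the line neither explains the missing Bayes result nor points back to a message.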
Re: why does that mail not get any bayes-classification
On 11.06.2016 at 23:26, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 9:08 AM:
>> and it's not worth discussing since the *real* solution would be a "BAYES_NOTOKS" which would appear *everywhere* and clearly explain why no other BAYES_XX is present
>
> I can't argue with that. Without the ability to make it a rule or a meta-rule that would only show if there are no BAYES_NN rules triggered, adding the tags to the report unconditionally would not be the same as a BAYES_NOTOK rule. As far as I can tell implementing BAYES_NOTOK would require a (small) change in the Bayes plugin. It could not be done in the configuration file or by writing a new rule. So you are right about that.

that's exactly the point

> However, what you said about _SENDERDOMAIN_ and _AUTHORDOMAIN_ could be handled by adding a report line that contains those tags just before or just after the report _SUMMARY_ line in the configuration

headers don't help when you have a "spamd: result" log line with a ton of rules, or when a new rule you are trying out appears and the message has no message-id, since your only anchor is the mid=<> part of the log line, from which you can grep the other relevant MTA lines and find out who the sender was, who the rcpt was and where that message arrived from at all

keep in mind: you get all those headers only in your own mails; they don't help you much as a sysadmin for a lot of users, where you try to find out whether rules need to be rescored in whatever direction
Re: why does that mail not get any bayes-classification
Reindl Harald wrote on 12/06/16 9:08 AM:
> and it's not worth to discuss since the *real* solution would be a "BAYES_NOTOKS" which would appear *everywhere* and clearly explain why no other BAYES_XX is present

I can't argue with that. Without the ability to make it a rule or a meta-rule that would only show if there are no BAYES_NN rules triggered, adding the tags to the report unconditionally would not be the same as a BAYES_NOTOK rule. As far as I can tell implementing BAYES_NOTOK would require a (small) change in the Bayes plugin. It could not be done in the configuration file or by writing a new rule. So you are right about that.

However, what you said about _SENDERDOMAIN_ and _AUTHORDOMAIN_ could be handled by adding a report line that contains those tags just before or just after the report _SUMMARY_ line in the configuration.

Sidney
Re: why does that mail not get any bayes-classification
On 11.06.2016 at 23:00, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 8:37 AM:
>> it is not part of the report itself while tags, scores and descriptions are - a report is something like this:
>
> what you showed is defined in the configuration file using "report". Those just happen to be the last lines of it in the default configuration. That default uses the template tags _SCORE_, _REQD_, and _SUMMARY_. What I'm saying is that you can include "report" lines in the configuration that use the _SENDERDOMAIN_, _AUTHORDOMAIN_, and the various Bayes related tags and they will show up in the report. If you can use the report that you showed, then you can make use of those tags. They won't be in the table of points, rules, and description. That table is what _SUMMARY_ expands into. But you can insert them into the report that you see using spamc -R. They can even come after _SUMMARY_ if you want to

you can do a lot

the whole purpose here is to

a) upload an eml file to a webserver
b) spamc -R -l -s 2000 --socket /socket-path < upload.eml
c) display the part starting with "Content analysis details"
d) combine it with clamd results
e) display the raw eml at the bottom of the website

all the header tricks are *not* part of it

and it's not worth discussing since the *real* solution would be a "BAYES_NOTOKS" which would appear *everywhere* and clearly explain why no other BAYES_XX is present
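Steps b) and c) of that workflow could be sketched roughly as follows. The spamc flags are the ones quoted in the message; the function names, the socket path argument and the decision to return an empty string when the marker is missing are assumptions made for illustration, not the poster's actual web application.

```python
import subprocess

MARKER = "Content analysis details"


def run_spamc_report(eml_path, socket_path):
    """Step b): feed one uploaded file to spamc in report mode.

    socket_path is a placeholder for the real spamd socket; requires a
    running spamd, so this function is only defined here, not called.
    """
    with open(eml_path, "rb") as fh:
        proc = subprocess.run(
            ["spamc", "-R", "-l", "-s", "2000", "--socket", socket_path],
            stdin=fh,
            capture_output=True,
            check=False,
        )
    return proc.stdout.decode("utf-8", errors="replace")


def details_section(report_text):
    """Step c): keep the report from the marker onward, "" if absent."""
    pos = report_text.find(MARKER)
    return report_text[pos:] if pos != -1 else ""
```

Because only the text from "Content analysis details" onward is shown, anything added through extra headers never reaches that page, which is the limitation the message describes.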
Re: why does that mail not get any bayes-classification
Reindl Harald wrote on 12/06/16 8:37 AM:
> it is not part of the report itself while tags, scores and descriptions are - a report is something like this:

What you showed is defined in the configuration file using "report". Those just happen to be the last lines of it in the default configuration. That default uses the template tags _SCORE_, _REQD_, and _SUMMARY_.

What I'm saying is that you can include "report" lines in the configuration that use the _SENDERDOMAIN_, _AUTHORDOMAIN_, and the various Bayes related tags and they will show up in the report. If you can use the report that you showed, then you can make use of those tags.

They won't be in the table of points, rules, and description. That table is what _SUMMARY_ expands into. But you can insert them into the report that you see using spamc -R. They can even come after _SUMMARY_ if you want to.

Sidney
Re: why does that mail not get any bayes-classification
On 11.06.2016 at 19:06, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 4:44 AM:
>> look above "sadly that works only for mail-headers - but don't appear in the logs"
>
> Oh, you mentioned spamc -R before and it does appear in that output. I got that confused with the spamd logs - You're right, I don't see them there.

it is not part of the report itself while tags, scores and descriptions are - a report is something like this:

Content analysis details:   (28.1 points, 5.5 required)

 pts rule name                 description
---- ------------------------- --------------------------------------------------
-0.1 CUST_DNSWL_2_SENDERSC_LOW RBL: score.senderscore.com (Low Trust)
                               [145.253.224.163 listed in score.senderscore.com]
 1.0 NIXSPAM_IXHASH            DIGEST: ix.dnsbl.manitu.net
 7.5 BAYES_99                  BODY: Bayes spam probability is 99 to 100%
                               [score: 0.9995]
 1.5 SPF_SOFTFAIL              SPF: sender does not match SPF record (softfail)
 4.0 DKIM_ADSP_DISCARD         No valid author signature, domain signs all mail and suggests discarding the rest
 2.0 DATE_IN_FUTURE_06_12      Date: is 6 to 12 hours after Received: date
 2.5 CUST_BODY_CONTAINS_M      BODY: Contains Medium
 0.5 CUST_BODY_CONTAINS_VL     BODY: Contains Very Low
 0.0 HTML_MESSAGE              BODY: HTML included in message
 0.4 BAYES_999                 BODY: Bayes spam probability is 99.9 to 100%
                               [score: 0.9995]
 2.0 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level above 50%
                               [cf: 100]
 0.5 RAZOR2_CHECK              Listed in Razor2 (http://razor.sf.net/)
 0.5 RAZOR2_CF_RANGE_51_100    Razor2 gives confidence level above 50%
                               [cf: 100]
 1.0 FSL_BULK_SIG              Bulk signature with no Unsubscribe
 0.7 LOTS_OF_MONEY             Huge... sums of money
 0.1 MSGID_FROM_MTA_HEADER     Message-Id was added by a relay
 1.5 IXHASH_CHECK              Message hits one ore more IXHASH digest-sources
 2.5 DIGEST_MULTIPLE_LOCAL     Message hits more than one network digest check (razor, pyzor, ixhash)
 0.0 T_REMOTE_IMAGE            Message contains an external image
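A report like the one above is easy to take apart mechanically, which helps when it is combined with other results on a web page. The following is a minimal sketch, assuming each rule row starts with a decimal score followed by an upper-case rule name and that bracketed detail lines are continuation lines. The sample is a shortened three-row excerpt whose header total was adjusted to match the rows kept, so it is not the original report.

```python
import re

HEADER_RE = re.compile(
    r"\((-?\d+(?:\.\d+)?) points, (-?\d+(?:\.\d+)?) required\)"
)
# A rule row: score, rule name, description. Header, separator and
# bracketed continuation lines do not match this pattern.
RULE_RE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)\s+(.*)$")


def parse_details(text):
    """Return (total, required, [(points, rule, description), ...])."""
    header = HEADER_RE.search(text)
    total = float(header.group(1)) if header else None
    required = float(header.group(2)) if header else None
    rules = []
    for line in text.splitlines():
        m = RULE_RE.match(line)
        if m:
            rules.append((float(m.group(1)), m.group(2), m.group(3).strip()))
    return total, required, rules


# Shortened excerpt; the total is adjusted to the three rows shown.
SAMPLE = """\
Content analysis details:   (9.4 points, 5.5 required)

 pts rule name              description
---- ---------------------- ----------------------------------
 7.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 0.9995]
 1.5 SPF_SOFTFAIL           SPF: sender does not match SPF record (softfail)
 0.4 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 0.9995]
"""

total, required, rules = parse_details(SAMPLE)
```

Comparing the sum of the row scores with the header total is a cheap sanity check that no row was lost while parsing.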
Re: why does that mail not get any bayes-classification
Reindl Harald wrote on 12/06/16 4:44 AM:
> look above "sadly that works only for mail-headers - but don't appear in the logs"

Oh, you mentioned spamc -R before and it does appear in that output. I got that confused with the spamd logs - You're right, I don't see them there.

Sidney
Re: why does that mail not get any bayes-classification
On 11.06.2016 at 18:43, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 2:13 AM:
>> sadly that works only for mail-headers - but don't appear in the logs
>
> I just tried adding this to my local configuration, not pretty, just to see what it would do
>
>     report tag values sender _SENDERDOMAIN_ author _AUTHORDOMAIN_ bayesh _BAYESTCHAMMY_ bayses _BAYESTCSPAMMY_
>
> and I got this line in the report
>
>     tag values sender vantoll.nl author vantoll.nl bayesh 3 bayses 6
>
> So even though the documentation is not clear about it, you can use those template tags in the report option too

look above "sadly that works only for mail-headers - but don't appear in the logs"
Re: why does that mail not get any bayes-classification
Reindl Harald wrote on 12/06/16 2:13 AM:
> sadly that works only for mail-headers - but don't appear in the logs

I just tried adding this to my local configuration, not pretty, just to see what it would do

    report tag values sender _SENDERDOMAIN_ author _AUTHORDOMAIN_ bayesh _BAYESTCHAMMY_ bayses _BAYESTCSPAMMY_

and I got this line in the report

    tag values sender vantoll.nl author vantoll.nl bayesh 3 bayses 6

So even though the documentation is not clear about it, you can use those template tags in the report option too.

Sidney
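If such a line is added to the report, a script consuming the spamc -R output can pick the values back out. This is a small sketch that assumes the exact "tag values" prefix and the alternating label/value layout from the experiment above; the function name is made up for illustration.

```python
def parse_tag_line(report_text, prefix="tag values"):
    """Find the custom report line and return its label/value pairs."""
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            words = line[len(prefix):].split()
            # Words alternate: label, value, label, value, ...
            return dict(zip(words[0::2], words[1::2]))
    return {}


tags = parse_tag_line(
    "tag values sender vantoll.nl author vantoll.nl bayesh 3 bayses 6"
)
```

One caveat with this layout: if a tag expands to an empty string, the alternation shifts and the pairs come out wrong, so a separator such as `label=value` in the report line would be more robust.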
Re: why does that mail not get any bayes-classification
On 11.06.2016 at 17:03, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 2:13 AM:
>> sadly that works only for mail-headers - but don't appear in the logs nor in a report generated with "/usr/bin/spamc -R" over a webinterface which processes uploads of eml-files :-(
>
> Yes I see how that could be useful. It might work if you define a meta rule that checks for there not being any of the BAYES_NN rules

but that wouldn't say anything useful because you can't distinguish between "bayes not working at all", "too few training-messages" and "not enough useful tokens"

> That doesn't give you access to _SENDERDOMAIN_ and _AUTHORDOMAIN_ tags, though

yep
Re: why does that mail not get any bayes-classification
Reindl Harald wrote on 12/06/16 2:13 AM:
> sadly that works only for mail-headers - but don't appear in the logs nor in a report generated with "/usr/bin/spamc -R" over a webinterface which proceeds uploads of eml-files :-(

Yes I see how that could be useful. It might work if you define a meta rule that checks for there not being any of the BAYES_NN rules. That doesn't give you access to _SENDERDOMAIN_ and _AUTHORDOMAIN_ tags, though.

Sidney
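The logic of such a meta rule is just "none of the rules that hit is a BAYES_NN rule". The sketch below expresses that check in Python over a list of hit rule names, for example taken from a report or log line; it is an illustration of the condition, not SpamAssassin rule syntax, and the function name is invented.

```python
import re

# BAYES_ followed by digits, e.g. BAYES_00, BAYES_50, BAYES_999.
BAYES_NN = re.compile(r"^BAYES_\d+$")


def no_bayes_rule_hit(rule_names):
    """True when none of the hit rules is a BAYES_NN rule."""
    return not any(BAYES_NN.match(name) for name in rule_names)
```

The check only reports that a Bayes result is absent. It cannot tell apart the possible causes, which is why a dedicated rule emitted by the Bayes plugin itself was being asked for.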
Re: why does that mail not get any bayes-classification
On 11.06.2016 at 15:55, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 1:04 AM:
>> output of "spamassassin -D < ignored_by_bayes_stripped.eml" attached
>
> See this line in that output:
>
>     Jun 11 14:47:00.510 [5188] dbg: bayes: cannot use bayes on this message; not enough usable tokens found
>
>> i would expect a bayes result in any case and even if it's just an informational BAYES_NOTOKS
>
> Not by default, but see the tags in the next lines in your debug output, starting with
>
>     Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCHAMMY is now ready, value: 0
>     Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCSPAMMY is now ready, value: 0
>
> And see how you can add custom headers to your output that includes these tags as documented here:
>
> https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.html#template_tags

sadly that works only for mail-headers - but they don't appear in the logs nor in a report generated with "/usr/bin/spamc -R" over a webinterface which processes uploads of eml-files :-(

a test-tag BAYES_NOTOKS, or whatever it is called, would appear in all places, while for the spamd-log _SENDERDOMAIN_ and _AUTHORDOMAIN_ would give the benefit that you could also find out something useful about a result when there is no message-id
Re: why does that mail not get any bayes-classification
Reindl Harald wrote on 12/06/16 1:04 AM:
> output of "spamassassin -D < ignored_by_bayes_stripped.eml" attached

See this line in that output:

    Jun 11 14:47:00.510 [5188] dbg: bayes: cannot use bayes on this message; not enough usable tokens found

> i would expect a bayes result in any case and even if it's just a informational BAYES_NOTOKS

Not by default, but see the tags in the next lines in your debug output, starting with

    Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCHAMMY is now ready, value: 0
    Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCSPAMMY is now ready, value: 0

And see how you can add custom headers to your output that includes these tags as documented here:

https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.html#template_tags

Sidney
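When many samples have to be checked, the same two things can be pulled out of saved "spamassassin -D" output by a script. This is a minimal sketch that matches only the exact debug wording quoted above; the function name is made up, and other SpamAssassin versions may phrase these lines differently.

```python
import re

NO_TOKENS = (
    "bayes: cannot use bayes on this message; not enough usable tokens found"
)
TAG_RE = re.compile(r"tagrun - tag (\w+) is now ready, value: (.*)$")


def scan_debug(debug_text):
    """Return (not_enough_tokens, {tag_name: value}) for one debug run."""
    not_enough_tokens = False
    tags = {}
    for line in debug_text.splitlines():
        if NO_TOKENS in line:
            not_enough_tokens = True
        m = TAG_RE.search(line)
        if m:
            tags[m.group(1)] = m.group(2).strip()
    return not_enough_tokens, tags


# The three debug lines quoted in the message above.
DEBUG = """\
Jun 11 14:47:00.510 [5188] dbg: bayes: cannot use bayes on this message; not enough usable tokens found
Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCHAMMY is now ready, value: 0
Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCSPAMMY is now ready, value: 0
"""

flag, tags = scan_debug(DEBUG)
```

Here `flag` is true and both token counts are "0", which matches the explanation that Bayes ran but found nothing usable in this message.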
Re: why does that mail not get any bayes-classification
On Sat, 11 Jun 2016 04:52:48 +0200 Reindl Harald wrote:
> On 10.06.2016 at 23:52, RW wrote:
>> On Fri, 10 Jun 2016 16:57:45 +0200 Reindl Harald wrote:
>>
>>> see attachment, no bayes tag at all looks like a major bug somewhere
>>
>> In the absence of any debug it's hard to say.
>
> hence i attached the sample

An email is not debug. I can't run it on *your* system.

>> It is possible for no tokens to make it through the selection, in which case there is no result. That's more likely than normal in your case since you don't train on headers.
>
> if you would have looked at the message you would have seen that there is content and not only headers and it looks like the message has just incorrect mime-definitions (missing end headers)

Of course I looked at it. And I ran it through spamassassin. Aside from header tokens, what made it past the token selection on my database was only:

'marcus', 'Marcus', 'enclosed', 'invoice', 'business' and 'thank'

It's quite possible that all the body tokens in that email were in the neutral range on your system, which would cause Bayes to exit without producing a classification. In the absence of any debug against your database, there is nothing particularly suspicious here.
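The effect described here, every token being too neutral to count, can be shown with a toy model. This is not SpamAssassin's code: the threshold, the per-token probabilities and the plain average used to combine them are all invented for illustration. Only the idea is taken from the message: tokens close to 0.5 are discarded, and if none survive there is no result at all.

```python
def select_tokens(token_probs, min_strength=0.1):
    """Keep tokens whose spam probability is far enough from neutral 0.5."""
    return {
        tok: p for tok, p in token_probs.items()
        if abs(p - 0.5) >= min_strength
    }


def classify(token_probs, min_strength=0.1):
    """Return a combined probability, or None when no token is usable."""
    kept = select_tokens(token_probs, min_strength)
    if not kept:
        return None  # nothing usable: no classification at all
    # Placeholder combination (plain average); a real combiner differs.
    return sum(kept.values()) / len(kept)


# Hypothetical probabilities for three of the tokens named above.
neutral_db = {"invoice": 0.52, "business": 0.48, "thank": 0.5}
trained_db = {"invoice": 0.9, "thank": 0.5}
```

With `neutral_db` the result is `None`, so no BAYES_NN rule would fire. With `trained_db` one token survives and a score is produced. The same message can therefore be classified on one system and skipped on another, depending on each database.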
Re: why does that mail not get any bayes-classification
On Sat, 11 Jun 2016, Reindl Harald wrote:
> On 10.06.2016 at 23:52, RW wrote:
>> On Fri, 10 Jun 2016 16:57:45 +0200 Reindl Harald wrote:
>>> see attachment, no bayes tag at all looks like a major bug somewhere
>>
>> In the absence of any debug it's hard to say.
>
> hence i attached the sample
>
>> It is possible for no tokens to make it through the selection, in which case there is no result. That's more likely than normal in your case since you don't train on headers.
>
> if you would have looked at the message you would have seen that there is content and not only headers and it looks like the message has just incorrect mime-definitions (missing end headers)
>
> since thunderbird shows the attachment as well as the mail content that would be a way for spammers to completely trick out SA

There may be a bug but I don't think it is in the SA distro.

I took your sample and fed it to my SA kit. First time through it hit BAYES_50. I then did a "sa-learn --spam < /tmp/ignored_by_bayes_stripped.eml" and retested it. It then hit BAYES_999.

So I'd say standard SA + Bayes works on that message. Somebody at your site may have made some modifications to your SA that are causing you problems.

--
Dave Funk, University of Iowa College of Engineering
319/335-5751 FAX: 319/384-0549
1256 Seamans Center, Iowa City, IA 52242-1527
Sys_admin/Postmaster/cell_admin
#include
Better is not better, 'standard' is better. B{
Re: why does that mail not get any bayes-classification
On 10.06.2016 at 23:52, RW wrote:
> On Fri, 10 Jun 2016 16:57:45 +0200 Reindl Harald wrote:
>> see attachment, no bayes tag at all looks like a major bug somewhere
>
> In the absence of any debug it's hard to say.

hence i attached the sample

> It is possible for no tokens to make it through the selection, in which case there is no result. That's more likely than normal in your case since you don't train on headers.

if you would have looked at the message you would have seen that there is content and not only headers and it looks like the message has just incorrect mime-definitions (missing end headers)

since thunderbird shows the attachment as well as the mail content that would be a way for spammers to completely trick out SA
Re: why does that mail not get any bayes-classification
On Fri, 10 Jun 2016 16:57:45 +0200 Reindl Harald wrote:
> see attachemnt, no bayes tag at all looks like a major bug somewhere

In the absence of any debug it's hard to say.

It is possible for no tokens to make it through the selection, in which case there is no result. That's more likely than normal in your case since you don't train on headers.