Re: why does that mail not get any bayes-classification

2016-06-11 Thread Reindl Harald



On 12.06.2016 at 00:27, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 9:31 AM:
>> headers don't help when you have a "spamd: result" log-line with a ton
>
> Ah, finally I understand what you are trying to do! You analyze the spamd
> result log lines, and they currently have two deficiencies: 1) They do not
> distinguish between Bayes failing and Bayes simply not finding significant
> tokens in the message; 2) They don't provide a way of matching the result log
> line to the actual message when the message does not contain an mid.
>
> Your proposed solution of a BAYES_NOTOK rule would solve the first one. The
> second is a bit trickier. Really there ought to be a way to configure custom
> output in the spamd result log line, or to have a rule that can include some
> information in addition to its name and score in its report

both would require small changes in SA itself

the second is not really trickier: if the code is able to put the MID
in the log line, then it also has the information about sender/rcpt and
just doesn't put it into the logs






Re: why does that mail not get any bayes-classification

2016-06-11 Thread Sidney Markowitz
Reindl Harald wrote on 12/06/16 9:31 AM:
> 
> headers don't help when you have a "spamd: result" log-line with a ton 

Ah, finally I understand what you are trying to do! You analyze the spamd
result log lines, and they currently have two deficiencies: 1) They do not
distinguish between Bayes failing and Bayes simply not finding significant
tokens in the message; 2) They don't provide a way of matching the result log
line to the actual message when the message does not contain an mid.

Your proposed solution of a BAYES_NOTOK rule would solve the first one. The
second is a bit trickier. Really there ought to be a way to configure custom
output in the spamd result log line, or to have a rule that can include some
information in addition to its name and score in its report.

 Sidney




Re: why does that mail not get any bayes-classification

2016-06-11 Thread Reindl Harald



On 11.06.2016 at 23:26, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 9:08 AM:
>> and it's not worth discussing since the *real* solution would be a
>> "BAYES_NOTOKS" which would appear *everywhere* and clearly explain why
>> no other BAYES_XX is present
>
> I can't argue with that. Without the ability to make it a rule or a meta-rule
> that would only show if there are no BAYES_NN rules triggered, adding the tags
> to the report unconditionally would not be the same as a BAYES_NOTOK rule. As
> far as I can tell implementing BAYES_NOTOK would require a (small) change in
> the Bayes plugin. It could not be done in the configuration file or by writing
> a new rule. So you are right about that.

that's exactly the point

> However, what you said about _SENDERDOMAIN_ and _AUTHORDOMAIN_ could be
> handled by adding a report line that contains those tags just before or just
> after the report _SUMMARY_ line in the configuration

headers don't help when you have a "spamd: result" log line with a ton
of rules, or when a new rule you are trying out fires on a message that
has no message-id: your only anchor is the mid=<> part of the log line,
from which you can grep the other relevant MTA lines and find out who
the sender was, who the rcpt was and where the message came from at all

keep in mind: you get all those headers only in your own mails, they
don't help you much as a sysadmin for a lot of users when you try to
find out whether rules need to be rescored in whatever direction
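
To make the "only anchor" problem concrete, here is a minimal Python sketch of pulling the rule list and the mid= field out of a result line. The sample line is made up and only follows the usual "spamd: result:" layout as I understand it, so treat the field names and the regex as assumptions, not as the format spamd guarantees.

```python
import re

# Hypothetical sample line: layout modelled on a typical "spamd: result:"
# entry, all values invented for illustration.
SAMPLE = (
    "spamd: result: Y 28 - BAYES_99,DKIM_ADSP_DISCARD,SPF_SOFTFAIL "
    "scantime=1.2,size=4711,user=sa-milt,uid=189,required_score=5.5,"
    "rhost=localhost,raddr=127.0.0.1,mid=<abc@example.org>,"
    "bayes=0.999500,autolearn=disabled"
)

# flag, integer score, comma-separated rule names, then key=value fields
RESULT_RE = re.compile(
    r"spamd: result: (?P<flag>[Y.]) (?P<score>-?\d+) - "
    r"(?P<tests>\S*) (?P<fields>.*)$"
)


def parse_result(line):
    """Split one result line into rules and key=value fields (or None)."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    fields = {}
    # Naive split: a message-id that itself contains a comma would break this.
    for item in m.group("fields").split(","):
        key, sep, value = item.partition("=")
        if sep:
            fields[key] = value
    return {
        "spam": m.group("flag") == "Y",
        "score": int(m.group("score")),
        "tests": [t for t in m.group("tests").split(",") if t],
        "mid": fields.get("mid"),
        "fields": fields,
    }
```

With a usable mid you can grep the MTA log for it; without one, nothing in the parsed fields says who the sender or rcpt was, which is the gap described above.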






Re: why does that mail not get any bayes-classification

2016-06-11 Thread Sidney Markowitz
Reindl Harald wrote on 12/06/16 9:08 AM:
> 
> and it's not worth discussing since the *real* solution would be a
> "BAYES_NOTOKS" which would appear *everywhere* and clearly explain why
> no other BAYES_XX is present

I can't argue with that. Without the ability to make it a rule or a meta-rule
that would only show if there are no BAYES_NN rules triggered, adding the tags
to the report unconditionally would not be the same as a BAYES_NOTOK rule. As
far as I can tell implementing BAYES_NOTOK would require a (small) change in
the Bayes plugin. It could not be done in the configuration file or by writing
a new rule. So you are right about that.

However, what you said about _SENDERDOMAIN_ and _AUTHORDOMAIN_ could be
handled by adding a report line that contains those tags just before or just
after the report _SUMMARY_ line in the configuration.

Sidney




Re: why does that mail not get any bayes-classification

2016-06-11 Thread Reindl Harald



On 11.06.2016 at 23:00, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 8:37 AM:
>> it is not part of the report itself while tags, scores and descriptions
>> are - a report is something like this:
>
> what you showed is defined in the configuration file using "report". Those
> just happen to be the last lines of it in the default configuration. That
> default uses the template tags _SCORE_, _REQD_, and _SUMMARY_.
>
> What I'm saying is that you can include "report" lines in the configuration
> that use the _SENDERDOMAIN_, _AUTHORDOMAIN_, and the various Bayes related
> tags and they will show up in the report. If you can use the report that you
> showed, then you can make use of those tags. They won't be in the table of
> points, rules, and description. That table is what _SUMMARY_ expands into. But
> you can insert them into the report that you see using spamc -R. They can even
> come after _SUMMARY_ if you want to

you can do a lot

the whole purpose here is to

a) upload an eml file to a webserver
b) spamc -R -l -s 2000 --socket /socket-path < upload.eml
c) display the part starting with "Content analysis details"
d) combine it with clamd results
e) display the raw eml at the bottom of the website

all the header tricks are *not* part of it
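
Steps b) and c) above could look roughly like this in Python. This is only a sketch of the idea, not the actual web interface: the function names are hypothetical, the spamc flags are copied from step b), and the socket path depends on the deployment.

```python
import subprocess

MARKER = "Content analysis details"


def extract_details(report_text):
    """Return the report from the MARKER line onward, or None if absent."""
    idx = report_text.find(MARKER)
    if idx == -1:
        return None
    return report_text[idx:]


def run_spamc(eml_path, socket_path):
    """Feed an uploaded eml file to spamc and return its report as text."""
    # Flags taken from step b); socket_path is an assumption of this sketch.
    with open(eml_path, "rb") as fh:
        proc = subprocess.run(
            ["spamc", "-R", "-l", "-s", "2000", "--socket", socket_path],
            stdin=fh,
            capture_output=True,
            check=False,
        )
    return proc.stdout.decode("utf-8", errors="replace")
```

Anything that should be visible on that page has to be inside the text spamc -R prints, which is why a rule name in the table is more useful here than an extra mail header.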


and it's not worth discussing since the *real* solution would be a
"BAYES_NOTOKS" which would appear *everywhere* and clearly explain why
no other BAYES_XX is present






Re: why does that mail not get any bayes-classification

2016-06-11 Thread Sidney Markowitz
Reindl Harald wrote on 12/06/16 8:37 AM:
> 
> it is not part of the report itself while tags, scores and descriptions 
> are - a report is something like this:

what you showed is defined in the configuration file using "report". Those
just happen to be the last lines of it in the default configuration. That
default uses the template tags _SCORE_, _REQD_, and _SUMMARY_.

What I'm saying is that you can include "report" lines in the configuration
that use the _SENDERDOMAIN_, _AUTHORDOMAIN_, and the various Bayes related
tags and they will show up in the report. If you can use the report that you
showed, then you can make use of those tags. They won't be in the table of
points, rules, and description. That table is what _SUMMARY_ expands into. But
you can insert them into the report that you see using spamc -R. They can even
come after _SUMMARY_ if you want to.

 Sidney



Re: why does that mail not get any bayes-classification

2016-06-11 Thread Reindl Harald



On 11.06.2016 at 19:06, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 4:44 AM:
>> look above "sadly that works only for mail-headers - the tags don't
>> appear in the logs"
>
> Oh, you mentioned spamc -R before and it does appear in that output. I got
> that confused with the spamd logs - You're right, I don't see them there.

it is not part of the report itself while tags, scores and descriptions
are - a report is something like this:



Content analysis details:   (28.1 points, 5.5 required)

 pts rule name                 description
---- ------------------------- --------------------------------------------------
-0.1 CUST_DNSWL_2_SENDERSC_LOW RBL: score.senderscore.com (Low Trust)
                               [145.253.224.163 listed in score.senderscore.com]
 1.0 NIXSPAM_IXHASH            DIGEST: ix.dnsbl.manitu.net
 7.5 BAYES_99                  BODY: Bayes spam probability is 99 to 100%
                               [score: 0.9995]
 1.5 SPF_SOFTFAIL              SPF: sender does not match SPF record (softfail)
 4.0 DKIM_ADSP_DISCARD         No valid author signature, domain signs all mail
                               and suggests discarding the rest
 2.0 DATE_IN_FUTURE_06_12      Date: is 6 to 12 hours after Received: date
 2.5 CUST_BODY_CONTAINS_M      BODY: Contains Medium
 0.5 CUST_BODY_CONTAINS_VL     BODY: Contains Very Low
 0.0 HTML_MESSAGE              BODY: HTML included in message
 0.4 BAYES_999                 BODY: Bayes spam probability is 99.9 to 100%
                               [score: 0.9995]
 2.0 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level above 50%
                               [cf: 100]
 0.5 RAZOR2_CHECK              Listed in Razor2 (http://razor.sf.net/)
 0.5 RAZOR2_CF_RANGE_51_100    Razor2 gives confidence level above 50%
                               [cf: 100]
 1.0 FSL_BULK_SIG              Bulk signature with no Unsubscribe
 0.7 LOTS_OF_MONEY             Huge... sums of money
 0.1 MSGID_FROM_MTA_HEADER     Message-Id was added by a relay
 1.5 IXHASH_CHECK              Message hits one ore more IXHASH digest-sources
 2.5 DIGEST_MULTIPLE_LOCAL     Message hits more than one network digest check
                               (razor, pyzor, ixhash)
 0.0 T_REMOTE_IMAGE            Message contains an external image





Re: why does that mail not get any bayes-classification

2016-06-11 Thread Sidney Markowitz
Reindl Harald wrote on 12/06/16 4:44 AM:
> look above "sadly that works only for mail-headers - the tags don't
> appear in the logs"

Oh, you mentioned spamc -R before and it does appear in that output. I got
that confused with the spamd logs - You're right, I don't see them there.

 Sidney



Re: why does that mail not get any bayes-classification

2016-06-11 Thread Reindl Harald



On 11.06.2016 at 18:43, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 2:13 AM:
>> sadly that works only for mail-headers - the tags don't appear in the logs
>
> I just tried adding this to my local configuration, not pretty, just to see
> what it would do
>
> report tag values sender _SENDERDOMAIN_  author _AUTHORDOMAIN_  bayesh _BAYESTCHAMMY_ bayses _BAYESTCSPAMMY_
>
> and I got this line in the report
>
> tag values sender vantoll.nl  author vantoll.nl  bayesh 3 bayses 6
>
> So even though the documentation is not clear about it, you can use those
> template tags in the report option too

look above "sadly that works only for mail-headers - the tags don't
appear in the logs"






Re: why does that mail not get any bayes-classification

2016-06-11 Thread Sidney Markowitz
Reindl Harald wrote on 12/06/16 2:13 AM:
> sadly that works only for mail-headers - the tags don't appear in the logs

I just tried adding this to my local configuration, not pretty, just to see
what it would do

report tag values sender _SENDERDOMAIN_  author _AUTHORDOMAIN_  bayesh _BAYESTCHAMMY_ bayses _BAYESTCSPAMMY_

and I got this line in the report

tag values sender vantoll.nl  author vantoll.nl  bayesh 3 bayses 6

So even though the documentation is not clear about it, you can use those
template tags in the report option too.

 Sidney



Re: why does that mail not get any bayes-classification

2016-06-11 Thread Reindl Harald



On 11.06.2016 at 17:03, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 2:13 AM:
>> sadly that works only for mail-headers - the tags appear neither in the
>> logs nor in a report generated with "/usr/bin/spamc -R" over a webinterface
>> which processes uploads of eml-files :-(
>
> Yes I see how that could be useful. It might work if you define a meta rule
> that checks for there not being any of the BAYES_NN rules

but that wouldn't say anything useful because you can't distinguish between
"bayes not working at all", "too few training-messages" and "not enough
useful tokens"

> That doesn't give
> you access to _SENDERDOMAIN_ and _AUTHORDOMAIN_ tags, though

yep





Re: why does that mail not get any bayes-classification

2016-06-11 Thread Sidney Markowitz
Reindl Harald wrote on 12/06/16 2:13 AM:
> sadly that works only for mail-headers - the tags appear neither in the
> logs nor in a report generated with "/usr/bin/spamc -R" over a webinterface
> which processes uploads of eml-files :-(

Yes I see how that could be useful. It might work if you define a meta rule
that checks for there not being any of the BAYES_NN rules. That doesn't give
you access to _SENDERDOMAIN_ and _AUTHORDOMAIN_ tags, though.
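
On the log-analysis side the same check can be approximated after the fact. The Python sketch below is not a SpamAssassin meta rule; it just looks at a list of rule names (for example taken from a result log line) and reports whether any numbered BAYES_NN rule fired. The function names are made up for this example.

```python
import re

# Matches BAYES_00, BAYES_50, BAYES_99, BAYES_999, ... but not other names
# that merely start with "BAYES_".
BAYES_RE = re.compile(r"^BAYES_\d+$")


def bayes_hits(tests):
    """Return the numbered BAYES_NN rules found in a list of rule names."""
    return [t for t in tests if BAYES_RE.match(t)]


def lacks_bayes(tests):
    """True when no BAYES_NN rule fired for this message."""
    return not bayes_hits(tests)
```

A True result only says that no Bayes bucket was hit. It says nothing about why.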

 Sidney



Re: why does that mail not get any bayes-classification

2016-06-11 Thread Reindl Harald



On 11.06.2016 at 15:55, Sidney Markowitz wrote:
> Reindl Harald wrote on 12/06/16 1:04 AM:
>> output of "spamassassin -D  < ignored_by_bayes_stripped.eml" attached
>
> See this line in that output:
>
>   Jun 11 14:47:00.510 [5188] dbg: bayes: cannot use bayes on this message; not enough usable tokens found
>
>> i would expect a bayes result in any case, even if it's just an
>> informational BAYES_NOTOKS
>
> Not by default, but see the tags in the next lines in your debug output,
> starting with
>
> Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCHAMMY is now ready, value: 0
> Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCSPAMMY is now ready, value: 0
>
> And see how you can add custom headers to your output that include these tags
> as documented here:
>
> https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.html#template_tags

sadly that works only for mail-headers - the tags appear neither in the
logs nor in a report generated with "/usr/bin/spamc -R" over a webinterface
which processes uploads of eml-files :-(

a test-tag BAYES_NOTOKS, or whatever it is called, would appear in all
places, while in the spamd log _SENDERDOMAIN_ and _AUTHORDOMAIN_ would give
the benefit that you could also find out something useful about a result
when there is no message-id






Re: why does that mail not get any bayes-classification

2016-06-11 Thread Sidney Markowitz
Reindl Harald wrote on 12/06/16 1:04 AM:
> output of "spamassassin -D  < ignored_by_bayes_stripped.eml" attached

See this line in that output:

  Jun 11 14:47:00.510 [5188] dbg: bayes: cannot use bayes on this message; not enough usable tokens found

> i would expect a bayes result in any case, even if it's just an
> informational BAYES_NOTOKS

Not by default, but see the tags in the next lines in your debug output,
starting with

Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCHAMMY is now ready, value: 0
Jun 11 14:47:00.510 [5188] dbg: check: tagrun - tag BAYESTCSPAMMY is now ready, value: 0

And see how you can add custom headers to your output that include these tags
as documented here:

https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.html#template_tags
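
If you want to pick those lines out of a long "spamassassin -D" run automatically, a small Python filter is enough. This is a rough sketch that matches only the wording of the debug lines quoted above; other SpamAssassin versions may phrase them differently, and the function name is made up.

```python
import re

# Wording taken from the debug lines quoted in this message.
TAG_RE = re.compile(r"dbg: check: tagrun - tag (\w+) is now ready, value: (.*)$")
NO_TOKENS = "bayes: cannot use bayes on this message"


def scan_debug(lines):
    """Return (bayes_skipped, {BAYES tag name: value}) for debug output."""
    tags = {}
    bayes_skipped = False
    for line in lines:
        line = line.rstrip("\n")
        if NO_TOKENS in line:
            bayes_skipped = True
        m = TAG_RE.search(line)
        if m and m.group(1).startswith("BAYES"):
            tags[m.group(1)] = m.group(2)
    return bayes_skipped, tags
```

Feed it the debug output line by line, for example from a file where you saved the -D output.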

 Sidney



Re: why does that mail not get any bayes-classification

2016-06-10 Thread RW
On Sat, 11 Jun 2016 04:52:48 +0200
Reindl Harald wrote:

> Am 10.06.2016 um 23:52 schrieb RW:
> > On Fri, 10 Jun 2016 16:57:45 +0200
> > Reindl Harald wrote:
> >  
> >> see attachment: no bayes tag at all, looks like a major bug
> >> somewhere
> >
> > In the absence of any debug it's hard to say.  
> 
> hence i attached the sample

An email is not debug. I can't run it on *your* system.

> > It is possible for no tokens to make it through the selection, in
> > which case there is no result. That's more likely than normal in
> > your case since you don't train on headers.  
> 
> if you had looked at the message you would have seen that
> there is content and not only headers, and it looks like the message
> just has incorrect mime-definitions (missing end headers)

Of course I looked at it. And I ran it through spamassassin.

Aside from header tokens, what made it past the token selection on my
database was only:

   'marcus','Marcus','enclosed','invoice','business' and 'thank'

It's quite possible that all the body tokens in that email were
in the neutral range on your system, which would cause Bayes to exit
without producing a classification. 

In the absence of any debug against your database, there is nothing
particularly suspicious here.
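
As a rough illustration of that selection step, here is a Python sketch. It is not the plugin's code: the cutoff of 0.346 is an assumed value for how far a token's probability must be from 0.5 to count, and the token probabilities in the usage note are invented.

```python
def select_tokens(token_probs, min_strength=0.346):
    """Keep only tokens whose spam probability is far enough from 0.5.

    min_strength is an assumption of this sketch, not a documented setting.
    """
    return {
        tok: p
        for tok, p in token_probs.items()
        if abs(p - 0.5) >= min_strength
    }


def classification_possible(token_probs, min_strength=0.346):
    """False when every token is in the neutral range, so Bayes has no input."""
    return len(select_tokens(token_probs, min_strength)) > 0
```

With made-up values such as {"invoice": 0.55, "business": 0.48, "thank": 0.40} nothing survives the cut and there is no result, while a single strong token like {"invoice": 0.97} is enough to get one.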


Re: why does that mail not get any bayes-classification

2016-06-10 Thread David B Funk

On Sat, 11 Jun 2016, Reindl Harald wrote:

> On 10.06.2016 at 23:52, RW wrote:
>> On Fri, 10 Jun 2016 16:57:45 +0200
>> Reindl Harald wrote:
>>
>>> see attachment: no bayes tag at all, looks like a major bug somewhere
>>
>> In the absence of any debug it's hard to say.
>
> hence i attached the sample
>
>> It is possible for no tokens to make it through the selection, in which
>> case there is no result. That's more likely than normal in your case
>> since you don't train on headers.
>
> if you had looked at the message you would have seen that there is
> content and not only headers, and it looks like the message just has incorrect
> mime-definitions (missing end headers)
>
> since thunderbird shows the attachment as well as the mail content, that would
> be a way for spammers to completely trick SA

There may be a bug but I don't think it is in the SA distro.

I took your sample and fed it to my SA kit. The first time through it hit
BAYES_50. I then did a "sa-learn --spam < /tmp/ignored_by_bayes_stripped.eml"
and retested it. It then hit BAYES_999.

So I'd say standard SA + Bayes works on that message. Somebody at your site may
have made some modifications to your SA that are causing you problems.


--
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{



Re: why does that mail not get any bayes-classification

2016-06-10 Thread Reindl Harald



On 10.06.2016 at 23:52, RW wrote:
> On Fri, 10 Jun 2016 16:57:45 +0200
> Reindl Harald wrote:
>
>> see attachment: no bayes tag at all, looks like a major bug somewhere
>
> In the absence of any debug it's hard to say.

hence i attached the sample

> It is possible for no tokens to make it through the selection, in which
> case there is no result. That's more likely than normal in your case
> since you don't train on headers.

if you had looked at the message you would have seen that there
is content and not only headers, and it looks like the message just has
incorrect mime-definitions (missing end headers)

since thunderbird shows the attachment as well as the mail content, that
would be a way for spammers to completely trick SA






Re: why does that mail not get any bayes-classification

2016-06-10 Thread RW
On Fri, 10 Jun 2016 16:57:45 +0200
Reindl Harald wrote:

> see attachment: no bayes tag at all, looks like a major bug somewhere

In the absence of any debug it's hard to say.

It is possible for no tokens to make it through the selection, in which
case there is no result. That's more likely than normal in your case
since you don't train on headers.