On Wed, 2010-04-21 at 17:45 -0400, Kevin A. McGrail wrote:
> > So the message in question actually was NOT spam, and falsely reported
> > to be by spamd.
>
> Exactly. But "falsely" reported by spamd may be a bit harsh. It's a
> confusing rounding error, though that had me very confused

I didn't mean to be harsh, just to get the facts straight and stress the
point that, according to the analysis, the message actually was marginally
below the spam threshold. In other words, $is_spam == 0. Hence, falsely. :)

> What we are doing is using spamd with the -R option which gives the
> score and threshold as the first line. We are using the first line of
> data to test and see if score >= threshold.
>
> If we do use -E, the error level is accurately reporting the is_spam
> status but the inconsistency on scores is still something I consider a bug.

No arguing here. At the very least, this is inconsistent behavior.

> >> However, SpamD/C uses rounding to the 10th for the output of the first
> >> line but then utilizes PerMsgStatus.pm for the report, etc.
> >>
> > Hmm, the patch is code duplication. And that in a case that caused
> > confusion before. Would be nice if spamd could use PMS functions, rather
> > than duplicating the code. No, I did not look closely at the surrounding
> > code, just a quick look at the patch. :)
> >
> > And there's a bug in your patch, more precisely the comments, added to
> > both PMS and spamd. The comment right before the final return *should*
> > read, with the relevant "not" already added:
> >
> > + # if the email is NOT spam and $score = $rscore, return the $rscore - 0.1
> > + # effectively flooring the value to the closest tenth
> >
> Thanks. I fixed this in my routine and left it in the PMS. Another
> reason why code duplication is bad.

Please don't do that. The PMS comment actually is part of your patch,
attachment 4750.

Regarding code duplication... See, the patch didn't even land in trunk
yet, but we got inconsistent comments already. *ducks* ;)

> OK, I believe this routine belongs in Util.pm and will submit a new
> patch that unifies the code.

Unifying sounds really good.
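
For anyone following along, here is a rough sketch of the mismatch and of the flooring the patch comment describes. It is Python, not the Perl of the patch, and it is not the code from attachment 4750. The name displayed_score and the sample values 4.96 and 5.0 are made up for illustration, and I am assuming $rscore in the comment means the required score (the threshold).

```python
def displayed_score(score: float, required: float, is_spam: bool) -> str:
    """Format a score to one decimal without contradicting is_spam.

    Hypothetical helper, not the SpamAssassin API. Assumes $rscore in the
    patch comment is the required score (threshold).
    """
    shown = float(f"{score:.1f}")
    shown_required = float(f"{required:.1f}")
    # Spam, or clearly below the threshold after rounding: show as rounded.
    if is_spam or shown < shown_required:
        return f"{shown:.1f}"
    # Not spam, yet the rounded score reached the threshold:
    # step down one tenth so the displayed value stays below it.
    return f"{shown_required - 0.1:.1f}"


score, required = 4.96, 5.0        # made-up sample values
is_spam = score >= required        # False: marginally below the threshold
naive = f"{score:.1f}"             # "5.0", reads as spam on the first line
fixed = displayed_score(score, required, is_spam)  # "4.9"
print(is_spam, naive, fixed)
```

With plain rounding, a client comparing the first-line score against the threshold sees 5.0 >= 5.0 and concludes spam, although the analysis said otherwise. The adjusted value keeps the displayed numbers consistent with $is_spam.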

> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6419

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}