On Thu, 2014-05-22 at 03:12 +0200, Karsten Bräckelmann wrote:
> In either case, having a sample would speed up this ping-pong style
> debugging. And I am curious. ;) Mind putting your sample up a pastebin?

Ian sent me the original message off-list. It indeed contains about 16
consecutive newlines, but doesn't trigger the rawbody rules discussed.

The issue is not related to rawbody being split up into chunks. A
stripped down test-case is easy to generate:

  echo -e "\n\n~\n\n\n\nend"

That's an empty mail header and a very short text body, consisting of
consecutive newlines. The tilde and end string are merely there for
anchoring and visualizing the match.

The rule for debugging the issue is the same I posted before, just
slightly modified to better visualize the match.

  rawbody __BLANKS  /.\n{2,}/
  tflags  __BLANKS  multiple

Feeding the test-case to spamassassin -D, the debug output shows the
match like the following:

  dbg: rules: ran rawbody rule __BLANKS ======> got hit: "~
  dbg: rules: [...]
  ...
  dbg: rules: [...] "

The number of continuation lines equals the number of newlines in the
test-case. Well, up until 12, that is. :-/

Any number up to 11 of consecutive newlines can be matched with rawbody
rules. However, 12 or more consecutive newlines will be squeezed and
replaced by exactly two newlines.

I've had a quick look at the code already, but did not yet find where
the supposedly raw (sic) body gets altered.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}