https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6143

--- Comment #29 from Sidney Markowitz <[email protected]>  2009-07-07 22:22:41 PST ---
Ok, this is very weird.

It appears that when BodyRuleBaseExtractor.pm extracts a base_string from the
original pattern string, it stops at \x{00} if a character in the range \x{80}
through \x{ff} appears anywhere in the pattern, either before or after the
\x{00}.

You can see this by creating a rule (I put mine in local.cf in the update
directory) and running sa-compile --list. In the sa-compile output, look for
the rule name: first the "orig" label, which shows the original pattern, then
the "r" label, which shows the base string extracted from it. If you make the
pattern a simple string with no regexp operators, e.g. /FOO\x{00}BAR/, you
will see that the "r" listing has /foo\x{0}bar/. But put \x{ff} or \x{80}
anywhere in the pattern and the "r" listing shows the pattern truncated just
before the \x{00}.
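
For anyone who wants to reproduce this, a pair of rules along these lines
should be enough. This is only a sketch: the rule names are made up for
illustration, and the position of the \x{ff} in the second pattern is
arbitrary, since the truncation seems to happen wherever the high character
sits.

```
# Hypothetical test rules for local.cf (rule names are invented).

# Plain string containing a NUL: the "r" listing should keep the whole
# string, i.e. /foo\x{0}bar/ as described above.
body  NUL_PLAIN_TEST     /FOO\x{00}BAR/

# Same string with a high character added: the "r" listing should show
# the base string cut off just before the \x{00}.
body  NUL_HIGHBYTE_TEST  /FOO\x{00}BAR\x{ff}/
```

After adding the rules, run sa-compile --list and compare the "orig" and "r"
lines for the two rule names.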

I haven't pored through the code in BodyRuleBaseExtractor.pm to figure out why
it does it, and I'm not sure if I'll have time to, so if someone more familiar
with that code can look at it, that would be great.
