Is it possible to create a custom rule that looks at the charset= string in the Content-Type header?

We're getting a lot of Chinese-language spam here at the moment (charset="gb2312"), and it's only scoring around 6.3. I'd like to push that slightly higher. I think the proper response is to write a header rule that scores those messages a bit heavier (possibly +1.5).

Seems like the proper way is:

header LOCAL_CHARSET_ZHO Content-Type =~ /charset=\"?(gbk|gb2312|gb18030)\"?/i
score LOCAL_CHARSET_ZHO 3.5 3.0 2.5 2.0

So, my questions are:

1) Does a score like that make sense? Too high? Too low? Are there guidelines or suggestions for using the advanced scores?

2) Does the Perl regex look correct? It fires on the test message, and I think it will properly handle variations on the usual "charset=" strings.

3) Is anyone else using a similar rule or rule set? Have you found it useful or not?

4) What happens if the Content-Type header is spread across more than one line?
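
For questions 2 and 4, one way to sanity-check the pattern outside SpamAssassin is a small Python sketch like the one below. The sample header values are made up for illustration, and matches() is just a helper name I picked. This only exercises the regex itself; it does not show how SpamAssassin hands a folded header to the rule.

```python
import re

# Same pattern as the LOCAL_CHARSET_ZHO rule, written for Python's re module.
# The backslashes before the quotes are dropped because they are not needed.
CHARSET_RE = re.compile(r'charset="?(gbk|gb2312|gb18030)"?', re.IGNORECASE)

# Hypothetical Content-Type values (assumptions, not taken from real mail),
# each paired with whether the pattern is expected to match.
samples = {
    'text/plain; charset="gb2312"': True,
    'text/html; charset=GBK': True,
    'text/plain; CHARSET=gb18030; format=flowed': True,
    'text/plain;\r\n\tcharset="gb2312"': True,   # header folded before charset
    'text/plain; charset="utf-8"': False,
    'text/plain; charset = gb2312': False,       # spaces around "=" are not handled
}

def matches(value):
    """Return True if the charset pattern is found anywhere in the value."""
    return CHARSET_RE.search(value) is not None

for value, expected in samples.items():
    print(repr(value), "match" if matches(value) else "no match",
          "(expected)" if matches(value) == expected else "(UNEXPECTED)")
```

In this sketch the folded sample still matches because the fold sits between parameters and the charset=... token itself stays on one line. A fold or whitespace inside the token (as in the last sample) would not match.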
