Title: RE: Cyrillic chars in rule regex ?

Thanks,

I had tried that, but that RegEx class is not supported in Perl 5.6.1 (at least not in my install).

I'm really looking for a way to match specific phrases anyway, like "Мебельный фургон. Оцинкованная будка" (as it is displayed in my browser).

The two basic RegExes I've composed to match a string like this both fail (even if I try a short phrase, "фургон"):

/фургон/  or  /\ф\у\р\г\о\н/

I'm guessing that is because my RegEx is looking for those specific chars, but the chars in the email itself are not o's and a's with accents; they are full Cyrillic chars. I suppose I'm seeing the ASCII equivalent of the Cyrillic chars?
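
A small Python sketch of what may be going on here, assuming the message body is encoded as windows-1251 (a guess; KOI8-R is just as plausible) and the viewer decodes it as Latin-1. Under that assumption the same bytes show up as accented Latin letters instead of Cyrillic:

```python
# Assumption: the mail body is windows-1251; the real charset is not known.
word = "фургон"
raw = word.encode("cp1251")      # the bytes actually carried in the message

# A viewer that wrongly treats those bytes as Latin-1 shows accented letters.
misread = raw.decode("latin-1")

print(" ".join("%02x" % b for b in raw))   # f4 f3 f0 e3 ee ed
print(misread)                             # ôóðãîí
```

If that is the situation, a regex typed with the displayed characters describes different bytes than the ones in the message, which would explain the failed match.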

Maybe map the displayed chars to ASCII, then RegEx the ASCII hex? I'm just not sure how the email charset is interpreted by the SA regex.
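
One way to try the byte-level idea above, as a hedged Python sketch: turn the phrase into \xNN escapes for a few candidate charsets, so the rule matches the raw bytes rather than the displayed characters. The helper name hex_regex, the rule name LOCAL_CYRILLIC_FURGON, and the list of charsets are illustrative assumptions; whether SpamAssassin hands the rule undecoded bytes depends on the version and is not confirmed here.

```python
def hex_regex(phrase, charset):
    """Return the phrase as a run of \\xNN escapes for the given charset."""
    return "".join("\\x%02X" % b for b in phrase.encode(charset))

# Candidate charsets are guesses; check the Content-Type header of a sample.
charsets = ["koi8-r", "cp1251", "utf-8"]
alternatives = [hex_regex("фургон", cs) for cs in charsets]

# Hypothetical rule name; one alternative per candidate charset.
rule = "body LOCAL_CYRILLIC_FURGON /(?:%s)/" % "|".join(alternatives)
print(rule)
```

Upper- and lower-case Cyrillic letters have different byte values in all three charsets, so a phrase with capitals needs its own escapes.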

I know that what I'm seeing is just how my computer is trying to display a charset it can't show correctly ... I'm just unsure what conversion steps I need to match these specific phrases.

Thanks again,
Shane





-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 04, 2004 4:52 PM
To: users@spamassassin.apache.org
Subject: RE: Cyrillic chars in rule regex ?


Shane Metler wrote:
> I know this is more of a RegEx question, but I have been very
> unsuccessful at finding out how to match Cyrillic characters in Spam
> Assassin rules.

\p{Cyrillic} comes to mind.  Not sure what version of Perl is required.

Untested:
body    CONTAINS_CYRILLIC_CHARACTERS    /\p{Cyrillic}/
score   CONTAINS_CYRILLIC_CHARACTERS    0.1

[EMAIL PROTECTED]                      805.964.4554 x902
Hispanic Business Inc./HireDiversity.com         Software Engineer
perl -e"map{y/a-z/l-za-k/;print}shift" "Jjhi pcdiwtg Ptga wprztg,"
