MATSUDA Yoh-ichi writes:
> > - Writing rules with hex notation is troublesome, boring and decreases
> >   productivity. If we could normalize the charset, we could write rules
> >   directly with a UTF-8 aware editor.
>
> Yes.
> Directly writing REGEX rules with UTF-8 characters is very convenient.
> But I think character normalization and tokenization before body
> testing is troublesome.
> Because character normalization and tokenization modify the message
> text, the REGEX rule writer can't see the modified text that the rule
> is actually matched against.
>
> Many rules are written for the plain message text as it arrives.
> If character normalization and tokenization are inserted before body
> testing, many body rules will stop matching.
>
> So,
>
> > > But if character normalization is inserted before body testing,
> > > my rules will stop matching.
> > >
> > > Do I have to rewrite the above 2 rules from [body] to [rawbody]?
> >
> > There are two possibilities.
> >
> > (1) rewrite from BODY to RAWBODY, as Matsuda-san says.
> > (2) invent NBODY (or something else) apart from BODY. NBODY would
> >     contain the normalized and tokenized version of the body. I once
> >     thought of this idea but did not propose it, because BODY has the
> >     problems I mentioned above and the overhead of executing
> >     nbody_test increases.
>
> I want (2), for the reason of rule compatibility.

+1, agreed.

--j.
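
For illustration only, here is a rough sketch of how the two styles could
look side by side. The rule names and the katakana word are made up, the hex
escapes are simply the UTF-8 bytes of that word, and the "nbody" keyword is
the proposed rule type from option (2), not something SpamAssassin has today:

    # Existing style: match a katakana phrase ("スパム") byte-by-byte in
    # hex notation (shown here as its UTF-8 bytes) against the body text.
    body      LOCAL_JP_SPAM_HEX   /\xe3\x82\xb9\xe3\x83\x91\xe3\x83\xa0/
    describe  LOCAL_JP_SPAM_HEX   katakana "spam" written in hex notation
    score     LOCAL_JP_SPAM_HEX   2.0

    # Option (2): a hypothetical "nbody" rule type run against the
    # normalized (and tokenized) body, so the same phrase can be typed
    # directly in a UTF-8 aware editor.  "nbody" is only a proposal here.
    nbody     LOCAL_JP_SPAM_UTF8  /スパム/
    describe  LOCAL_JP_SPAM_UTF8  katakana "spam" written directly in UTF-8
    score     LOCAL_JP_SPAM_UTF8  2.0

Keeping "body" untouched and adding a separate keyword is what preserves
compatibility: existing rules keep matching the unmodified text, while new
rules can opt in to the normalized form.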
