bug#16912: [PATCH] no longer use CSET for non-UTF8 locale in DFA engine

Paolo Bonzini Tue, 04 Mar 2014 07:51:44 -0800

On 03/03/2014 07:13, Paul Eggert wrote:

> Norihiro Tanaka wrote:
>
>> However I don't understand why the optimization isn't completed on
>> non-UTF8 locale only.  Can you explain it?
>
> Sorry, no; there's a lot about that code I don't yet understand.

IIRC it's because a CSET matches any byte, while the corresponding
MBCSET only matches that byte if it is a single-byte character.  So for
example, say "\x83A" is a two-byte character.  The CSET "A" will match
it but the corresponding MBCSET will not.


This can happen in the Shift-JIS encoding.
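
To make the difference concrete, here is a small sketch in Python (not the dfa.c code). It assumes that Python's "shift_jis" codec is a fair stand-in for a Shift-JIS locale, and that a plain byte search and a search over decoded characters behave roughly like CSET and MBCSET matching respectively. The variable names are made up for the illustration.

```python
# Illustration only: this is not how grep's DFA engine is implemented.
# Assumption: byte-level search ~ CSET, character-level search ~ MBCSET.

data = b"\x83A"  # the bytes 0x83 0x41, one two-byte character in Shift-JIS

# Byte-wise view: the byte 0x41 ("A") occurs in the input, so a
# byte-oriented set containing "A" matches it.
byte_match = b"A" in data

# Character-wise view: decode first, then look for the character "A".
# The trail byte 0x41 is not a character on its own, so nothing matches.
text = data.decode("shift_jis")
char_match = "A" in text

print(byte_match, char_match, len(text))
```

The byte search finds "A" while the character search does not, which is the mismatch described above. In UTF-8 an ASCII byte never appears inside a multibyte sequence, so this particular case cannot arise there.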

Paolo
