Josiah Carlson:
> From what I have noticed in my 3-4 years of working with Scintilla, its
> underlying native representation is UTF-8 in GTK and Windows,

The underlying native representation is defined by the code page and character set properties. It can be multi-byte, including UTF-8, Shift-JIS and Big5, or single-byte, such as Latin-1 or KOI8-R.

The regular expression code doesn't really understand these encodings, so it matches multi-byte encodings by matching byte strings: it doesn't try to ensure that characters align or that character ranges above ASCII like "[Γ-Ξ]" work.

It would be asking too much to require new regular expression code to handle all the encodings correctly, but it shouldn't behave worse.

Neil
_______________________________________________
Scintilla-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scintilla-interest
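
To illustrate the byte-string matching described above, here is a minimal sketch using Python's `re` module as a stand-in. It is not Scintilla's regular expression code, and the sample text and variable names are assumptions chosen for the example. It compares the range "[Γ-Ξ]" applied to decoded characters with the same pattern applied to the raw UTF-8 bytes.

```python
import re

# Hypothetical sample: alpha, capital delta, omega. Only the capital delta
# lies inside the character range Gamma..Xi.
text = "αΔω"
pattern = "[Γ-Ξ]"

# Character-aware matching: the range is interpreted over code points.
char_matches = re.findall(pattern, text)

# Byte-wise matching: both the pattern and the text are treated as UTF-8
# byte strings. The class becomes [\xce\x93-\xce\x9e], which the engine
# reads as the byte 0xCE, the byte range 0x93..0xCE, and the byte 0x9E.
byte_text = text.encode("utf-8")
byte_pattern = pattern.encode("utf-8")
byte_matches = re.findall(byte_pattern, byte_text)

print(char_matches)
print(byte_matches)
```

In the byte-wise case each match is a single byte rather than a whole character, and bytes belonging to "α" match even though "α" is outside the intended range.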
