Josiah Carlson:

> From what I have noticed in my 3-4 years of working with Scintilla, its
> underlying native representation is UTF-8 in GTK and Windows, 

   The underlying native representation is determined by the code page
and character set properties. It can be multi-byte, including UTF-8,
Shift-JIS and Big5, or single-byte, such as Latin-1 or KOI8-R. The
regular expression code doesn't really understand these encodings, so it
matches multi-byte text as plain byte strings: it doesn't try to ensure
that matches align with character boundaries or that character ranges
above ASCII like "[Γ-Ξ]" work.
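
   To illustrate the two failure modes, here is a minimal sketch in
Python. It is not Scintilla's regular expression code, only a stand-in
that treats the encoded pattern and text as raw bytes, which is roughly
the situation described above. The helper names (match_chars,
match_bytes, finds_backslash) are made up for this example.

```python
import re

PATTERN = "[Γ-Ξ]"  # Γ is U+0393, Ξ is U+039E


def match_chars(text):
    """Character-aware matching: the range covers U+0393..U+039E."""
    m = re.search(PATTERN, text)
    return m.group() if m else None


def match_bytes(text, encoding="utf-8"):
    """Byte-wise matching: pattern and text are encoded, then compared
    as bytes. In UTF-8 the class becomes [\\xce\\x93-\\xce\\x9e], that is
    the byte 0xCE, the byte range 0x93..0xCE, and the byte 0x9E."""
    m = re.search(PATTERN.encode(encoding), text.encode(encoding))
    return m.group() if m else None


def finds_backslash(text, encoding):
    """Byte-wise search for a literal backslash (0x5C) in encoded text."""
    return re.search(rb"\\", text.encode(encoding)) is not None


print(match_chars("Δ"))   # Δ      (inside the range)
print(match_chars("é"))   # None   (outside the range)
print(match_bytes("Δ"))   # b'\xce' (only the lead byte of Δ)
print(match_bytes("é"))   # b'\xc3' (false positive on a lead byte)
print(finds_backslash("ソ", "shift_jis"))  # True, trail byte is 0x5C
```

   The byte-wise range matches half of "Δ" and also matches the lead
byte of an unrelated character, and in Shift-JIS a search for backslash
hits the trail byte of "ソ" (0x83 0x5C), so the match does not align
with a character.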

   It would be asking too much to require new regular expression code to
handle all the encodings correctly, but it shouldn't behave worse than
the current code.

   Neil
_______________________________________________
Scintilla-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scintilla-interest
