Re: synedit patch from ales

Ales Katona Fri, 25 Jan 2008 08:36:39 -0800

Mattias Gärtner wrote:


The character sets in synedit are 'set of char', which means only 8-bit.
So, I guess the patch tries to fix an ANSI codepage accented chars problem,
right?
The fix is probably useless on other codepages including UTF-8, right?


Not as such. The problem is twofold.

1. If we ignore encoding (i.e. just work in ANSI space), then the old style was simply wrong. It only allowed alpha (not numeric) chars, and worked on the principle of "what's not alpha isn't a word".

2. If we also consider UTF-8 encoded content, then getting words by boundaries (i.e. not-allowed chars) rather than by allowed chars means that, as long as the given boundaries and whitespace chars are < 127 (which the default ones are), UTF-8 words will be parsed correctly, even if they contain special multibyte chars.
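
To illustrate both points, here is a rough sketch in Python rather than in SynEdit's own Pascal. It is not the patch itself: the names BOUNDARIES, ALPHA, words_by_boundaries and words_by_allowed are made up for this example, and the boundary set is only a guess at what a default might look like. The sketch relies on the fact that every byte of a multibyte UTF-8 sequence is 128 or higher, so such bytes can never collide with an ASCII boundary.

```python
# Hypothetical boundary set: ASCII whitespace and punctuation only,
# so every boundary byte is below 127.
BOUNDARIES = frozenset(b" \t\r\n.,;:!?()[]{}<>=+-*/\"'")

# Old-style allowed set for comparison: ASCII letters only (no digits).
ALPHA = frozenset(b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")


def _split(data: bytes, is_separator) -> list[bytes]:
    """Collect runs of bytes for which is_separator(byte) is False."""
    words = []
    current = bytearray()
    for byte in data:  # iterating bytes yields ints in 0..255
        if is_separator(byte):
            if current:
                words.append(bytes(current))
                current.clear()
        else:
            current.append(byte)
    if current:
        words.append(bytes(current))
    return words


def words_by_boundaries(data: bytes) -> list[bytes]:
    """New style: a word is anything between boundary bytes."""
    return _split(data, lambda b: b in BOUNDARIES)


def words_by_allowed(data: bytes) -> list[bytes]:
    """Old style: a word is a run of allowed (alpha) bytes."""
    return _split(data, lambda b: b not in ALPHA)


sample = "napísal(a) x1 Gärtner".encode("utf-8")
by_boundaries = [w.decode("utf-8") for w in words_by_boundaries(sample)]
by_allowed = words_by_allowed(sample)
```

With this sample, by_boundaries keeps "napísal", "x1" and "Gärtner" whole, while by_allowed cuts each of them at the digit or at the bytes of the accented character.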


I'm not sure whether #2 also applies to other encodings.

Ales


Mattias

_________________________________________________________________
     To unsubscribe: mail [EMAIL PROTECTED] with
                "unsubscribe" as the Subject
   archives at http://www.lazarus.freepascal.org/mailarchives


