On 20 Nov 2008, at 13:13, Graeme Geldenhuys wrote:
I think basing those functions on code points should suffice. I also think that as soon as strings are assigned or loaded from a file, they should be normalized. So a sequence of two code points, like 'A' followed by a combining umlaut, would become a single code point.
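To make the composed/decomposed distinction concrete: the thread is about the FPC RTL, but the same normalization behaviour can be sketched with Python's standard `unicodedata` module. NFC composes a base letter plus combining mark into one code point; NFD splits it back apart.

```python
import unicodedata

# 'o' followed by U+0308 COMBINING DIAERESIS: two code points, renders as "ö"
decomposed = "o\u0308"
# NFC normalization composes them into the single code point U+00F6
composed = unicodedata.normalize("NFC", decomposed)

assert len(decomposed) == 2
assert composed == "\u00f6" and len(composed) == 1
# NFD normalization goes the other way, decomposing U+00F6 again
assert unicodedata.normalize("NFD", "\u00f6") == decomposed
```

Both sequences display identically; only the underlying code point sequence differs, which is exactly why a naive code-point comparison treats them as different strings.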
How would one know which code points were originally decomposed and which weren't? Should it be impossible to save a file that demonstrates the different possible Unicode representations of e.g. ö, and should a loaded file which contained both representations really have all of them silently composed or decomposed when it is saved again?
I know of no Unicode-aware text editor that automatically changes the representation of pre-existing characters when saving a document, and I would never want to use a text editor which does that by default.
The .SaveToFile() methods could take an optional parameter to decide whether the normalized version of the string gets saved, or whether it must be decomposed again - which I think Mac OS X prefers.
It doesn't. All OS functions that return file/path names return decomposed (UTF-8) strings. They accept both composed and decomposed strings. Text files are text files and can have any encoding you want, with any combination of composed and decomposed characters.
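A consequence of the above is that a file name returned by the OS (decomposed) may not compare byte-for-byte equal to the "same" name typed by a user (usually composed), even though the OS accepts both. A minimal Python sketch of the pitfall, using a hypothetical file name for illustration:

```python
import unicodedata

# As a Mac OS X API might return it: NFD, with 'e' + U+0301 COMBINING ACUTE
from_os = "re\u0301sume\u0301.txt"
# As a user would likely type it: NFC, with the precomposed U+00E9 'é'
typed = "r\u00e9sum\u00e9.txt"

# The raw code point sequences differ...
assert from_os != typed
# ...but after normalizing both to the same form they compare equal
assert unicodedata.normalize("NFC", from_os) == unicodedata.normalize("NFC", typed)
```

This is why comparisons should normalize both operands (to either form) rather than forcing one canonical form onto stored data.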
Jonas
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel