On 16 Sep 2009, at 11:44, Michael Schnell wrote:

Jonas Maebe wrote:

Analysing strings by hand is not a very smart thing to do with unicode
strings.

How should it be avoided if I want to react to user input or to a
string read from a file?

Don't analyse them character by character, but use standard functions to compare them. Any unicode support library worth its salt will offer you many different ways to compare strings, because depending on the context you may need different ways:

a) the locale may matter (e.g., depending on whether "." means "decimal point" or "thousands separator", a comparison result may be different)

b) you have many different ways to order (unicode) strings. E.g., these are the options that Apple's CFString comparison offers: <http://developer.apple.com/mac/library/documentation/CoreFoundation/Reference/CFStringRef/Reference/reference.html#//apple_ref/doc/constant_group/String_Comparison_Flags> (note that not all of those flags are about regular comparisons, and some of them are just for performance reasons). See in particular flags such as kCFCompareNonliteral, kCFCompareWidthInsensitive and kCFCompareLocalized.
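
To make the difference between a literal and a non-literal comparison concrete, here is a minimal sketch. It uses Python's standard unicodedata module rather than CFString or FPC, and it assumes that normalising to NFC/NFKC is a reasonable stand-in for what flags like the ones above do; the real CFString behaviour may differ in details.

```python
import unicodedata

# Two spellings of "café": precomposed U+00E9 vs. "e" + combining acute U+0301.
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"

# Comparing code point by code point treats them as different strings.
literal_equal = precomposed == decomposed

# Comparing after canonical normalisation (NFC) treats them as equal.
# Assumption: this approximates a "non-literal" comparison.
nonliteral_equal = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

# Fullwidth letters U+FF21..U+FF23 vs. ASCII "ABC": compatibility
# normalisation (NFKC) folds the width difference away.
fullwidth = "\uff21\uff22\uff23"
width_insensitive_equal = unicodedata.normalize("NFKC", fullwidth) == "ABC"

print(literal_equal, nonliteral_equal, width_insensitive_equal)
```

Code that walks the string character by character sees only the first result.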

This indeed causes problems with Pascal's generic comparison operators. I guess we will either have to define a particular behaviour for them (presumably whatever CodeGear chose), add some global variable that you can set to influence the behaviour, or tell people to use CompareText() and friends (and probably add variants with various options).
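
As a rough illustration of what "variants with various options" could look like, here is a sketch in Python (not Pascal, and not a proposal for the actual RTL interface). The names CompareOption and compare_strings are hypothetical, the ordering is plain code point order after folding, and it is not locale-aware.

```python
import unicodedata
from enum import Flag, auto


class CompareOption(Flag):
    # Hypothetical option names, invented for this sketch only.
    NONE = 0
    CASE_INSENSITIVE = auto()
    NONLITERAL = auto()
    WIDTH_INSENSITIVE = auto()


def compare_strings(a: str, b: str,
                    options: CompareOption = CompareOption.NONE) -> int:
    """Return -1, 0 or 1 after applying the requested folding to both strings."""

    def fold(s: str) -> str:
        if CompareOption.WIDTH_INSENSITIVE in options:
            # NFKC folds width differences (and other compatibility
            # variants as well, so it is broader than width alone).
            s = unicodedata.normalize("NFKC", s)
        elif CompareOption.NONLITERAL in options:
            # NFC makes canonically equivalent sequences identical.
            s = unicodedata.normalize("NFC", s)
        if CompareOption.CASE_INSENSITIVE in options:
            s = s.casefold()
        return s

    x, y = fold(a), fold(b)
    # Plain code point ordering; a real library would use locale collation.
    return (x > y) - (x < y)
```

With no options this behaves like a literal comparison, so compare_strings("caf\u00e9", "cafe\u0301") is non-zero, while passing CompareOption.NONLITERAL makes the same call return 0.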

The upside of these complications (which have always existed, but most people just ignored them and their programs only worked with one or two locales and/or encodings) is that if you deal with them properly in the context of unicode, then your code will probably automatically behave "correctly" with many locales/scripts.


Jonas
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
