Michael Van Canneyt wrote:
> You are mixing 2 things. There is the actual string content, and there is the
> string metadata. The metadata is something that would apply for flyweight
> pattern. There is nothing to be gained by putting the metadata in an object,

This is true, up to a point.

And that point arrives when you want to do more with a TCharacter.

Say you're doing text processing, display and all. You would definitely want to derive a new class from TCharacter, call it, say, TWPCharacter, that carries all sorts of other properties: color, style, font, size, etc.

This would make life immensely easier for jobs where a character needs more attributes than the base class provides.
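
As a rough illustration of that idea, here is a minimal sketch in Python (used only for brevity; nothing here is an existing FPC API). TCharacter and TWPCharacter are the names used above; the fields and their default values are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class TCharacter:
    """Hypothetical base class: one character plus its language tag."""
    value: str
    language: str = "en"


@dataclass
class TWPCharacter(TCharacter):
    """Hypothetical word-processor character with display attributes."""
    color: str = "black"
    style: str = "regular"
    font: str = "serif"
    size: int = 12


# The derived class is still usable wherever a TCharacter is expected.
ch = TWPCharacter("ş", language="tr", color="red", size=14)
print(isinstance(ch, TCharacter), ch.language, ch.color, ch.size)
```

The base class stays small, and only the code that needs the display attributes pays for them.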

> since there is only the encoding. Storing the encoding in an object is
> ridiculous and a waste of heap space. a 2 byte encoding is less wasteful
> than a 4 or 8 byte object pointer.

I am afraid I do not agree with this at all. Or rather, it comes across as a very ANSI-centric view.

You definitely need a 'language' attribute for a character.

'Locale' does not cut it, simply because you can have mixed text, i.e. portions that belong to different languages.

Some unusual characters in text under my locale (say, Turkish) do not necessarily mean that that piece of the string is in another language; it may well be a transcription of /my/ name in a different character set (say, Greek).

Yet we all know that (upper-, lower-, title-) casing has nothing to do with the encoding; nor does collation order, etc.

In the above example, I used Turkish and Greek {what an unfortunate pairing, some might say :) } on purpose:

Both follow their own case folding rules as well as their own collation orders, and both of those depend on a language attribute/property.
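
To make the casing point concrete, here is a small sketch in Python (chosen for brevity). The helper upper_for and the language codes are hypothetical; only the Turkish dotted/dotless i rule is shown, and a real implementation would take its rules from the Unicode special-casing data rather than a hand-written table.

```python
# Turkish distinguishes dotted and dotless i, so upper-casing needs the language.
TURKISH_UPPER = {"i": "\u0130", "\u0131": "I"}  # i -> İ, ı -> I


def upper_for(text: str, language: str) -> str:
    """Upper-case text using the rules of the given language (hypothetical helper)."""
    if language == "tr":
        return "".join(TURKISH_UPPER.get(c, c.upper()) for c in text)
    return text.upper()


# Same characters, same encoding, different result per language.
print(upper_for("istanbul", "en"))  # ISTANBUL
print(upper_for("istanbul", "tr"))  # İSTANBUL
```

Nothing in the encoding of "istanbul" tells the two cases apart; the language has to come from somewhere else.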

Without a language attribute, how would you handle these sorts of issues?

Using a parallel byte array?

Really?

Wouldn't it be a lot more humane to us developers if the TCharacter had properties such as

-- Language
-- CollationOrder
-- UpperCase
-- LowerCase
-- TitleCase

where, on setting the Language property, all the others get filled with their correct values and are read-only.
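
A minimal sketch of that behaviour, again in Python for brevity and not as a proposed FPC interface: the class layout, the property names in snake_case, and the tiny Turkish tables are all assumptions for illustration. CollationOrder and TitleCase are left out; they would be derived in the language setter in the same way.

```python
class TCharacter:
    """Hypothetical character object; setting language derives the rest."""

    # Illustrative per-language exceptions, not a complete table.
    _UPPER = {"tr": {"i": "\u0130", "\u0131": "I"}}
    _LOWER = {"tr": {"I": "\u0131", "\u0130": "i"}}

    def __init__(self, value: str, language: str = "en"):
        self._value = value
        self.language = language  # runs the setter below

    @property
    def language(self) -> str:
        return self._language

    @language.setter
    def language(self, lang: str) -> None:
        # Changing the language recomputes every dependent property.
        self._language = lang
        v = self._value
        self._upper = self._UPPER.get(lang, {}).get(v, v.upper())
        self._lower = self._LOWER.get(lang, {}).get(v, v.lower())

    @property
    def upper_case(self) -> str:  # read-only: no setter is defined
        return self._upper

    @property
    def lower_case(self) -> str:  # read-only: no setter is defined
        return self._lower


c = TCharacter("i", "en")
print(c.upper_case)  # I
c.language = "tr"
print(c.upper_case)  # İ
```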

Cheers,
Adem
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
