Daniël Mantione wrote:
> On Wed, 16 Nov 2005, Tomas Hajny wrote:
>
>> Big overhead (double maintenance efforts for all targets supporting this
>> schisma). :-( I'd say it's better to successively identify the weak
>> points and address these case by case.
>
> I know, I'm all for abolishing Chinese (and perhaps Korean), the only
> language(s) that absolutely cannot be written in with an 8 byte code.....

Well, I didn't suggest abolishing any languages (or maybe... - C#? ;-) ).

> However, it won't work that way in this world and also people in other
> scripts tend to like Unicode.
>
> Now, take for example the FCL. Tstringlist uses ansistrings, for example.

This is one particular weak point which needs to be addressed, IMHO.

> Now there are several solutions:
> 1. Make a Twidestringlist
> 2. Make Tstringlist use widestrings internally and add methods for both
>    ansistrings and widestrings.
> 3. Make an FCL with ansistrings and an FCL with widestrings.
>
> I have been convinced that 3 causes the least maintenance trouble and the
> least overhead for people that don't need it. The reason is strings are
> used everywhere in about any library. It cannot be reduced to a few weak
> points :/

You're right that strings are used everywhere, but I don't think this
really means that you need to add special support for widestrings
everywhere. In many places you can already pass a DBCS/MBCS string (e.g.
encoded using UTF-8) today and it wouldn't cause any harm. From my point of
view, you need special support mainly for sort operations (which includes
your TList) and for visual classes (length of text for controls, etc.). In
addition, you certainly need proper routines for I/O.

However, e.g. your particular example in the forum discussion is IMHO
conceptually wrong. Turning a string around just cannot be performed this
way (this is unsupported by design for DBCS/MBCS texts; not even mentioning
the fact that the example is "somewhat" artificial). People who want to
perform such an operation need to analyse and design the implementation
properly, probably by translating the ansistring to a widestring first in
this case. How this translation is performed is another question and it
depends on the programmer's decision.
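
To illustrate the reversal point, here is a minimal sketch. It is written in
Python rather than Pascal purely for brevity, and the names reverse_bytes and
reverse_text are made up for this example. The byte string plays the role of
the ansistring, and the decoded text plays the role of the widestring.

```python
def reverse_bytes(data: bytes) -> bytes:
    """Naive reversal: treats every byte as one character."""
    return data[::-1]


def reverse_text(data: bytes, encoding: str) -> bytes:
    """Decode to characters first, reverse those, then encode back.

    The encoding is the caller's decision (assumption: the caller knows it).
    """
    return data.decode(encoding)[::-1].encode(encoding)


# "Daniël" encoded as UTF-8; the "ë" (U+00EB) takes two bytes.
sample = "Dani\u00ebl".encode("utf-8")

# Plain single-byte text survives the naive approach...
assert reverse_bytes(b"abc") == b"cba"

# ...but a reversed multi-byte sequence is no longer valid UTF-8.
try:
    reverse_bytes(sample).decode("utf-8")
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False

# Translating to characters first keeps the multi-byte character intact.
proper = reverse_text(sample, "utf-8").decode("utf-8")
print(naive_ok, proper)
```

Even the second variant only reverses code points, so texts using combining
characters would still need more careful design.
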
It could be that the string already _is_ a UCS2 string (and "translation"
to widestring means that you just copy it byte by byte), it could be UTF-8,
and it could even be a simple string created in a particular codepage
(SBCS). This is the programmer's decision (a trade-off between the widest
support and the best performance), the same way that he has to decide
whether he'd use multi-platform APIs or the native API of a particular
platform, or whether he'd use/import the XxxxW or XxxxA API function for his
Win32 application.

Maybe I'm still overlooking the real issues. Please give me more concrete
examples which cannot be resolved at the moment; we could discuss them (and
then possibly come to the conclusion that a separate RTL would be
better/necessary).

Tomas

_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel