In our previous episode, Jonas Maebe said:
> > There has been a lot of discussion about this problem. What happens is
> > that FPC wishes to always have ansistrings holding system locale
> > encoded strings, it's impossible to have strings which store utf-8
> > data as far as FPC is concerned.
>
> And the reason is that
> a) if you mix system and non-system encodings in ansistrings, then a
> bunch of string conversions between ansistrings and widestrings will
> go horribly wrong
> b) if you only use a particular non-system encoding for ansistrings,
> then interfacing with OS routines will break down completely
>
> It is possible to solve b) by manually adding necessary extra string
> conversions everywhere in the RTL where ansistrings are passed to OS
> routines, but that is a lot of work (both to implement and to
> maintain) and very error prone. Then it's indeed much cleaner to
> simply introduce a new string type which does not have to be
> compatible with the OS encoding.

Tiburon's solution is the same as Florian's original solution for the
multi unicode string type TUnicodeString (which for now is still UTF-16
only): add an encoding field to ansistring, and let an ansistring type
declaration specify an encoding:

  Type TUtf8String = ansistring (cp_UTF8);

This way you can explicitly flag anything internal as UTF-8, and
communicate with the outside 1-byte world using the native codepage
(which might be UTF-8 too, if desired).

The solution has Windows written all over it (including viewing UTF-8 as
a codepage), but it has merits IMHO.
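
A minimal sketch of how such a declaration might be used. It assumes the
syntax and helpers of compilers that implement this scheme (Delphi 2009
and Free Pascal 3.0 or later), where the declaration is spelled with an
extra "type" keyword and StringCodePage reports the encoding field.
The program and variable names are made up for illustration:

```pascal
program cpdemo;
{$mode objfpc}{$H+}

type
  // Assumption: later-compiler spelling of the declaration in the post.
  TUtf8String = type AnsiString(CP_UTF8);

var
  internal: TUtf8String;  // flagged as UTF-8 for internal use
  native: AnsiString;     // uses the default system codepage
begin
  internal := 'hello';

  // Assigning between differently flagged ansistrings lets the compiler
  // insert the conversion, instead of hand-written calls in the RTL.
  native := internal;

  // Print the encoding field carried by each string.
  WriteLn('internal codepage: ', StringCodePage(internal));
  WriteLn('native codepage:   ', StringCodePage(native));
end.
```

The point of the design is that the conversion at the boundary to OS
routines is driven by the declared type, so a) and b) above no longer
have to be traded off against each other.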