On Tuesday 01 July 2008 22.23:12 Marc Weustink wrote:
> Martin Schreiber wrote:
> > On Tuesday 01 July 2008 18.32:30 Mattias Gärtner wrote:
> >>> In this routines length(widestring), widestring[index], pwidechar^,
> >>> pwidechar[index], pwidechar + offset, pwidechar - pwidechar and
> >>> inc(pwidechar)/dec(pwidechar) are used often. This can't be done with
> >>> utf-8 strings.
> >>
> >> Ehm, do you know, that UTF-8 has the advantage, that many ascii
> >> functions work without change?
> >> For example ReplaceChar or searching a substring?
> >
> > Sure, but for layout calculation and the like we need fast access to
> > codepoints.
>
> The only way to be sure is using utf-32 in this case. (or not supporting
> unicode)
>

I'd like to repeat: we are talking about the MSEgui framework here, not
about the FPC RTL or FCL. In MSEgui we need fast internal string and
character handling routines which support UCS-2. UCS-2 is enough even
for the single active Chinese user I know of. I don't want to slow down
MSEgui for 100% of the MSEgui users because of the theoretical
possibility that someone needs code points which don't fit into the
base plane.

If someone needs the whole Unicode range, he can use surrogate pairs.
They will not be displayed correctly on screen, but all other tasks can
be done. It is the same situation as with ansistring/utf8string.

Using 16 bits instead of 8 bits as the storage base of the MSEgui string
representation has the big advantage that 100% of the MSEgui users can
access characters by a simple linear index. Because MSEgui is mainly
used by Russian-speaking people, with 8-bit storage this would probably
be less than 20%. Most of the European users would be out of luck too,
because of the umlauts and accents.

Another need of the MSEgui users and the MSEgui routines is converting
the internal string representation to the current 8-bit system encoding.
FPC already supports this perfectly through the widestringmanager. Xlib
and gdi both have a widestring interface.
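To make the indexing argument concrete, here is a minimal sketch. It is written in Python purely as a neutral illustration and is not MSEgui or FPC code. The helper names (utf16_units, utf8_units, is_surrogate) and the Russian sample word are assumptions made up for this example. It counts storage units per string and shows where a surrogate pair appears.

```python
# Illustration only: Python stands in for the Pascal string types discussed
# above. All helper names here are hypothetical.

def utf16_units(text):
    """Return the 16-bit code units of text (UTF-16, little endian, no BOM)."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def utf8_units(text):
    """Return the 8-bit code units (bytes) of text encoded as UTF-8."""
    return list(text.encode("utf-8"))


def is_surrogate(unit):
    """True if a 16-bit unit is one half of a surrogate pair."""
    return 0xD800 <= unit <= 0xDFFF


samples = [
    "Gärtner",      # Latin text with an umlaut
    "строка",       # Cyrillic text (assumed sample word)
    "\U00020000",   # one code point outside the base plane
]

for text in samples:
    u16 = utf16_units(text)
    u8 = utf8_units(text)
    print(
        "code points:", len(text),
        "| 16-bit units:", len(u16),
        "| 8-bit units:", len(u8),
        "| contains surrogates:", any(is_surrogate(u) for u in u16),
    )
```

For the first two samples the number of 16-bit units equals the number of code points, so unit index i addresses character i directly. With 8-bit units the counts differ as soon as a non-ASCII character appears, so a linear index no longer lands on character boundaries. The third sample is the case accepted as a limitation above: one code point occupies two 16-bit units.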
The only drawback I see is that there is no reference-counted FPC
widestring type on Windows at the moment. The upcoming new Delphi
version uses a simple reference-counted widestring as its string base
type too, AFAIK. So if FPC decides to implement a reference-counted
widestring on Windows for Delphi compatibility, it should be available
in OBJFPC mode too.

Conclusion: MSEgui, and probably most of the MSEgui users too, has no
need for a multi-encoding string type at the expense of slower code and
more memory consumption. A reference-counted widestring on Windows would
be enough.

Martin