Marco van de Voort wrote:
In our previous episode, Hans-Peter Diettrich said:
"non-native" strings, it can also be a performance win).
IMO a single encoding, i.e. UTF-8, can cover all cases.
Well, for starters, it doesn't cover the existing Delphi/unicode codebase.
Because it's bound to UTF-16? That's not a problem, because WideString will continue to exist, and according conversions are still inserted by the compiler.
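A minimal sketch of those compiler-inserted conversions, assuming FPC 3.x codepage-aware strings and a UTF-8 encoded source file:

  program ConvDemo;
  {$mode objfpc}{$H+}
  {$codepage utf8}
  var
    U8: UTF8String;
    W: WideString;
  begin
    U8 := 'Ähm';  // stored as UTF-8 bytes
    W := U8;      // compiler inserts a UTF-8 -> UTF-16 conversion here
    U8 := W;      // ... and the reverse conversion here
  end.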

That is DIY compatibility, or, in other words, no compatibility.

I still don't understand the problem :-(

WideString will also grind the application to a halt, due to being COM-based on Windows.

How so?
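For context, the usual reason given: on Windows a WideString is a COM BSTR, which is not reference counted, so every assignment copies the whole payload, while UnicodeString assignments only bump a reference count. A minimal sketch, assuming FPC 3.x on Windows:

  program WideCost;
  {$mode objfpc}{$H+}
  var
    W1, W2: WideString;    // on Windows: a COM BSTR, not reference counted
    U1, U2: UnicodeString; // reference-counted UTF-16
  begin
    SetLength(W1, 1000000);
    W2 := W1;              // full copy of all 2 MB (SysAllocStringLen)
    SetLength(U1, 1000000);
    U2 := U1;              // only increments the reference count
  end.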


When the system encoding changes with the target platform, indexed access to such strings can lead to different results. Unless the compiler can read the coder's mind...

You don't have to. The Delphi model provides a string type for the system
encoding, so all strings coming from the system can be labeled as such. For
other string types, the necessary conversions can then be emitted.

Indexed string access produces different results under an ANSI and a UTF-8 system encoding. Such code is not portable, and neither is the data (INI files). Allowing UTF-8 as the system encoding will frustrate Windows users (dunno whether Windows allows such a system encoding), and Linux users are frustrated when UTF-8 is disallowed.
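A minimal sketch of the byte-indexing problem (the literal is written as explicit byte codes, so the source file's codepage does not matter):

  program IndexDemo;
  {$mode objfpc}{$H+}
  var
    S: AnsiString;
  begin
    // The text 'Ähm' under two system encodings:
    //   CP1252: bytes C4 68 6D    -> S[1] = 'Ä', S[2] = 'h'
    //   UTF-8:  bytes C3 84 68 6D -> S[1] and S[2] are the two bytes of 'Ä'
    S := #$C3#$84'hm';   // the UTF-8 form
    WriteLn(Ord(S[2]));  // prints 132 ($84), not the 'h' a CP1252 user expects
  end.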

The only solution: using the OS encoding restricts the code to running on a single machine, or on similarly configured machines.

The group of users who accept this restriction will be happy with a single AnsiString type and no implicit conversions. Without implicit conversions, such a string type can hold UTF-8 as well.
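That is roughly what FPC 3.x's RawByteString (an AnsiString with codepage CP_NONE) offers: assignments into it keep the byte content as-is, without an implicit codepage conversion. A sketch:

  program RawDemo;
  {$mode objfpc}{$H+}
  var
    Raw: RawByteString;
  begin
    Raw := #$68#$C3#$A9'llo';  // the UTF-8 bytes of 'héllo', kept verbatim
    WriteLn(Length(Raw));      // 6: the 'é' occupies two bytes
  end.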


Likewise, e.g. Win32 console routines can be labeled with OEMString (since
Windows uses a different default encoding for the console).

This implies either OEM encoding as the system encoding of Win32 console applications, or the use of multiple codepages, as before. But IMO the Win32 console also implements a "W" interface, so it's up to the user to use whatever is more appropriate for his code.
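A sketch of such labeling, assuming FPC 3.x codepage-aware AnsiStrings (OEMString as written here is a hypothetical name, not an existing RTL type):

  program OEMDemo;
  {$mode objfpc}{$H+}
  type
    OEMString = type AnsiString(CP_OEMCP);
  var
    S: OEMString;
  begin
    S := 'console output';  // assignments convert into the OEM codepage
    // Handing S to an "A" console routine now matches the console's encoding.
  end.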

The RTL has to distinguish between the system-wide "filesystem" and "GUI" encodings in file handling (CreateFile...).
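A sketch of the Windows side, where the "W" file API takes UTF-16 independently of any ANSI or OEM codepage (OpenForReading is a hypothetical helper, not an RTL routine):

  uses Windows;

  function OpenForReading(const FileName: UnicodeString): THandle;
  begin
    // CreateFileW bypasses the ANSI codepage entirely
    Result := CreateFileW(PWideChar(FileName), GENERIC_READ, FILE_SHARE_READ,
      nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  end;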


Why spend time in the design of multiple RTL/LCL versions, when a single version will be perfectly sufficient?
Why spend 13 years being compatible when you can throw it away in a
second?
It's sufficient to throw away what's no more needed :-)

The previous message from Jeff shows that even ShortString is still in major
production use. Nothing is so unused that it can be clipped without a
long-winded transition, or Delphi 2009-like painful breaks.

It's all about the well-known dilemma:
- force (possibly many) implicit conversions, or
- supply multiple RTL/LCL versions, or
- break legacy user code by moving to a different (but again unique) string type.

Moreover, these discussions are useless, since you know as well as I do that
no single string type will ever satisfy everybody. So IMHO it is time to
draw the consequences from the 500 posts on the Unicode subject on this and
other FPC/Lazarus lists, and start thinking about solutions to manage that,
instead of reiterating the "one type to rule them all" mantra ad infinitum.

The discussion is only about the pros and cons of the various possible solutions, i.e. it should reveal the critical cases and consequences that have to be considered and handled in every implementation.

The implementation can choose any model. Different models can be implemented as well, so that the final decision about the new standard can be delayed until the models have been tested in real-world applications.

One model has already been implemented: UTF-8. It may need some additions/improvements, like a *hard* separation of AnsiString from UTF8String, and nothing has to be thrown away.
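A sketch of what such a hard separation could look like, with explicit conversions only (the helper names are assumptions, not existing RTL functions):

  function AnsiToUTF8Strict(const S: AnsiString): UTF8String;
  begin
    Result := UTF8Encode(UnicodeString(S));  // via UTF-16 as a pivot
  end;

  function UTF8ToAnsiStrict(const S: UTF8String): AnsiString;
  begin
    Result := AnsiString(UTF8Decode(S));
  end;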

DoDi
