Re: [Lazarus] String vs WideString

Tony Whyman via Lazarus Tue, 15 Aug 2017 02:16:30 -0700

On 14/08/17 22:01, Juha Manninen via Lazarus wrote:

Tony Whyman, this issue has been discussed again and again for the
past 10+ years first in FPC mailing lists and then in Lazarus lists.
The current Unicode support in Lazarus works f***ing well and is
amazingly compatible with Delphi.
WinAPI parameters may require an explicit temporary UnicodeString
variable but even then the code is compatible with Delphi.


Tony Whyman, Marcos Douglas and Michael Schnell, please study the facts.
For starters, this is about the current Unicode support in Lazarus:
   http://wiki.freepascal.org/Unicode_Support_in_Lazarus
I think the dynamic encoding and automatic conversion now work perfectly well.
If you have a piece of code where it does not work, please ask for
detailed info.

If a topic keeps on being discussed after 10+ years of argument, the reason is usually either (a) the problem and its solution have not been documented properly, or (b) the outcome is an unsatisfactory compromise.


In this case, I would argue that both are true.

I went back and read the wiki article you mentioned and was none the wiser as to why the current mess exists. Is it really for no better reason than that Delphi continues to screw up in this area, so FPC must too? The body of the article appears to be a set of notes - not necessarily wrong in themselves but lacking the background and context needed to explain why it is like it is.

This problem will keep coming up until it is fixed properly and, by that, I mean that the solution is consistent, intuitively understandable and well documented. Windows eccentricities also need to be kept to Windows.


Here is my wish list:

1. Stop using the term "Unicode".

   It is too ambiguous. It is used both as an all-embracing term for
   multi-byte encoding and as a synonym for UTF16, and that is really
   too confusing. The problem is made worse by having UnicodeString as
   a two byte wide string type in both FPC and Delphi.


2. Clean up the char type.

   When Wirth created the "char" type in Pascal it was a simple ASCII
   or EBCDIC character. There are now seven different char types
   (including type equivalence) with no guidelines on when each is
   applicable. This is too many. Why shouldn't there be a single char
   type that intuitively represents a single character regardless of
   how many bytes are used to represent it? Yes, in a world where we
   have to live with UTF8, UTF16, UTF32, legacy code pages and Chinese
   variations on UTF8, that means that dynamic attributes have to be
   included in the type. But isn't that the only way to have consistent
   and intuitive character handling?

3. The problem with string handling today is that it is not based on a consistent approach to the character type.


   If you clean up character handling then the model for string
   handling should become obvious. A string is, after all, no more than
   a container for a character array whose elements should be
   constrained to have the same character encoding. A string should intuitively
   represent a string of text regardless of how many bytes are used to
   represent each character and with dynamic attributes to tell you how
   it is encoded.

4. FPC should clean up Delphi's mess for it.

   If a unified string type follows a consistent model then it should
   be possible to make all Delphi string types synonyms.


   You will need to allow exceptions for legacy programs that insist on
   manipulating the bytes themselves - but that is not rocket science.
   There is also the issue of the Windows API and its insistence on
   Wide Strings - but isn't that why calling conventions such as cdecl
   and stdcall exist - to tell the compiler when it needs to reformat
   the call for a given API convention?
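
As an illustration of points 1 and 2, here is a minimal sketch of why "one char" and "one storage unit" stop being the same thing once several encodings are in play. It is not from the original discussion: Python is used only for brevity, and the helper name code_units is hypothetical. It counts how many storage units a single character occupies in UTF8, UTF16 and UTF32.

```python
def code_units(text: str) -> dict:
    """Return how many units `text` occupies in each encoding form.

    Hypothetical helper for illustration only.
    """
    return {
        "code points": len(text),
        # UTF-8 uses 1-byte units, so the byte count is the unit count.
        "UTF-8 units": len(text.encode("utf-8")),
        # UTF-16 uses 2-byte units; "-le" avoids a byte-order mark.
        "UTF-16 units": len(text.encode("utf-16-le")) // 2,
        # UTF-32 uses 4-byte units, one per code point.
        "UTF-32 units": len(text.encode("utf-32-le")) // 4,
    }


if __name__ == "__main__":
    # ASCII letter, Latin-1 letter, euro sign, emoji outside the BMP.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X}", code_units(ch))
```

The last sample needs four units in UTF8 and two in UTF16, so neither a one-byte nor a two-byte char type can hold it as a single element. That is the gap a char type carrying dynamic encoding attributes would have to close.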

Tony Whyman

-- 
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus-ide.org
https://lists.lazarus-ide.org/listinfo/lazarus
