On 14/08/17 22:01, Juha Manninen via Lazarus wrote:
> Tony Whyman, this issue has been discussed again and again for the
> past 10+ years first in FPC mailing lists and then in Lazarus lists.
> The current Unicode support in Lazarus works f***ing well and is
> amazingly compatible with Delphi.
> WinAPI parameters may require an explicit temporary UnicodeString
> variable but even then the code is compatible with Delphi.
>
> Tony Whyman, Marcos Douglas and Michael Schnell, please study the facts.
> For starters, this is about the current Unicode support in Lazarus:
>    http://wiki.freepascal.org/Unicode_Support_in_Lazarus
> I think the dynamic encoding and automatic conversion now work perfectly well.
> If you have a piece of code where it does not work, please ask for
> detailed info.

If a topic keeps on being discussed after 10+ years of argument, the reason is usually either (a) the problem and its solution have not been documented properly, or (b) the outcome is an unsatisfactory compromise.

In this case, I would argue that both are true.

I went back and read the wiki article you mentioned and was none the wiser as to why the current mess exists. Is it really just that, because Delphi continues to screw up in this area, FPC must too? The body of the article appears to be a set of notes - not necessarily wrong in themselves, but lacking the background and context needed to explain why things are the way they are.

This problem will keep coming up until it is fixed properly and, by that, I mean that the solution is consistent, intuitively understandable and well documented. Windows eccentricities also need to be kept to Windows.

Here is my wish list:

1. Stop using the term "Unicode".

   It is too ambiguous. It is used both as an all-embracing term for
   multi-byte encoding and as a synonym for UTF16, and that is really
   too confusing. The problem is made worse by having UnicodeString as
   a two byte wide string type in both FPC and Delphi.


2. Clean up the char type.

   When Wirth created the "char" type in Pascal it was a simple ASCII
   or EBCDIC character. There are now seven different char types
   (including type equivalence) with no guidelines on when each is
   applicable. This is too many. Why shouldn't there be a single char
   type that intuitively represents a single character regardless of
   how many bytes are used to represent it? Yes, in a world where we
   have to live with UTF8, UTF16, UTF32, legacy code pages and Chinese
   variations on UTF8, that means that dynamic attributes have to be
   included in the type. But isn't that the only way to have consistent
   and intuitive character handling?


3. The problem with string handling today is that it is not based on a consistent approach to the character type.

   If you clean up character handling then the model for string
   handling should become obvious. A string is after all no more than a
   container for a character array, all of whose elements should be
   constrained to have the same character encoding. A string should intuitively
   represent a string of text regardless of how many bytes are used to
   represent each character and with dynamic attributes to tell you how
   it is encoded.


4. FPC should clean up Delphi's mess for it. If a unified string type follows a consistent model then it should be possible to make all Delphi string types synonyms for it.

   You will need to allow exceptions for legacy programs that insist on
   manipulating the bytes themselves - but that is not rocket science.
   There is also the issue of the Windows API and its insistence on
   Wide Strings - but isn't that why calling conventions such as cdecl
   and stdcall exist - to tell the compiler when it needs to reformat
   the call for a given API convention?
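
As a rough illustration of points 1 to 3, here is a small sketch. It is
written in Python rather than Pascal, purely so that it can be run
anywhere, and it is not a proposal for actual FPC syntax. It shows that
the same text has a different "length" in every encoding, and what a
string that carries its encoding as a dynamic attribute might look like.
EncodedText, chars and convert are hypothetical names made up for this
example; they do not exist in FPC, Delphi or the Python library. Note
also that the sketch treats one code point as one character, which is
itself a simplification (combining sequences are ignored).

```python
from dataclasses import dataclass

# "naïve" plus a space and an emoji outside the Basic Multilingual Plane.
TEXT = "na\u00efve \U0001F600"


def unit_counts(text: str) -> dict:
    """Count how many storage units the same text needs per encoding."""
    return {
        "code points": len(text),
        "utf-8 bytes": len(text.encode("utf-8")),
        "utf-16 units": len(text.encode("utf-16-le")) // 2,
        "utf-32 units": len(text.encode("utf-32-le")) // 4,
    }


@dataclass(frozen=True)
class EncodedText:
    """Hypothetical string type: raw bytes plus a dynamic encoding attribute."""

    data: bytes
    encoding: str

    def chars(self) -> list:
        # Each element is one code point, however many bytes it occupied.
        return list(self.data.decode(self.encoding))

    def __len__(self) -> int:
        # Length in characters (code points), not in bytes or code units.
        return len(self.chars())

    def convert(self, encoding: str) -> "EncodedText":
        # Re-encode the same text; only the byte representation changes.
        text = self.data.decode(self.encoding)
        return EncodedText(text.encode(encoding), encoding)


counts = unit_counts(TEXT)
as_utf8 = EncodedText(TEXT.encode("utf-8"), "utf-8")
as_utf16 = as_utf8.convert("utf-16-le")

print(counts)
print(len(as_utf8), len(as_utf16), len(as_utf8.data), len(as_utf16.data))
```

The point of the sketch is only that the character count and the
character sequence stay the same when the encoding changes, while the
number of bytes does not; the caller never needs to know which
representation is in use unless it asks for the raw data.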

Tony Whyman



-- 
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus-ide.org
https://lists.lazarus-ide.org/listinfo/lazarus
