On 01/05/17 15:18, Juha Manninen via Lazarus wrote:
On Mon, May 1, 2017 at 12:30 PM, Tony Whyman via Lazarus
<lazarus@lists.lazarus-ide.org> wrote:
When I originally created the Firebird Pascal API package,
Now I realize your code may have been for FPC but not for Lazarus.
Even then the solution provided by LazUtils (2 files there) is good
because it allows compatible and portable code. Later when FPC's
UTF-16 support is ready, such code can be ported easily.

Juha
I assume that you mean that my code is non-visual, which is indeed where I am coming from. If you want to write an application that is LCL/VCL compatible, then that is another can of worms.

Your concluding remarks in your other post were:

>>I hope you find this a useful checklist.
It contained so much false information that it only confuses people.

I want to repeat that it is possible to write code dealing with
Unicode that is fully compatible with Delphi at source level.
It will be compatible with a future UTF-16 solution in Lazarus as well.
Encoding agnostic (UTF-8 / UTF-16) code is possible even if you must
iterate individual codepoints. See the wiki page for details.

Remember these to keep your code compatible:
  1. Normally use type "String".
  2. Assign a constant always to a type String variable.
  3. Use type UnicodeString explicitly for API calls that need it.
I am not sure how much your second post rows back from this, but I do think that "false" is a bit harsh.

You seem to be coming from a view that strings are strings and the compiler should be allowed to work out what is the appropriate string encoding for the local environment. All the programmer has to do is declare the type as "string" and all will be good. I guess that is your definition of portable code: it is agnostic as regards the string encoding.

I am coming from a much messier perspective that says a portable program has to deal with whatever string encoding is thrown at it. It may be valid criticism to say that I was taking a particularly messy example and deriving generic rules from it - but few programs work in a vacuum and it is worth being aware of real world problems.

In my case, the real world problem is Firebird. Firebird will expect or give you a string encoded not according to the local environment but according to the character set that was specified for the database connection, and it is the API user that decides this, not the API. Ideally, the user specifies UTF8, but Firebird supports many other string encodings, although not UTF16 or Unicode at present. In the original version of the library, the API was defined using the "string" type, as were the internal structures. When I looked at moving to Delphi support, there was no way that this would work if "string" suddenly became "UnicodeString". All over the place I had assumed that "string" meant "AnsiString", including checking and setting the code page in order to match the connection character set with whatever code page was being used by the API user.
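
To illustrate the kind of code page check described above, here is a minimal sketch. It is not taken from the Firebird Pascal API: the constant ConnectionCodePage and the function ToConnectionEncoding are hypothetical names, and the sketch assumes a UTF8 connection character set and a compiler with code page aware AnsiStrings (FPC 3.0 or later, or Delphi 2009 or later).

```pascal
program CodePageSketch;

{$IFDEF FPC}{$mode delphi}{$ENDIF}

const
  // Hypothetical: the code page implied by the connection character set.
  // 65001 is the code page number for UTF-8.
  ConnectionCodePage = 65001;

// Returns a copy of S whose bytes and code page tag match the connection.
function ToConnectionEncoding(const S: AnsiString): RawByteString;
begin
  Result := S;
  if StringCodePage(Result) <> ConnectionCodePage then
    // True asks the RTL to convert the bytes, not just relabel the string.
    SetCodePage(Result, ConnectionCodePage, True);
end;

var
  Local: AnsiString;
  Wire: RawByteString;
begin
  Local := 'abc';
  Wire := ToConnectionEncoding(Local);
  WriteLn('code page sent to the server: ', StringCodePage(Wire));
end.
```

The check only makes sense while the string is an AnsiString or RawByteString; a UnicodeString carries no per-string code page, which is why this sort of code depends on knowing what "string" means.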

Could I have written the API without being aware of the character encoding? I doubt it. The connection character set is not something that the compiler can be aware of. Part of the role of the API library is to manage the character encoding on behalf of the user. On the other hand, by defining the API using the explicit AnsiString type, it should mean that if the API user uses the "string" type, then the compiler can automatically transliterate from the API to the API user's string types when string means "UnicodeString".
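
A minimal sketch of that last point, assuming FPC 3.0 or later or Delphi 2009 or later. EchoColumn is a hypothetical stand-in for an API function, and the caller's variable is declared as UnicodeString explicitly to mimic a compiler where "string" means "UnicodeString".

```pascal
program ExplicitApiSketch;

{$IFDEF FPC}{$mode delphi}{$ENDIF}

// Hypothetical API function. The explicit AnsiString type fixes the
// parameter and result types whatever "string" means to the caller.
function EchoColumn(const Value: AnsiString): AnsiString;
begin
  Result := Value;
end;

var
  Caller: UnicodeString; // what "string" means in Delphi 2009 and later
begin
  Caller := 'abc';
  // The cast makes the UnicodeString -> AnsiString conversion explicit;
  // the AnsiString result is converted back by the compiler on assignment.
  Caller := EchoColumn(AnsiString(Caller));
  WriteLn(Caller);
end.
```

The conversions are only lossless when the AnsiString code page can represent every character the caller supplies, so the API still has to know which code page it is working in.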

So is my messy example typical or atypical? Am I correct in offering it as a source of rules? Ideally, it is atypical. However, I would observe that few programs exist in isolation. They have to deal with external objects such as files, GUIs and TCP connections. The compiler cannot work out the character encoding for itself in these cases, and either your program or some intermediate library has to be character encoding aware in order to deal with these objects.

The bottom line is that it would be great if we never needed to be aware of the character encoding behind the string type. However, all too often you do and, because of that, when you are writing code that is portable between platforms and compilers, you need to be explicit about the string type, either throughout your program or at least in the modules that deal with external interfaces.

Tony Whyman