Re: [fpc-pascal] Console Encoding in Windows (Local VS. UTF8)

Michael Schnell Tue, 30 Jul 2013 00:57:20 -0700

On 07/30/2013 04:29 AM, Noah Silva wrote:

No, UTF16 only needs more memory if most of the text is ASCII. It actually uses less than UTF8 in the average case for Japanese, for example.

Of course you are right here.
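
To put rough numbers on the size argument, here is a minimal Python sketch that compares the encoded length of an ASCII string and a Japanese string. The sample strings and the helper name encoded_sizes are illustrative assumptions, not something from this thread, and real text mixes both kinds of characters.

```python
# Sample strings are illustrative assumptions, not taken from the thread.
SAMPLES = {
    "ascii": "Hello, world",
    "japanese": "こんにちは世界",
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, both without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        utf8, utf16 = encoded_sizes(text)
        print(f"{name}: UTF-8 = {utf8} bytes, UTF-16 = {utf16} bytes")
```

For pure ASCII, UTF-16 takes twice the space of UTF-8; for kana and common kanji, UTF-8 needs three bytes per character where UTF-16 needs two.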


    Linux OS API in most cases is 8 Bit,


I assume by 8bit, you mean variable byte encoding like UTF8.

Yep.

    Conversions are very expensive.
This is not as bad as some people make it out to be. You have to be converting a *lot* of data for it to be noticeable.

That is why I pointed out that the choice of encoding depends on how much "calculation" is done on the strings.

But in fact I tend to agree, although exactly this was the argument for why the Lazarus team, when converting to Unicode, chose to do the LCL API in UTF-8 (while MSE chose UTF-16 for the same purpose). I never felt comfortable with that, BTW.

> I suppose this is bound to change once fpc has completed the move to "new Delphi Strings".
I really don't think so, the reasons are even well detailed in the Wiki.

I was always told that Delphi compatibility is the primary driving force for any modifications. This necessarily suggests this move (which is not possible before fpc provides "new Delphi Strings"). But there might be multiple opinions.

In fact my primary intentions with Lazarus / fpc are not to do my own generic projects, but to help my colleagues to move their huge Delphi XE program system to Linux. This in fact needs complete support for "new Delphi Strings".

From what I understand, the plan is for strings to store their codepage as an attribute internally along with their length, and since the compiler/runtime library will know their codepage, it can convert as necessary.

That is already usable in svn and is exactly the said "new Delphi Strings", and - when activated - completely compatible with Delphi XE. It's rather nice and fast, but Delphi lacks a _completely_dynamic_encoding_ type with auto-conversion only when necessary. (IMHO rather easily doable by compiler magic, but "forgotten" in Delphi XE.)
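
The idea of a string that carries its own encoding and converts only when necessary can be sketched outside Pascal. The Python class below is a hypothetical illustration of that idea only: TaggedString and its method names are invented here, and it does not reflect how fpc or Delphi actually implement their string types.

```python
import codecs


class TaggedString:
    """Byte payload plus the name of its encoding (hypothetical type)."""

    def __init__(self, data: bytes, encoding: str):
        self.data = data
        self.encoding = encoding

    def to(self, encoding: str) -> "TaggedString":
        """Return this string in the requested encoding."""
        # Compare canonical codec names so "UTF8" and "utf-8" count as equal.
        if codecs.lookup(encoding).name == codecs.lookup(self.encoding).name:
            return self  # same encoding: no conversion, no copy
        text = self.data.decode(self.encoding)
        return TaggedString(text.encode(encoding), encoding)

    def __add__(self, other: "TaggedString") -> "TaggedString":
        # The right operand is converted only if its encoding differs.
        other = other.to(self.encoding)
        return TaggedString(self.data + other.data, self.encoding)

    def __str__(self) -> str:
        return self.data.decode(self.encoding)


if __name__ == "__main__":
    # BOM-less encodings are used so that raw concatenation stays valid.
    a = TaggedString("abc".encode("utf-8"), "utf-8")
    b = TaggedString("日本".encode("utf-16-le"), "utf-16-le")
    c = a + b
    print(c.encoding, str(c))
```

Here the result simply takes the encoding of the left operand; a real implementation would need a policy for choosing the target encoding.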

Either way, you can make your own StringList variants for each type easily enough.

Not without compiler support (if you want auto-conversion when necessary).

In fact, I am fine with manual conversions, so long as 99% of everything "just works" with UTF8 and/or UTF16.

I'm not fine with TStringList and friends forcing any predefined encoding. This in fact does work rather nicely without the application programmer even noticing it. But IMHO a cross-platform system like fpc can be expected to do better, doing away with Windows-ish remnants from Delphi whenever possible.


-Michael

_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
