Martin wrote:

Just to show what I want to do:

procedure foo(x: utf8string); begin end;

var a: string; // AnsiString, but already contains UTF-8

The encoding will be stored or converted when a string is assigned to that variable. When the FPC implementation is finished, it should be impossible to have strings stored with a wrong encoding.

foo(a); // do not convert

Why not?


And what happens if an app reads data from some external source (serial port) and then wants to declare which encoding it is in?
http://docwiki.embarcadero.com/VCL/en/System.SetCodePage


I hadn't seen that.

That may help. Though not the best solution...

It does *not* help, because SetCodePage performs a string *conversion* whenever it actually changes the encoding. Delphi at one point even allowed converting between UTF-16 (CP 1200) and other (byte-oriented) encodings, but later disallowed such in-place conversions again. Now a UTF-16 (Delphi default) string is *always* converted when it is passed to a subroutine expecting a RawByteString argument.

I can call it before calling the "foo" proc. But I must revert it afterwards; otherwise, at some later point, the string will be translated when it is used as a normal string again (although it is expected to remain UTF-8).

IMO the only way to fix a wrong encoding is a TBytes (or similar) buffer: copy the string content into it (without translation), then read it back specifying the correct encoding.
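
A minimal sketch of that buffer round trip, assuming a compiler with code-page-aware AnsiStrings (FPC 3.0 or later is assumed here). RetagAsUtf8 is a hypothetical helper name, not an RTL routine; it relies on SetString tagging the result with the code page of the declared destination type, which should be verified on the compiler actually in use.

```pascal
program RetagDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: copies the raw bytes of a string into a TBytes
// buffer and reads them back as UTF-8, without translating them.
function RetagAsUtf8(const Raw: RawByteString): UTF8String;
var
  Buf: TBytes;
begin
  Result := '';
  SetLength(Buf, Length(Raw));
  if Length(Buf) = 0 then
    Exit;
  // A RawByteString parameter accepts any AnsiString without conversion,
  // so this is a plain byte-for-byte copy.
  Move(Raw[1], Buf[0], Length(Buf));
  // Assumption: SetString tags Result with the code page of its declared
  // type (UTF8String) and copies the bytes unchanged.
  SetString(Result, PAnsiChar(@Buf[0]), Length(Buf));
end;

var
  a: AnsiString;   // declared as a plain AnsiString, but holds UTF-8 bytes
  u: UTF8String;
begin
  SetLength(a, 2);
  a[1] := #$C3;    // the two bytes of U+00E4 in UTF-8
  a[2] := #$A4;
  u := RetagAsUtf8(a);
  WriteLn('bytes: ', Length(u), ', code page: ', StringCodePage(u));
end.
```

For what it is worth, the SetCodePage routine in Delphi 2009 and in FPC 3.0 takes a third Convert parameter; passing False is documented to change only the code page tag, which would make the intermediate buffer unnecessary.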

Yes, I know that what I want to do is not what it was designed for. Ultimately a huge update to the entire source will be needed... but for now I need a temporary solution until then.

You don't need a temporary solution until the new strings are fully implemented in FPC. Afterwards you only have to take care when reading strings from *external* sources, where you have to specify the correct external encoding - see e.g. http://docwiki.embarcadero.com/VCL/en/Classes.TStrings.LoadFromStream
with the added Encoding argument.

When you want a variable to contain strings of a specific encoding, e.g. UTF-8, you simply give it the appropriate type. I assume that a UTF8String type will be declared like AnsiString<cpUTF8>, with appropriate constants being declared for the standard codepages.
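
For comparison, Delphi 2009 spells such a declaration with parentheses rather than angle brackets (type UTF8String = type AnsiString(65001)), and FPC 3.0 uses the same syntax. The sketch below assumes such a compiler; Latin1String is a hypothetical type name introduced only for illustration.

```pascal
program TypedStrings;
{$mode objfpc}{$H+}
uses
  {$IFDEF UNIX}cwstring,{$ENDIF}  // FPC needs a widestring manager on Unix
  SysUtils;

type
  // Hypothetical example type: an AnsiString fixed to ISO 8859-1.
  Latin1String = type AnsiString(28591);

var
  u: UTF8String;
  l: Latin1String;
begin
  SetLength(u, 2);
  u[1] := #$C3;   // the two bytes of U+00E4 in UTF-8
  u[2] := #$A4;
  // The declared code pages differ, so the compiler inserts a
  // conversion on this assignment.
  l := u;
  WriteLn('UTF-8 bytes: ', Length(u), ', Latin-1 bytes: ', Length(l));
end.
```

The point of the design is that the encoding lives in the variable's type, so the conversion happens at the assignment and never has to be requested by hand.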

DoDi

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel