Then you don't understand it yet, I think.
Maybe...
If the compiler knows your source file is UTF-8 (by BOM or directive), the compiler generates a widestring constant and no conversion function is called when assigning to a widestring.
In my test the source file is not UTF-8 but UCS-2, and it has a correct BOM. The compiler handles that correctly and (it seems to be designed that way) converts the string constant in the source from UCS-2 to UTF-8, presumably giving the constant a temporary type like UTF8String. As the compiler is not aware of the type UTF8String and erroneously identifies it with ANSIString, the resulting code interprets the constant as an ANSIString, and the conversion to WideString produces wrong text.
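
To make the effect concrete, here is a minimal sketch of what happens when UTF-8 bytes are widened as if they were single-byte ANSI text. It assumes a Latin-1 ANSI code page, and the helper WidenAsLatin1 is hypothetical (it is not an RTL function); UTF8Decode is the System unit routine. It only illustrates the byte-level mix-up, not the compiler's actual code path.

```pascal
program mojibake;
{$mode objfpc}{$H+}

// Hypothetical helper: widen a byte string one byte at a time, the way
// an ANSI -> wide conversion would under a Latin-1 code page (assumption).
function WidenAsLatin1(const S: AnsiString): WideString;
var
  I: Integer;
begin
  SetLength(Result, Length(S));
  for I := 1 to Length(S) do
    Result[I] := WideChar(Ord(S[I]));
end;

var
  Utf8Bytes: AnsiString;
  Wrong, Right: WideString;
begin
  // UTF-8 encoding of 'ö2' is $C3 $B6 $32. The bytes are set one by one
  // so the result does not depend on the encoding of this source file.
  SetLength(Utf8Bytes, 3);
  Utf8Bytes[1] := Chr($C3);
  Utf8Bytes[2] := Chr($B6);
  Utf8Bytes[3] := Chr($32);

  Wrong := WidenAsLatin1(Utf8Bytes); // bytes misread as ANSI characters
  Right := UTF8Decode(Utf8Bytes);    // bytes read as UTF-8

  WriteLn(Length(Wrong)); // 3: U+00C3, U+00B6, U+0032
  WriteLn(Length(Right)); // 2: U+00F6, U+0032
end.
```

The three-character result is the kind of wrong text described above: the two bytes of the encoded 'ö' each become a character of their own.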

However, if you assign this constant to a utf8string, the compiler does a wide->ansi conversion, which is done according to the system code page, as the compiler does not know the difference between an ansistring and a utf8string. In this case you would need to utf8encode your widestring constant to get it into your ansistring in UTF-8 encoding.
I am aware of what is happening here, but this is in fact not the way a decent system should work.
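
For reference, the explicit conversion discussed here can be sketched as follows. This is an illustration only, using the System unit routines UTF8Encode and UTF8Decode; the wide string is built from code points rather than from a literal, so the sketch does not depend on how the source file is encoded or on how the compiler types string constants.

```pascal
program utf8roundtrip;
{$mode objfpc}{$H+}

var
  ws, back: WideString;
  us: UTF8String;
begin
  // Build 'ö2' from code points: U+00F6, U+0032.
  ws := WideChar($00F6);
  ws := ws + WideChar($0032);

  us := UTF8Encode(ws);   // explicit wide -> UTF-8, no system code page involved
  back := UTF8Decode(us); // explicit UTF-8 -> wide

  WriteLn(Length(ws));   // 2 (UTF-16 code units)
  WriteLn(Length(us));   // 3 (bytes: $C3 $B6 $32)
  WriteLn(Length(back)); // 2
end.
```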

A decent system should be able to do the necessary conversions automatically:

var
 ws1, ws2: widestring;
 us1, us2: utf8string;
begin
..
ws1 := 'ö2';
us1 := 'ü3';
ws2 := us1;
us2 := ws1;
memo1.lines.add(inttostr(length(ws1)));  // should show 2
memo1.lines.add(inttostr(length(ws2)));  // should show 2
memo1.lines.add(inttostr(length(us1))); // should show 3 (even if I would like 2, but counting in code units is much quicker)
memo1.lines.add(inttostr(length(us2)));  // should show 3
memo1.lines.add(ws1);  // should show ö2
memo1.lines.add(us2);  // should show ö2
memo1.lines.add(ws2);  // should show ü3
memo1.lines.add(us1);  // should show ü3

end;

This should hold independently of the encoding the source code is stored in (ANSI: no BOM and no other directive, UTF8: BOM=?, UCS2: BOM=FEFF, UCS4: BOM=?)
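
As a rough sketch of how the source encoding could be recognised from the first bytes of a file: the byte sequences below are the standard Unicode BOM encodings and are not taken from this thread, and the type TSourceEncoding and function DetectBom are hypothetical names, not part of the compiler or RTL.

```pascal
program bomsniff;
{$mode objfpc}{$H+}

type
  // Hypothetical enumeration for this sketch only.
  TSourceEncoding = (seNoBom, seUtf8, seUtf16LE, seUtf16BE, seUtf32LE, seUtf32BE);

// Inspect the leading bytes of a buffer and report which BOM, if any,
// it starts with. The 4-byte forms are tested first, because the
// UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
function DetectBom(const Buf: array of Byte): TSourceEncoding;
var
  N: Integer;
begin
  N := Length(Buf);
  Result := seNoBom;
  if (N >= 4) and (Buf[0] = $FF) and (Buf[1] = $FE) and (Buf[2] = $00) and (Buf[3] = $00) then
    Result := seUtf32LE
  else if (N >= 4) and (Buf[0] = $00) and (Buf[1] = $00) and (Buf[2] = $FE) and (Buf[3] = $FF) then
    Result := seUtf32BE
  else if (N >= 3) and (Buf[0] = $EF) and (Buf[1] = $BB) and (Buf[2] = $BF) then
    Result := seUtf8
  else if (N >= 2) and (Buf[0] = $FF) and (Buf[1] = $FE) then
    Result := seUtf16LE
  else if (N >= 2) and (Buf[0] = $FE) and (Buf[1] = $FF) then
    Result := seUtf16BE;
end;

begin
  WriteLn(Ord(DetectBom([$EF, $BB, $BF, $41])) = Ord(seUtf8));    // UTF-8 BOM
  WriteLn(Ord(DetectBom([$FF, $FE, $41, $00])) = Ord(seUtf16LE)); // UTF-16 LE BOM
  WriteLn(Ord(DetectBom([$41, $42])) = Ord(seNoBom));             // plain bytes, no BOM
end.
```

A file without a BOM cannot be told apart from ANSI this way, which is why a directive is needed for that case.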

-Michael
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
