Michael Van Canneyt wrote:

If you want a TStrings that can hold strings which may differ in their encoding (i.e. strings[0] has a different encoding from strings[1]) then you'll be left in the cold.

Just an idea:
What if FPC adds another encoding, similar to RawByteString ($FFFF), but without the Delphi quirks? Or simply fix the RawByteString flaws in the *Ansi* compiler and RTL?

1) A discussion in the Embarcadero groups showed that, when a RawByteString is assigned to another AnsiString type, the Delphi compiler should (but does not) check the string and, if necessary, convert it to the static encoding of the target. This missing conversion is (almost) the only way to create strings whose static and dynamic encodings differ.

2) The stupid conversion to CP_ACP in an assignment *to* a RawByteString should be dropped. This applies in particular to assignments to *function results*.

3) The declared function result type should be honored in functions that accept RawByteString parameters. The Delphi compiler seems to *assume* that the result of such a function is a RawByteString, so that (together with the aforementioned flaws) the outcome is a CP_ACP string, even if the declared function result is e.g. a UTF8String.

Test case:
  function conc(a,b: RawByteString): UTF8String;
  begin Result := a+b; end;
This gives the same result as
  function conc(a,b: RawByteString): RawByteString;
  begin Result := a+b; end;
In both cases the returned string has CP_ACP encoding :-(
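
To check this on a given compiler, the test case can be wrapped in a small program that prints the dynamic code page of each result. This is only a sketch under assumptions: the names ConcUtf8 and ConcRaw are made up here, because the two variants cannot share one name in a single program, and the values printed depend on the compiler, its version and the system code page, so no expected output is given.

```pascal
program ConcCodePage;
{$ifdef FPC}{$mode delphi}{$endif}

// Hypothetical names: ConcUtf8 and ConcRaw are the two variants of "conc"
// from the test case, renamed so that both fit into one program.
function ConcUtf8(const a, b: RawByteString): UTF8String;
begin
  Result := a + b;
end;

function ConcRaw(const a, b: RawByteString): RawByteString;
begin
  Result := a + b;
end;

var
  u1, u2: UTF8String;
begin
  u1 := 'abc';
  u2 := 'def';
  // StringCodePage reports the dynamic code page stored in the string header.
  Writeln('declared UTF8String result:    ', StringCodePage(ConcUtf8(u1, u2)));
  Writeln('declared RawByteString result: ', StringCodePage(ConcRaw(u1, u2)));
  // Reference values to compare the two lines above against.
  Writeln('CP_UTF8 = ', CP_UTF8,
          ', DefaultSystemCodePage = ', DefaultSystemCodePage);
end.
```

If the first line shows the default system code page rather than CP_UTF8, the declared result type was not honored.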


When these flaws are fixed in the FPC compiler, the AnsiString types will always have the same static and dynamic encoding, as it should be.

Then TStrings could be based on such RawByteStrings, without excess conversions or losses. Sorting (TStringList) should then probably ignore the dynamic encoding, i.e. work on a strictly binary (byte-by-byte) basis.
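
A strictly binary comparison of that kind could look roughly like the sketch below. CompareBytes is a hypothetical helper, not an existing RTL or TStringList routine; it assumes that passing a string to a const RawByteString parameter does not convert it, so only the raw bytes are compared and the dynamic encoding is ignored.

```pascal
program BinaryCompare;
{$ifdef FPC}{$mode delphi}{$endif}

// Hypothetical helper: orders two strings purely by their bytes.
// Returns <0, 0 or >0, like the usual compare functions.
function CompareBytes(const a, b: RawByteString): Integer;
var
  i, n: Integer;
begin
  // Only the common prefix can be compared byte by byte.
  n := Length(a);
  if Length(b) < n then
    n := Length(b);
  for i := 1 to n do
    if a[i] <> b[i] then
    begin
      Result := Ord(a[i]) - Ord(b[i]);
      Exit;
    end;
  // Equal prefix: the shorter string sorts first.
  Result := Length(a) - Length(b);
end;

var
  u: UTF8String;
  s: AnsiString;
begin
  u := 'abd';
  s := 'abc';
  Writeln(CompareBytes(s, u) < 0);    // differing encodings, bytes decide
  Writeln(CompareBytes(u, u) = 0);    // identical bytes compare equal
  Writeln(CompareBytes('ab', s) < 0); // a prefix sorts before the longer string
end.
```

Such an ordering is stable and cheap, but it is not a linguistic sort order; that trade-off is the price of ignoring the encoding.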

DoDi
