On 07 Jan 2014, at 15:35, Hans-Peter Diettrich wrote:

> 1) In a discussion in the Embarcadero groups it turned out that, in an 
> assignment of a RawByteString to another AnsiString type, the Delphi compiler 
> should (but does not) check and eventually convert the string to the static 
> encoding of the target. This is (almost) the only way to create strings with 
> a different static and dynamic encoding.
> 
> 2) The stupid conversion to CP_ACP in an assignment *to* a RawByteString 
> should be dropped. This applies in particular to the assignment to *function 
> results*.

The conversion does not happen for all assignments; it only happens for 
concatenations that are assigned to RawByteString. And even then it doesn't 
always happen. Please read the wiki page I wrote (trying to prevent exactly 
this kind of wrong statement from being repeated further, and obviously 
failing). I even mentioned that we will probably add a way to change the 
behaviour in this specific case.
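
A minimal sketch for observing the concatenation case described above. It 
assumes an FPC version with code page aware AnsiStrings (2.7.1 or later) and 
the rule as documented on the FPC wiki: a concatenation assigned to a 
RawByteString keeps the operands' code page when they all share the same 
dynamic code page, and is converted to CP_ACP (DefaultSystemCodePage) 
otherwise. The type name TCP1252String and the variable names are made up for 
this illustration. The program only prints the code pages it finds, so the rule 
can be checked rather than taken on faith.

```pascal
program rawconcat;
{$mode objfpc}{$H+}
{$ifdef unix}
uses
  cwstring; // widestring manager, needed for code page conversions on Unix
{$endif}

type
  // Hypothetical helper type: an AnsiString with static code page 1252
  TCP1252String = type AnsiString(1252);

var
  u1, u2: UTF8String;
  w: TCP1252String;
  r: RawByteString;
begin
  u1 := 'abc';
  u2 := 'def';
  w := 'ghi';

  WriteLn('DefaultSystemCodePage:          ', DefaultSystemCodePage);

  // Both operands have dynamic code page CP_UTF8
  r := u1 + u2;
  WriteLn('UTF8String + UTF8String:        ', StringCodePage(r));

  // Operands with different dynamic code pages
  r := u1 + w;
  WriteLn('UTF8String + AnsiString(1252):  ', StringCodePage(r));
end.
```

On systems where DefaultSystemCodePage is itself UTF-8 (common on Linux), the 
two results may print the same number even if a conversion took place, so the 
difference is easiest to see where the system code page is not UTF-8.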

> 3) The function result type should be honored, in functions accepting 
> RawByteString parameters. The Delphi compiler seems to *assume* that the 
> result of such functions is RawByteString, so that (including the 
> aforementioned flaws) the outcome is a CP_ACP string, even if the declared 
> function result is e.g. a UTF8String.
> 
> Test case:
>  function conc(a,b: RawByteString): UTF8String;
>  begin Result := a+b; end;

This will always return CP_UTF8 on FPC. Does it really return CP_ACP on Delphi? 
Even if it does, I doubt we will change that. We couldn't even do that easily, 
because we don't know the static code pages of the strings that are 
concatenated inside the RTL routine that handles this.
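
The quoted test case can be wrapped into a complete program to check this. 
This is only a sketch: the program name, the variable s and the literal 
arguments are assumptions added for illustration, and it needs an FPC version 
with code page aware AnsiStrings. According to the statement above, the 
reported code page on FPC should be CP_UTF8 (65001).

```pascal
program conctest;
{$mode objfpc}{$H+}
{$ifdef unix}
uses
  cwstring; // widestring manager, needed for code page conversions on Unix
{$endif}

// The test case from the quoted message
function conc(a, b: RawByteString): UTF8String;
begin
  Result := a + b;
end;

var
  s: UTF8String;
begin
  s := conc('foo', 'bar');
  WriteLn('Result:            ', s);
  WriteLn('Dynamic code page: ', StringCodePage(s));
  if StringCodePage(s) = CP_UTF8 then
    WriteLn('Result is CP_UTF8')
  else
    WriteLn('Result is NOT CP_UTF8');
end.
```

Compiling the same source with Delphi would answer the question of whether it 
really returns CP_ACP there.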

> Then TStrings could be based on such RawByteStrings, without excess 
> conversions or losses.

The problem with changing TStrings from AnsiString to RawByteString is not so 
much related to the behaviour of RawByteString, but more to descendant 
classes in existing third-party (= user) source code that override methods 
using AnsiString parameters. We don't want to force everyone to rewrite their 
code so it uses RawByteString (if anything, RawByteString should probably be 
used as little as possible in user code, because always correctly dealing with 
all possible code pages is very hard).

> Sorting (TStringList) eventually should ignore the dynamic encoding, i.e. 
> work on a strictly binary (byte-by-byte) base.

Looking for just one second at the definition of the Sort methods of 
TStringList (and TStrings) would have prevented you from writing the above 
statement, which does not make any sense whatsoever (unless you want the 
compiler to start changing all code where a programmer passes a comparison 
function that does take code pages into account to the Sort methods of 
TStrings/TStringList).
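
For reference, a small sketch of how a caller chooses the comparison itself. 
It assumes the Classes unit's TStringList.CustomSort method, which takes a 
function of type TStringListSortCompare; the function name CompareBytes and 
the sample data are made up for this example. Here the caller opts into an 
ordinal, byte-by-byte comparison through SysUtils.CompareStr, but it could 
just as well pass a function that takes code pages into account.

```pascal
program bytesort;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

// Hypothetical comparison callback: orders entries by their byte values,
// without looking at the strings' code pages
function CompareBytes(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareStr(List[Index1], List[Index2]);
end;

var
  sl: TStringList;
  i: Integer;
begin
  sl := TStringList.Create;
  try
    sl.Add('b');
    sl.Add('B');
    sl.Add('a');
    // The ordering is decided by the function passed in by the caller
    sl.CustomSort(@CompareBytes);
    for i := 0 to sl.Count - 1 do
      WriteLn(sl[i]);
  finally
    sl.Free;
  end;
end.
```

With an ordinal comparison, uppercase ASCII letters sort before lowercase 
ones, so the choice of callback is visible directly in the output order.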


Jonas
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
