On 01/11/2011 05:50 PM, Hans-Peter Diettrich wrote:
> Since the generic Delphi "string" type can be any Unicode encoding now,

This

> From what I read I understand that the dynamically coded string type
> can hold 1-, 2-, and 4-byte (maybe even more) codes for its elements
> (denoted in one control value), each of those (theoretically) in
> different coding schemes (denoted in another control value), allowing
> e.g. for UTF-8, UTF-16, UCS4, German ANSI, raw byte strings, ...

is what I (not owning a Delphi > 2007) thought, too, and have been
bashed for.

But the document "Delphi and Unicode" by Marco Cantu
( http://edn.embarcadero.com/article/images/38980/Delphi_and_Unicode.pdf ),
dated Nov. 2008, in fact states:

length, the second element is the reference count. In Delphi 2009 the
representation for reference-counted strings becomes:

    -12         -10         -8          -4       String reference address
    Code page   Elem size   Ref count   length   First char of string

Beside the length and reference count, the new fields represent the
element size and the code page. While the element size is used to
discriminate between AnsiString and UnicodeString, the code page makes
sense in particular for the AnsiString type (as it works in Delphi 2009),
as the UnicodeString type has the fixed code page 1200.

A corresponding support data structure is declared in the implementation
section of System unit as:

    type
      PStrRec = ^StrRec;
      StrRec = packed record
        codePage: Word;
        elemSize: Word;
        refCnt: Longint;
        length: Longint;
      end;

But maybe the document is outdated.
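
The quoted offsets (-12, -10, -8, -4) can at least be cross-checked against the quoted record. What follows is only a minimal sketch: it re-declares StrRec exactly as cited above and computes where each field would sit relative to the first character, assuming the header lies directly in front of the string data. It does not inspect a real string, so it says nothing about what a particular Delphi or FPC version actually stores there. The program name and the Hdr, Base and Data variables are mine, not from the document.

```pascal
program StrRecOffsets;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

type
  PStrRec = ^StrRec;
  { Copied from the quoted document; "packed" means no padding between fields. }
  StrRec = packed record
    codePage: Word;
    elemSize: Word;
    refCnt: Longint;
    length: Longint;
  end;

var
  Hdr: StrRec;      { stand-in header, only its field addresses are used }
  Base: PAnsiChar;  { address of the first header byte }
  Data: PAnsiChar;  { assumed address of the first character (end of header) }
begin
  Base := PAnsiChar(@Hdr);
  Data := Base + SizeOf(StrRec);

  Writeln('header size: ', SizeOf(StrRec));                     { 12 }
  Writeln('codePage at  ', PAnsiChar(@Hdr.codePage) - Data);    { -12 }
  Writeln('elemSize at  ', PAnsiChar(@Hdr.elemSize) - Data);    { -10 }
  Writeln('refCnt at    ', PAnsiChar(@Hdr.refCnt) - Data);      { -8 }
  Writeln('length at    ', PAnsiChar(@Hdr.length) - Data);      { -4 }
end.
```

So the table and the record in the document agree with each other: two Words plus two Longints make a 12-byte header in front of the first character.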
-Michael
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel