Re: [fpc-devel] Performance of string handling in trunk

Michael Schnell Wed, 26 Jun 2013 05:00:49 -0700

BTW.

I think the implementation would be quite easy, straightforward, fast and compatible.


- The compiler knows the static encoding type of each string variable.

- The dynamic encoding type of a String is preset to the static encoding type when the string is allocated.

- Only RawByteStrings (EncodingType $FFFF) are allowed to change their dynamic encoding type; with other Strings this will lead to unpredictable results.



When Strings are assigned:

- If the static encoding type of source and target is identical (be it normal or RAW) (already checked by the compiler) -> the same happens as with the pre-Unicode compiler (setting the pointer to the StringRecord and managing the RefCount)

otherwise:

- If the target is statically defined as RawByteString (already checked by the compiler) -> the same happens

- If the source is statically defined as RawByteString (already checked by the compiler), code is implemented that checks if the dynamic encoding of the source is identical to the (known to the compiler) static encoding type of the target -> the same happens

Otherwise the conversion library is called. It checks the _dynamic_ encoding type of source and target (thus it only needs to be provided with the Strings themselves and no additional information generated by the compiler) and does the conversion appropriately.
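The assignment rules above can be sketched as a small decision function. This is only a model of the proposal, not compiler or RTL code: the names CP_RAW, TAssignAction and PlanAssign are made up for illustration, and the code pages 65001 (UTF-8) and 1252 are arbitrary examples.

```pascal
program AssignModel;
{$mode objfpc}{$H+}

const
  CP_RAW = $FFFF;  // the RawByteString marker from the text (name is hypothetical)

type
  // Hypothetical outcome names, for illustration only.
  TAssignAction = (aaShare, aaShareAfterRuntimeCheck, aaConvert);

const
  ActionName: array[TAssignAction] of string =
    ('share pointer', 'share pointer after run-time check', 'call conversion library');

// StaticSrc and StaticDst are the encodings known to the compiler.
// DynSrc is the encoding stored in the source string at run time.
function PlanAssign(StaticSrc, StaticDst, DynSrc: Word): TAssignAction;
begin
  if StaticSrc = StaticDst then
    Result := aaShare                    // same static type: pointer copy + RefCount
  else if StaticDst = CP_RAW then
    Result := aaShare                    // raw target takes the payload unchanged
  else if (StaticSrc = CP_RAW) and (DynSrc = StaticDst) then
    Result := aaShareAfterRuntimeCheck   // raw source already holds the target encoding
  else
    Result := aaConvert;                 // everything else goes through conversion
end;

begin
  WriteLn(ActionName[PlanAssign(65001, 65001, 65001)]);  // identical static types
  WriteLn(ActionName[PlanAssign(1252, CP_RAW, 1252)]);   // raw target
  WriteLn(ActionName[PlanAssign(CP_RAW, 65001, 65001)]); // raw source, dynamic match
  WriteLn(ActionName[PlanAssign(CP_RAW, 65001, 1252)]);  // raw source, dynamic mismatch
  WriteLn(ActionName[PlanAssign(1252, 65001, 1252)]);    // two different fixed types
end.
```

Only the third branch needs a run-time test; the first two are decided entirely at compile time, which is where the "fast" in the claim comes from.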

When doing an operation on two Strings (such as "+" or a comparison), one of the operands is (virtually) copied to a String with the same encoding type as the other.


Here:

- If one operand is a RawByteString, use the (static or dynamic) encoding of the other.

- If both are RawByteStrings, use the dynamic encoding of one of them (presumably this is not a separate case from the one before).

If the conversion library sees a dynamic encoding type of $FFFF for either source or target it will fail and issue an exception.
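A rough model of the operand rule and of the $FFFF failure case might look like this. All identifiers (TOperand, CommonEncoding, CheckConvertible, EModelEncoding) are hypothetical. The text does not say which operand wins when neither is a RawByteString or when both are; picking the left one here is purely an assumption.

```pascal
program OperandModel;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  CP_RAW = $FFFF;  // RawByteString marker (name is hypothetical)

type
  EModelEncoding = class(Exception);  // hypothetical exception type

  TOperand = record
    StaticCP: Word;  // encoding known to the compiler
    DynCP: Word;     // encoding stored in the string at run time
  end;

// Encoding both operands are brought to before "+" or a comparison.
function CommonEncoding(const A, B: TOperand): Word;
begin
  if (A.StaticCP = CP_RAW) and (B.StaticCP = CP_RAW) then
    Result := A.DynCP      // both raw: dynamic encoding of one of them (left: assumption)
  else if A.StaticCP = CP_RAW then
    Result := B.StaticCP   // only A is raw: use the encoding of the other
  else
    Result := A.StaticCP;  // B raw, or neither raw (left wins: assumption)
end;

// Models the conversion library refusing a dynamic encoding of $FFFF.
procedure CheckConvertible(SrcDyn, DstDyn: Word);
begin
  if (SrcDyn = CP_RAW) or (DstDyn = CP_RAW) then
    raise EModelEncoding.Create('dynamic encoding $FFFF cannot be converted');
end;

var
  Raw, Utf8: TOperand;
begin
  Raw.StaticCP := CP_RAW;  Raw.DynCP := 1252;
  Utf8.StaticCP := 65001;  Utf8.DynCP := 65001;

  WriteLn(CommonEncoding(Raw, Utf8));  // raw + fixed: the fixed operand decides
  WriteLn(CommonEncoding(Raw, Raw));   // raw + raw: dynamic encoding of the left one

  try
    CheckConvertible(CP_RAW, 65001);
  except
    on E: EModelEncoding do
      WriteLn('conversion refused: ', E.Message);
  end;
end.
```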

IMHO it makes much more sense to implement things like TStringList on the basis of RawByteString: when it is based on the default System encoding, there will be a dual conversion when using it with any other encoding type.
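To make the "dual conversion" point concrete, here is a tiny model that counts the conversions in a store-then-read round trip through a container. RoundTripConversions is a made-up helper, and the code pages are arbitrary examples; it models the argument only, not the real TStringList.

```pascal
program ContainerModel;
{$mode objfpc}{$H+}

const
  CP_RAW = $FFFF;  // RawByteString marker (name is hypothetical)

// Conversions needed when a caller stores a string in a container
// and later reads it back into a variable of the same encoding.
function RoundTripConversions(CallerCP, ContainerCP: Word): Integer;
begin
  if (ContainerCP = CP_RAW) or (ContainerCP = CallerCP) then
    Result := 0   // payload is stored and returned unchanged
  else
    Result := 2;  // once on the way in, once on the way out
end;

begin
  // UTF-8 caller, container fixed to a system encoding of 1252
  WriteLn(RoundTripConversions(65001, 1252));
  // UTF-8 caller, container based on RawByteString
  WriteLn(RoundTripConversions(65001, CP_RAW));
end.
```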

IMHO big, commonly used, arch-independent, not super high-performance libraries (like LCL) should use RawByteString in their user interface and internally as widely as possible, so that conversions are avoided whenever possible (e.g. when the user's call provides a string and during the work in the library it is decided that it is not actually used).


-Michael (the weird one)

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
