On Sun, Sep 18, 2011 at 11:45 AM, Jonas Maebe <jonas.ma...@elis.ugent.be> wrote:
>
> On 18 Sep 2011, at 13:57, Flávio Etrusco wrote:
>
>> One obvious way to mitigate this would be to store the last
>> CodePoint->Char in the string record, so that at least the most common
>> case is covered.
>
> ... and so that the common case is broken in multithreaded environments.
>
> Directly indexing a string will most likely always work using fixed-length
> steps (8, 16, 32 bit).
> If you want to iterate based on anything else (such as code points), use some
> kind of iterator model instead.
>
> Jonas

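As a rough illustration of the distinction drawn in the quoted text above (fixed-width indexing versus iterating by code point), here is a minimal sketch. It is written in Python purely for illustration; the thread itself concerns Free Pascal's UnicodeString, and none of the helper names below (`utf16_units`, `iter_code_points`, `code_point_at`) exist in any RTL. The string is modeled as a list of 16-bit code units, which is an assumption about the storage format.

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints (hypothetical helper)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def iter_code_points(units):
    """Yield (unit_index, code_point) pairs, combining surrogate pairs.

    This is the "iterator model": one linear pass, O(1) work per step.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            yield i, cp
            i += 2
        else:
            yield i, u
            i += 1


def code_point_at(units, k):
    """Return the k-th code point (0-based) by scanning from the start.

    Correct, but O(n) per call, so a loop over all k is O(n^2).
    """
    for pos, (_, cp) in enumerate(iter_code_points(units)):
        if pos == k:
            return cp
    raise IndexError(k)


# "a", U+1D11E (outside the BMP, so two code units), "b"
units = utf16_units("a\U0001D11Eb")
```

Indexing `units` directly always moves in fixed 16-bit steps, so `units[1]` is half of a surrogate pair rather than a character. `code_point_at` gives the correct answer at the cost of a scan from the start, which is the worst case that a cached last position would try to avoid.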
By "the most common case" I meant non-threaded ;-)
But no, I don't see any trivial and efficient solution that avoids the worst case (though among threadvars, a per-string fixed lookup table, shared lookup caches, per-reference data (as with Object), etc., there must be a good one).

Basically, I think UnicodeString should move farther away from PChar than AnsiString is, from the compiler/RTL point of view.
I think the user should have to use the iterator model to *efficiently* iterate over the string. I see indexed access as a compatibility feature, and as such it should favor correctness and ease of use over performance.
I thought the endless bugs WRT char vs. code point indexes, even in Java-developed software, would support my argument...

-Flávio
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel