José Mejuto wrote:
> If no checks of UTF-8 integrity are performed, they should not be "a lot slower", only a bit slower, at least utf8pos; utf8copy is certainly slower.
I see no need for integrity checks when the procedures are called with reasonable arguments. Before e.g. Copy can be called, the required parameters have to be determined, and *this* is where using the appropriate functions automatically yields valid arguments.
> A different matter is that the current implementation is a bit overengineered, which adds some overhead. Is it logical/safe that UTF-8 functions do not check UTF-8 integrity? I'm talking about utf8pos, utf8copy, etc.
There is no need for a utf8pos function for use with utf8copy, since Pos already returns the correct start index for Copy. Only the count parameter needs different handling in utf8copy, and the byte count can be determined once, e.g. in a (UTF8)ByteCount function. Copy can then immediately allocate the requested number of bytes and move that many bytes. The ByteCount function is not needed when the end index is already known, e.g. from another Pos call.
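To make the idea concrete, here is a rough sketch of how this could look in Free Pascal. The name UTF8ByteCount and its signature are only an assumption for illustration, not an existing RTL or LCL routine. The helper assumes the string holds valid UTF-8 and that the start index lies on a code point boundary, so it deliberately performs no integrity checks. The sample text is written as byte escapes so that the source file encoding does not matter.

```pascal
program Utf8CopyDemo;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of the RTL): returns the number of bytes
// occupied by CharCount code points, starting at byte index StartByte.
// Assumes S is valid UTF-8 and StartByte is a code point boundary.
function UTF8ByteCount(const S: AnsiString; StartByte, CharCount: SizeInt): SizeInt;
var
  P: SizeInt;
begin
  P := StartByte;
  while (CharCount > 0) and (P <= Length(S)) do
  begin
    Inc(P);  // step over the lead byte
    // step over continuation bytes (bit pattern 10xxxxxx)
    while (P <= Length(S)) and ((Ord(S[P]) and $C0) = $80) do
      Inc(P);
    Dec(CharCount);
  end;
  Result := P - StartByte;
end;

const
  // "Gruesse, Welt" with u-umlaut (C3 BC) and sharp s (C3 9F) as UTF-8 bytes
  Sample = 'Gr'#$C3#$BC#$C3#$9F'e, Welt';

var
  S, Part: AnsiString;
  Start, Stop, N: SizeInt;
begin
  S := Sample;

  // Plain Pos works on bytes and already yields a valid start index for Copy.
  Start := Pos(#$C3#$BC, S);

  // Case 1: a count in characters is converted to a byte count once.
  N := UTF8ByteCount(S, Start, 2);
  Part := Copy(S, Start, N);
  WriteLn('start=', Start, ' bytes=', N, ' len=', Length(Part));

  // Case 2: the end index is known from another Pos call, no counting needed.
  Stop := Pos(',', S);
  Part := Copy(S, Start, Stop - Start);
  WriteLn('stop=', Stop, ' len=', Length(Part));
end.
```

In both cases the plain Copy does the allocation and the move; the only UTF-8 specific work is the single pass in the counting helper, and the second case needs no such pass at all.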
It would also help to ensure text integrity if indexed access to bytes/chars in (MBCS/UTF) strings were simply dropped. The coder would then have to choose either a different string type or different access methods.
DoDi
