Re: [fpc-pascal] Unicode file routines proposal

Luca Olivetti Tue, 01 Jul 2008 00:35:49 -0700

Marco van de Voort wrote:

They have a UTF-16/UCS-2 internal representation, same as MSEgui, which works very well and is fast and handy, BTW.
And len, slicing, etc. work as expected.
Note that if you need characters beyond $ffff you have to compile it
with wide unicode support, and in that case every character will use 4
bytes.
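
To put rough numbers on the trade-off described above, here is a minimal sketch in Python. The sample string and variable names are only illustrative assumptions, not anything from the implementation being discussed. It compares the storage needed for the same text in UTF-16 and in UTF-32:

```python
# Illustrative sample: 'a', e-acute, euro sign, and U+1D11E (musical G clef),
# the last one being a character beyond $FFFF.
text = "a\u00e9\u20ac\U0001D11E"

utf16 = text.encode("utf-16-le")   # 2 bytes per code unit, no BOM
utf32 = text.encode("utf-32-le")   # 4 bytes per character, no BOM

code_points = len(text)            # number of characters (code points)
utf16_units = len(utf16) // 2      # number of 16-bit code units
utf16_bytes = len(utf16)
utf32_bytes = len(utf32)

print(code_points, utf16_units, utf16_bytes, utf32_bytes)
```

The character beyond $FFFF costs two 16-bit units in UTF-16 (a surrogate pair), while in UTF-32 every character costs 4 bytes, whether it needs them or not.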

That's IMHO a faulty system. It requires you to choose between an incomplete
solution or making strings a horrible memory hog.

OTOH, using variable-length characters will make string operations expensive (since you can't just multiply the index by 2 or 4 but have to examine the string from the beginning, and the length in bytes isn't the same as the length in characters).
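
The indexing cost can be sketched as follows, in Python for brevity. The helper names `utf16_units` and `code_point_at` are hypothetical, made up for this illustration, and the sketch assumes well-formed UTF-16 input. Finding the n-th character means walking the code units from the start, because a surrogate pair occupies two units:

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of `text` as a list of integers."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def code_point_at(units, index):
    """Return the code point at character position `index` (0-based).

    This is a linear scan: each step has to check whether the current
    unit starts a surrogate pair before it knows how far to advance.
    """
    i = 0      # position in code units
    pos = 0    # position in characters
    while i < len(units):
        unit = units[i]
        is_pair = (0xD800 <= unit <= 0xDBFF
                   and i + 1 < len(units)
                   and 0xDC00 <= units[i + 1] <= 0xDFFF)
        if is_pair:
            cp = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            width = 2
        else:
            cp = unit
            width = 1
        if pos == index:
            return cp
        pos += 1
        i += width
    raise IndexError("character index out of range")

units = utf16_units("a\U0001D11Eb")   # 'a', U+1D11E, 'b'
naive = units[2]                      # plain unit indexing lands inside the pair
third = code_point_at(units, 2)       # scanning finds 'b'
```

With a fixed-width representation the same lookup is a single array access, which is the speed advantage at stake here.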

But maybe that doesn't
matter for mere scripting languages (though I wonder then why they didn't
choose UTF-32 directly)

Surrogates are not nice, but they were invented for a reason.

Well, yes, they're a trade-off between performance and memory consumption, but I fear we're losing one of the advantages that Pascal has over C: fast and simple string handling.


Bye
--
Luca
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
