----- Original Message ----- From: "Felipe Monteiro de Carvalho" <[EMAIL PROTECTED]>

On 11/17/05, Mattias Gaertner <[EMAIL PROTECTED]> wrote:
> Speaking for lazarus: we want to support the whole unicode and UTF8 is the
> easiest to achieve that.

I particularly like this solution.

* It doesn't break existing code
* It makes it easy to make a program unicode. Just change the encoding to utf-8!
* It includes no overhead to existing apps

Only a few RTL functions would have to be created to support utf-8, as
has been said here. Many functions could remain the same.

The programmer should know what he is using and take the necessary
precautions (use special utf-8 functions, for example, if needed)
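To illustrate what such a "special utf-8 function" might look like, here is a minimal sketch of a character count for UTF-8 data held in a regular AnsiString. The name Utf8CodePointCount is hypothetical, not an existing RTL function, and the sketch assumes the input is well-formed UTF-8. It counts every byte that is not a continuation byte (10xxxxxx), while the ordinary Length keeps returning the byte count.

```pascal
program Utf8LengthSketch;
{$mode objfpc}{$H+}

{ Hypothetical helper, not part of the RTL: counts code points in a
  string assumed to hold well-formed UTF-8. }
function Utf8CodePointCount(const S: AnsiString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    { Continuation bytes have the bit pattern 10xxxxxx; skip them. }
    if (Ord(S[I]) and $C0) <> $80 then
      Inc(Result);
end;

var
  S: AnsiString;
begin
  { "caf" followed by U+00E9, written as its two UTF-8 bytes so the
    example does not depend on the encoding of the source file. }
  S := 'caf'#$C3#$A9;
  WriteLn('Bytes:       ', Length(S));              { 5 }
  WriteLn('Code points: ', Utf8CodePointCount(S));  { 4 }
end.
```

Functions that only copy, concatenate or compare whole strings need no such variant, which is why many RTL functions could stay as they are.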

I think RTL functions can use UTF-8 stored in regular ansistrings. But the string type name must be different from String or AnsiString. It should be named something like Utf8String or StringUtf8 to indicate that the function or procedure is UTF-8 aware.

Also, conversion between WideString and Utf8String needs to be transparent (automatic UTF-16 (UCS-2) <-> UTF-8 conversion).

This will allow assigning a Utf8String to a WideString and performing indexing or per-character operations easily if needed. A developer can also work only with WideStrings in his application and pass them to functions that accept Utf8String, and that will work well (in non-time-critical parts of the code, of course).
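A rough sketch of the round trip described above, using the explicit RTL functions UTF8Encode and UTF8Decode in place of the automatic conversion being proposed. The sample text and variable names are only for illustration. Note that indexing a WideString yields UTF-16 code units, so "one index per character" holds only for characters in the Basic Multilingual Plane.

```pascal
program Utf8WideSketch;
{$mode objfpc}{$H+}

var
  W: WideString;
  U: UTF8String;
begin
  { Build "caf" + U+00E9 as a WideString: four UTF-16 code units. }
  W := 'caf';
  W := W + WideChar($00E9);

  { Explicit conversion; the proposal is that an assignment would
    perform this step automatically. }
  U := UTF8Encode(W);

  WriteLn('WideString length: ', Length(W));   { 4 code units }
  WriteLn('Utf8String length: ', Length(U));   { 5 bytes }

  { Indexing the UTF-8 data lands on a lead byte, not a character. }
  WriteLn('U[4] = ', Ord(U[4]));               { 195 = $C3 }

  { After converting back, index 4 is the whole character. }
  W := UTF8Decode(U);
  WriteLn('W[4] = ', Ord(W[4]));               { 233 = $E9 }
end.
```

Each conversion allocates and copies the string, which is why this pattern fits only the non-time-critical parts of the code.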

Yury Sidorov.

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel