Re: [Lazarus] UTF8String and UTF8Delete

Sven Barth Sat, 12 Dec 2015 09:21:05 -0800

On 12.12.2015 12:46, Jürgen Hestermann wrote:

On 2015-12-11 at 19:14, Sven Barth wrote:
 > Windows uses multi byte strings (one byte per character or more)
 > and UTF-16 (which is mostly 2 Byte and 4 for surrogate pairs).
 > The functions WideCharToMultiByte and MultiByteToWideChar which
 > are also used inside FPC for string conversions both take a
 > CodePage parameter that can also be CP_UTF8.


As far as I know, (current) Windows versions only use UTF16 internally.
But it provides the old legacy ANSI functions too (which convert to UTF16).
MultiByteToWideChar and WideCharToMultiByte are just helper
functions to convert from arbitrary encodings to UTF16 (and back).
But UTF-8 is not used anywhere internally in Windows (not even ANSI anymore,
except the legacy functions which convert to and from UTF16) and
you cannot use UTF8 as string encoding for WIN API functions.
Otherwise we would not have this problem and could use UTF-8 as
a standard for everything.

Yes, internally Windows uses UTF-16, but if you set your Windows Ansi
code page or at least the current thread's locale to UTF-8 (indirectly
by choosing a locale that has UTF-8 as code page, I don't know one right
now though) then the *A functions *do* work with UTF-8, simply because
they use the current locale's code page to convert from Ansi to Unicode
and in this case Ansi includes UTF-8.
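
To illustrate how the code page decides the way the bytes are read, here
is a minimal sketch in Python. It does not call the Windows API: the
helper ansi_to_wide is a hypothetical stand-in for what an *A function
does conceptually (decode the input with a code page, pass on UTF-16),
and cp1252 is only an assumed example of a legacy Ansi code page.

```python
def ansi_to_wide(data: bytes, code_page: str) -> bytes:
    """Hypothetical helper: decode bytes with a code page, return UTF-16LE."""
    return data.decode(code_page).encode("utf-16-le")


# The same UTF-8 encoded input bytes in both cases.
raw = "Jürgen".encode("utf-8")

# Code page is UTF-8: the two bytes of "ü" are read as one character.
as_utf8 = ansi_to_wide(raw, "utf-8")

# Code page is cp1252: each byte of "ü" is read as a separate character,
# so the result is garbled (7 characters instead of 6).
as_cp1252 = ansi_to_wide(raw, "cp1252")

# UTF-16 size per character: 2 bytes inside the BMP, 4 bytes for a
# character that needs a surrogate pair (here U+1F600).
bmp_size = len("ü".encode("utf-16-le"))
surrogate_size = len("\U0001F600".encode("utf-16-le"))
```

With the UTF-8 code page the text survives the round trip unchanged,
while with cp1252 the same bytes come out as mojibake. That is the only
difference the code page makes in this sketch.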


Regards,
Sven


--
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus.freepascal.org
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
