Looking for a C library that converts UTF-8 strings from their decomposed to pre-composed form

Tay, William Mon, 08 Nov 2004 09:37:01 -0800

Hi,

It seems that accented characters generated on Mac OS X are represented in decomposed UTF-8 form. For example, the character é is represented as 65 cc 81 instead of c3 a9 (the pre-composed form), and the character ズ is represented as e3 82 b9 e3 82 99 instead of e3 82 ba. My Solaris application needs to process these characters when they come from Mac OS X.
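
To make the byte sequences above concrete, here is a minimal C sketch that rewrites just those two decomposed sequences into their pre-composed forms. The table and the function name compose_known are hypothetical and exist only for illustration. This is not a general solution: real composition (Unicode normalization form NFC) needs the full Unicode composition data, which libraries such as ICU provide.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: covers ONLY the two examples from this post.
 * A real converter needs the complete Unicode composition data. */
struct pair {
    const char *decomposed;
    const char *composed;
};

static const struct pair pairs[] = {
    { "\x65\xcc\x81",             "\xc3\xa9"     }, /* e + U+0301 -> U+00E9 */
    { "\xe3\x82\xb9\xe3\x82\x99", "\xe3\x82\xba" }, /* U+30B9 + U+3099 -> U+30BA */
};

/* Copy src to dst, replacing each known decomposed sequence with its
 * pre-composed form. dst must hold at least strlen(src) + 1 bytes; the
 * output is never longer than the input. Returns the number of bytes
 * written, not counting the terminating NUL. */
size_t compose_known(const char *src, char *dst)
{
    const size_t n = sizeof pairs / sizeof pairs[0];
    size_t out = 0;

    while (*src != '\0') {
        size_t i;
        for (i = 0; i < n; i++) {
            size_t dlen = strlen(pairs[i].decomposed);
            if (strncmp(src, pairs[i].decomposed, dlen) == 0) {
                size_t clen = strlen(pairs[i].composed);
                memcpy(dst + out, pairs[i].composed, clen);
                out += clen;
                src += dlen;
                break;
            }
        }
        if (i == n) {
            /* No table entry matched here: copy the byte unchanged. */
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return out;
}
```

Called on the bytes 63 61 66 65 cc 81 ("café" in decomposed form), it writes 63 61 66 c3 a9 to dst. Any sequence not in the table passes through unchanged.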

Is there any C library available that converts the decomposed UTF-8 byte streams into the pre-composed equivalent?

Thanks

Will
