On 11/22/2010 10:18 AM, Phillips, Addison wrote:
sowmya satyanarayana <sowmya underscore satyanarayana at yahoo dot com> wrote:

Taking this, what is the best way to define the _T(x) macro for the UNICODE version, so that my strings will always be 2-byte wide characters?
Unicode characters aren't always 2 bytes wide. Characters with values of U+10000 and greater take two UTF-16 code units, and are thus 4 bytes wide in UTF-16.

Not exactly. The code units of UTF-16 are always 16 bits wide. Supplementary characters (those with code points >= U+10000) use a surrogate pair, which is two 16-bit code units. Most processing and string traversal is in terms of the 16-bit code units, with a special case for the surrogate pairs.
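
(To make the arithmetic concrete, here is a minimal sketch of how a pair is recombined into a code point; the function name is my own invention.)

#include <stdint.h>

/* Sketch only: combine a UTF-16 surrogate pair into a code point.
   Assumes hi is a lead surrogate (0xD800..0xDBFF) and lo is a
   trail surrogate (0xDC00..0xDFFF). */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    return 0x10000u + (((uint32_t)(hi - 0xD800u) << 10) | (uint32_t)(lo - 0xDC00u));
}

/* Example: hi = 0xD801, lo = 0xDC37 yields U+10437. */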

It is very useful when discussing Unicode character encoding forms to distinguish between characters ("code points") and their in-memory representation ("code units"), rather than using non-specific terminology such as "character".

If you want to use UTF-32, which uses 32-bit code units, one per code point, 
you can use a 32-bit data type instead. Those are always 4 bytes wide.
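
(As a quick aside, and not something the original question asked for: the 32-bit data type can be as plain as uint_least32_t; the names below are mine.)

#include <stdint.h>

/* Sketch: UTF-32 stores one code unit per code point, here U+10437. */
uint_least32_t cp = 0x10437;
uint_least32_t text[] = { 0x10437, 0 };  /* zero-terminated UTF-32 string */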

The question is relevant to the C and C++ languages.

What is being asked is: which native data type do I use to make sure I end up with 16-bit code units?

The usual way a _T macro is used is

TCHAR c = _T('x');
TCHAR *s = _T("x");

that is, to wrap a string or character literal so that it compiles either as a wide (Unicode) literal or as a narrow (non-Unicode) literal, depending on whether some global compile-time flag (usually UNICODE or _UNICODE) is set or not.

The usual way a _T macro is defined is something like:

#ifdef UNICODE
#define _T(x) L##x
#else
#define _T(x) x
#endif
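
(For completeness, and simplified from what the Windows headers actually do: the TCHAR type itself is switched by the same flag, so the literals and the variables stay in sync.)

#ifdef UNICODE
typedef wchar_t TCHAR;   /* wide build: 16-bit code units on Windows */
#else
typedef char TCHAR;      /* narrow build */
#endif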

That definition relies on the compiler representing L'x' or L"string" in UTF-16, that is, on wchar_t being a 16-bit UTF-16 code unit, as it is on Windows but not on many other platforms.
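
(A quick sanity check of my own, using the _Static_assert from the C1x draft; it only verifies the width, not that the encoding really is UTF-16.)

#include <limits.h>
#include <wchar.h>

/* Sketch: fail the build if wchar_t is not 16 bits wide. */
_Static_assert(sizeof(wchar_t) * CHAR_BIT == 16,
               "wchar_t is not a 16-bit type");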

A few years ago, there was a proposal to amend the C standard to have a way to ensure that this is the case in a cross-platform way. I can't recall offhand what became of it.
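
(If memory serves, the relevant piece is the char16_t type and the u"..." literal prefix from ISO/IEC TR 19769, which were being folded into the C and C++ draft standards; a rough sketch, with a macro name I just made up.)

#include <uchar.h>   /* char16_t, per TR 19769 / the C1x draft */

/* Sketch only: a UTF-16-specific analogue of _T. */
#define _T16(x) u##x

char16_t c = _T16('x');          /* u'x' has type char16_t */
const char16_t *s = _T16("x");   /* u"x" is an array of char16_t */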

A./
