>>> Asmus Freytag wrote
> TCHAR x = _T('x');
> TCHAR * x = _T("x");
>
> that is to wrap a string or character literal so that it can be used either
> as Unicode literal or as non-Unicode literal, depending on whether some global
> compile time flag (usually UNICODE or _UNICODE) is set or not.
>
> The usual way a _T macro is defined is something like:
>
> #ifdef UNICODE
> #define _T(x) L##x
> #else
> #define _T(x) x
> #endif
>
> That definition relies on the compiler to support L'x' or L"string" by using
> UTF-16.
This is what I am actually looking for. My ODBC application supports UTF-16,
which uses 2-byte code units. The application is built entirely around the
_T(x) macro that Asmus Freytag described.
I cannot move away from this compiler-dependent _T( ) macro, both for
historical reasons and because the underlying layers used by the ODBC
application always expect unsigned short text. So in my case I need _T('x')
and _T("x") to always be 16 bits wide. The most worrying part is that the
definition of _T(x) depends on the compiler supporting L'x' and L"x" as 16-bit
values, and I need my application to be portable to Unix platforms, where this
behavior can vary.
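To make the portability concern concrete: the width of wchar_t, and therefore of L'x' and L"x", is implementation-defined. It is commonly 16 bits with Microsoft's compilers and 32 bits with gcc on Linux. A minimal sketch for checking what a given compiler does (the helper names wide_literal_bits and wide_literal_is_16_bit are made up for this illustration):

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* wchar_t */

/* Width in bits of one element of an L'x' / L"x" literal on this compiler.
   In C, L'x' has type wchar_t, so this is the width of wchar_t. */
static unsigned wide_literal_bits(void)
{
    return (unsigned)(sizeof(L'x') * CHAR_BIT);
}

/* Nonzero only when wide literals are exactly 16 bits wide, which is what
   a _T(x) defined as L##x silently assumes. */
static int wide_literal_is_16_bit(void)
{
    return wide_literal_bits() == 16;
}
```

If wide_literal_is_16_bit() returns 0 on a target platform, L##x cannot be handed to layers that expect unsigned short text.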
To get the same behavior from _T( ) for both character literals and strings, I
tried something like this.
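Case 1) Character constants can be handled by a macro:
#define __convert_to_integer(x) x
Case 2) Strings can be handled by a function:
/* Needs <stdlib.h> for malloc and <string.h> for strlen. */
TCHAR *__copy_to_unicode( const char *src )
{
    size_t len = strlen(src);
    /* +1 leaves room for the terminating 0 */
    TCHAR *dest = malloc((len + 1) * sizeof(TCHAR));
    TCHAR *ptr = dest;
    if (dest == NULL)
        return NULL;
    /* Widen each byte; the cast avoids sign extension of bytes above 0x7F */
    while (*src) {
        *ptr++ = (unsigned char)*src++;
    }
    *ptr = 0;
    return dest;
}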
Using these two pieces we need to formulate a single macro for _T( ). I could
not find a good way to do this.
Sample C program
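#include <stdio.h>
#include <stdlib.h> /* malloc */
#include <string.h> /* strlen, strncpy, strcat */
typedef unsigned short TCHAR;
/* Constant cases can be addressed by following macro */
#define __convert_to_integer(x) x
/* String cases can be addressed by following function */
TCHAR *__copy_to_unicode( const char *src )
{
    size_t len = strlen(src);
    /* +1 leaves room for the terminating 0 */
    TCHAR *dest = malloc((len + 1) * sizeof(TCHAR));
    TCHAR *ptr = dest;
    if (dest == NULL)
        return NULL;
    /* Widen each byte; the cast avoids sign extension of bytes above 0x7F */
    while (*src) {
        *ptr++ = (unsigned char)*src++;
    }
    *ptr = 0;
    return dest;
}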
int main()
{
    typedef struct st_AlternateCol
    {
        TCHAR *pszName;
        signed short sType;
        int ulLen;
        signed short fNullable;
    }
    _ALT_COL;
    const char *string = "SELECT * FROM DUAL";
    char src = 's';
    TCHAR *pSql;
    TCHAR ch;
    TCHAR *concatstr = malloc((strlen(string)*2)*sizeof(TCHAR));
    TCHAR *ternary;
    // Variable initialization
    pSql = __copy_to_unicode(string);
    ch = __convert_to_integer(src);
    //Conditional check
    if(*pSql == __convert_to_integer('S'))
        printf("string starts with letter S\n");
    else
        printf("string does not start with letter S\n");
    //As constant
    switch(*pSql)
    {
        case __convert_to_integer('S'): printf("matched with S\n");
            break;
        default: printf("did not match\n");
    }
    //Arguments to function
    /* For Unicode strings we can't use string.h functions. Rather use
       functions from odbc.h like M_FSTRCPYU. This part is only here to check
       that the function result can be passed as an argument.
       The casts are needed because the string.h functions take char *; they
       stop at the first zero byte, so the output will not be the Unicode
       string. */
    strncpy((char *)concatstr,(char *)__copy_to_unicode(string),strlen(string));
    strcat((char *)concatstr,(char *)pSql);
    printf("Concatenated string is: %s\n", (char *)concatstr);
    //Structure member initialize
    _ALT_COL AltDescCol[] =
    {
        { (TCHAR *)__copy_to_unicode("TABLE_CAT"), 12, 31L, 1 },
        { (TCHAR *)__copy_to_unicode("TABLE_SCHEM"), 12, 30L, 1 },
    };
    //Argument to ternary operator
    ternary = (1>2)? __copy_to_unicode("Greater"):
                     __copy_to_unicode("Less");
    printf("Ternary testing: %s\n",(char *)ternary);
    return 0;
}
I don't know whether I have to do something like this in order to have
_T("x")/_T('x') represent UTF-16 code units, i.e. 2 bytes wide. But doing it
the above way is not practical.
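One workaround that does not depend on the compiler's wide-literal support, and needs no heap copy, is to spell string constants out as arrays of unsigned short. This is only a sketch: the names kTableCat, kTableSchem and tstr_len are invented here, and it assumes the execution character set is ASCII-compatible so that 'T' has the same value as U+0054.

```c
#include <stddef.h>

typedef unsigned short TCHAR;

/* Each element is an ordinary character constant converted to TCHAR;
   the trailing 0 terminates the string. */
static const TCHAR kTableCat[] =
    { 'T','A','B','L','E','_','C','A','T', 0 };
static const TCHAR kTableSchem[] =
    { 'T','A','B','L','E','_','S','C','H','E','M', 0 };

/* Counts TCHAR units up to, but not including, the terminating 0. */
static size_t tstr_len(const TCHAR *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

Such arrays can be used in static initializers, which the result of a function call cannot, but they are tedious to write and maintain by hand.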
I would like to know whether there is a better way to define a macro for
_T('x')/_T("xyz") that is independent of the compiler and always yields 2-byte
(UTF-16) wide characters.
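One possible direction, assuming a compiler that implements C11: the u prefix produces char16_t literals, so a _T that pastes u instead of L no longer depends on the width of wchar_t. This is a sketch, not a drop-in answer. It assumes <uchar.h> is available, that char16_t (a typedef for uint_least16_t) is exactly 16 bits and compatible with the unsigned short the lower layers expect, and that the compiler defines __STDC_UTF_16__ so the literals really are UTF-16. The names ALT_COL, kCols, tstrlen and starts_with_S are illustrative only.

```c
#include <stddef.h>
#include <uchar.h>    /* char16_t (C11) */

typedef char16_t TCHAR;

/* Paste the C11 u prefix: _T('x') becomes u'x', _T("x") becomes u"x". */
#define _T(x) u##x

/* Fail at compile time if a TCHAR is not exactly 2 bytes. */
_Static_assert(sizeof(TCHAR) == 2, "TCHAR must be 16 bits wide");

typedef struct {
    const TCHAR *pszName;
    short sType;
} ALT_COL;

/* Literals are constants, so they also work in static initializers. */
static const ALT_COL kCols[] = {
    { _T("TABLE_CAT"),   12 },
    { _T("TABLE_SCHEM"), 12 },
};

/* Counts TCHAR units up to, but not including, the terminating 0. */
static size_t tstrlen(const TCHAR *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Returns 1 when the string starts with 'S'; _T('S') is usable as a
   case label because it is an integer constant expression. */
static int starts_with_S(const TCHAR *s)
{
    switch (s[0]) {
    case _T('S'):
        return 1;
    default:
        return 0;
    }
}
```

With this definition the existing call sites, such as TCHAR *x = _T("x"), stay unchanged, and no runtime conversion or allocation is needed.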
Thanks in advance.
Sowmya.
________________________________
From: Asmus Freytag <[email protected]>
To: "Phillips, Addison" <[email protected]>
Cc: Doug Ewell <[email protected]>; sowmya satyanarayana
<[email protected]>; [email protected]
Sent: Tue, 23 November, 2010 12:38:37 AM
Subject: Re: UNICODE version of _T(x) macro
On 11/22/2010 10:18 AM, Phillips, Addison wrote:
>> sowmya satyanarayana <sowmya underscore satyanarayana at yahoo dot com>
>> wrote:
>>
>>> Taking this, what is the best way to define _T(x) macro of UNICODE
>>> version, so that my strings will always be 2 byte wide character?
>> Unicode characters aren't always 2 bytes wide. Characters with values
>> of U+10000 and greater take two UTF-16 code units, and are thus 4 bytes
>> wide in UTF-16.
>>
> Not exactly. The code units for UTF-16 are always 16 bits wide. Supplementary
> characters (those with code points >= U+10000) use a surrogate pair, which is
> two 16-bit code units. Most processing and string traversal is in terms of
> the 16-bit code units, with a special case for the surrogate pairs.
>
> It is very useful when discussing Unicode character encoding forms to
> distinguish between characters ("code points") and their in-memory
> representation ("code units"), rather than using non-specific terminology
> such as "character".
>
> If you want to use UTF-32, which uses 32-bit code units, one per code point,
> you can use a 32-bit data type instead. Those are always 4 bytes wide.
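To illustrate the distinction between code points and code units drawn in the quoted text, here is a small sketch of how one code point maps to one or two 16-bit code units. The function name utf16_encode is invented for this example; the arithmetic is the standard UTF-16 surrogate-pair calculation.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes one code point as UTF-16.
   Writes 1 or 2 code units to out and returns how many were written.
   Returns 0 for values that are not encodable: surrogate code points
   (U+D800..U+DFFF) and anything above U+10FFFF. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;              /* one code unit */
        return 1;
    }
    cp -= 0x10000;                          /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For example, U+0041 stays a single code unit 0x0041, while U+1F600 becomes the pair 0xD83D 0xDE00. Each code unit is still 16 bits wide in both cases.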
The question is relevant to the C and C++ languages.
What is asked: which native data type do I use to make sure I end up with a
16-bit code unit.
The usual way a _T macro is used is
TCHAR x = _T('x');
TCHAR * x = _T("x");
that is to wrap a string or character literal so that it can be used either as
Unicode literal or as non-Unicode literal, depending on whether some global
compile time flag (usually UNICODE or _UNICODE) is set or not.
The usual way a _T macro is defined is something like:
#ifdef UNICODE
#define _T(x) L##x
#else
#define _T(x) x
#endif
That definition relies on the compiler to support L'x' or L"string" by using
UTF-16.
A few years ago, there was a proposal to amend the C standard to have a way to
ensure that this is the case in a cross platform way. I can't recall offhand
what became of it.
A./