[sqlite] sqlite3_errmsg and wide char

Scott Robison Thu, 26 Nov 2015 23:01:09 -0700

On Thu, Nov 26, 2015 at 10:13 PM, Igor Korot <ikorot01 at gmail.com> wrote:
>
> Hi,
> Is there any way to have "sqlite3_errmsg" function return a wide char
> string?
> Or do a conversion in a portable way?
>
> Thank you.


The portable way would be to use the mbstowcs function from stdlib.h,
though it depends on what locales are supported by the system, so maybe not
as portable as you would like.
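
As a rough sketch of the mbstowcs route (narrow_to_wide is a made-up helper
name, not part of SQLite or the C library), something like the following
should work. It assumes that calling mbstowcs with a NULL destination returns
the required length, which POSIX and the Microsoft runtime document but ISO C
does not spell out:

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: returns a newly allocated wide copy of src, or NULL
   on allocation failure or if src is not a valid multibyte string in the
   current locale. The caller frees the result. */
wchar_t *narrow_to_wide(const char *src)
{
    /* Assumption: with a NULL destination, mbstowcs only counts the wide
       characters needed (POSIX and MSVC behaviour). */
    size_t len = mbstowcs(NULL, src, 0);
    if (len == (size_t)-1)
        return NULL;

    wchar_t *dst = malloc((len + 1) * sizeof *dst);
    if (dst == NULL)
        return NULL;

    mbstowcs(dst, src, len + 1);   /* also writes the terminating null */
    return dst;
}
```

The program would normally call setlocale(LC_ALL, "") once at startup so that
the conversion follows the user's locale. Since sqlite3_errmsg returns UTF-8
text, non-ASCII messages only convert correctly when that locale's encoding is
UTF-8. SQLite also provides sqlite3_errmsg16, which returns the message as
UTF-16 and may be enough on platforms where wchar_t is a 16 bit type.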

There isn't really a truly portable way of converting from char to wchar_t
based strings, given that there is no real guarantee about what exactly
wchar_t *is*. The ISO C90 standard merely defined it as "an integral type
whose range of values can represent distinct codes for all members of the
largest extended character set specified among the supported locales". In
theory, wchar_t could be a typedef of char if a platform only supported an
8 bit character set. Unicode was being designed at the time of C
standardization, but the first version of Unicode wasn't published for a
year or more after the C90 standard. Thus the wchar_t type doesn't have to
be Unicode.

In practical terms, I generally consider wchar_t to be a pre-2011 method of
storing Unicode. Even here there are no portability guarantees. Microsoft
went all in on Unicode in the early 1990s, back when it was only a two byte
encoding (UCS-2), so Microsoft compilers treat wchar_t as a two byte type.
Unicode 2.0 extended the Unicode character set in 1996 (I think) and
introduced UTF-16 as a compromise way of allowing systems that embraced
Unicode 1.0 (when it "guaranteed" a 16 bit character space) to support the
full space of Unicode code points from U+0000 to U+10FFFF via surrogate
pairs. Modern POSIX systems (as far as I know) define wchar_t to be a 32
bit type, so you can't really convert to wchar_t in a portable way, because
you have to handle surrogate pairs on Windows vs simple code points on
POSIX (though this should be handled by mbstowcs if the platform supports
wchar_t as Unicode).
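
To make the surrogate pair difference concrete, here is a minimal sketch of
storing one code point in a wchar_t buffer. put_code_point is a hypothetical
name, and the code assumes that a wchar_t with WCHAR_MAX of 0xFFFF or less
holds UTF-16 code units and that a wider wchar_t holds whole code points:

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: stores code point cp at dst and returns the number
   of wchar_t units written (1 or 2), or 0 if cp is not a valid Unicode
   scalar value. dst must have room for two units. */
size_t put_code_point(wchar_t *dst, uint32_t cp)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
#if WCHAR_MAX <= 0xFFFF
    /* 16 bit wchar_t (assumed UTF-16): split into a surrogate pair. */
    if (cp >= 0x10000) {
        cp -= 0x10000;
        dst[0] = (wchar_t)(0xD800 + (cp >> 10));
        dst[1] = (wchar_t)(0xDC00 + (cp & 0x3FF));
        return 2;
    }
#endif
    /* Either the code point fits in one unit or wchar_t is wide enough. */
    dst[0] = (wchar_t)cp;
    return 1;
}
```

With a 16 bit wchar_t, a code point such as U+1F600 takes two units, while
with a 32 bit wchar_t it takes one, so any code that indexes or counts
"characters" in the result has to know which case it is in.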

If you only use ASCII or Latin-1 8 bit characters in your code, or are
willing to treat all char objects as ASCII or Latin-1, then you can convert
char strings to wchar_t strings by simply zero extending each character
while copying it. Something like this (without any error checking):

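/* Zero-extends each byte of src into dst; correct only for ASCII/Latin-1.
   dst must have room for strlen(src) + 1 wide characters. */
void copy_narrow_to_wide(wchar_t* dst, const char* src)
{
  while (*src) *(dst++) = (unsigned char)(*(src++));
  *dst = 0;
}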
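
If the destination size is known, a bounded variant is only slightly longer.
This is a sketch along the same lines, with the same ASCII/Latin-1
assumption; copy_narrow_to_wide_n and its dst_len parameter are names
invented for illustration:

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical bounded variant: dst_len is the capacity of dst in wchar_t
   units, including the terminator. Returns the number of characters
   copied, not counting the terminator; truncates if src is too long. */
size_t copy_narrow_to_wide_n(wchar_t *dst, size_t dst_len, const char *src)
{
    size_t i = 0;
    if (dst_len == 0)
        return 0;
    while (src[i] != '\0' && i + 1 < dst_len) {
        /* Go through unsigned char so bytes >= 0x80 are not sign-extended. */
        dst[i] = (wchar_t)(unsigned char)src[i];
        i++;
    }
    dst[i] = L'\0';
    return i;
}
```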

--
Scott Robison
