Hello J,

    string_t sTest;
    // First call with no output buffer: returns the required size in bytes.
    int nLengthNeeded = WideCharToMultiByte(CP_UTF8, 0, pszWide, nLength, 0, 0, 0, 0);
    if( !nLengthNeeded )
    {
        ASSERT(0);
        return(E_ABORT);
    }
    // Second call: convert into the buffer (sized with some slack for the terminator).
    sTest.resize(nLengthNeeded + 16);
    nLength = WideCharToMultiByte(CP_UTF8, 0, pszWide, nLength, reinterpret_cast<char*>(&sTest[0]), (uint32_t)sTest.size(), 0, 0);
    sTest[nLength] = 0;
    ASSERT(!strcmp(sTest.c_str(),(char*)(*this)));

This is what I used to use to convert from UTF-16 to UTF-8 on Windows. There
are similar functions for converting in the opposite direction. Internally my
program is 100% UTF-8. I translate to UTF-16 right at the point where I display
the strings in Windows.

This code is actually some test code I use today to compare the conversions I
do manually against what Windows generates. In debug mode, it does both
conversions and compares the results.

Tuesday, October 7, 2014, 8:59:07 AM, you wrote:

JD> On Tue, Oct 7, 2014 at 5:39 AM, Richard Hipp <d...@sqlite.org> wrote:

>> On Tue, Oct 7, 2014 at 12:06 AM, J Decker <d3c...@gmail.com> wrote:
>>
>> > I saw a few things go by about unicode... and understand that it should
>> > just work to store the data as characters...
>> >
>> > I'm getting an unrecognized token... and think this page isn't right...
>> > I was playing with Greek translation of 'mary had a little lamb'
>> >
>> >
>> I ran the following script through the sqlite3 command-line shell and it
>> works fine:
>>
>> CREATE TABLE option4_values(option_id, string, segment);
>> REPLACE INTO option4_values(`option_id`,`string`,`segment`)
>> VALUES('8b377a68-4358-11e4-ace4-3085a9903449','Μαίρη είχε ένα μικρό
>> αρνί',0);
>> SELECT * FROM option4_values;
>>
>> Hmm... wonder what it's getting....
>> I suggest that the problem is in your programming language, or in the
>> wrapper that links your programming language to SQLite, not in SQLite
>> itself. Can you tell us what programming language and what operating
>> system you are using?
>>
>> C, visual studio 2012 build, windows.

JD> built with UNICODE enabled... instead of multi-byte character set....

JD> it could be my conversion routine... I'm using wcstombs_s with _MSC_VER
JD> set...
JD> before it was just failing, because wcstombs_s doesn't convert
JD> anything with a high bit set... so I added a handler to replace it with a
JD> utf-8 16 bit character encode (expands to 3 bytes as described here
JD> http://en.wikipedia.org/wiki/UTF-8#Description )

JD> if( err == 42 )
JD> {
JD>     (*ch++) = 0xE0 | ((unsigned char*)wch)[1] >> 4;
JD>     (*ch++) = 0x80 | ( ( ((unsigned char*)wch)[1] & 0xF ) << 2 ) | ( (
JD>         ((unsigned char*)wch)[0] ) >> 6 );
JD>     (*ch++) = 0x80 | ( ((unsigned char*)wch)[0] & 0x3F );
JD> }

JD> which works... if I mouse-over on char * string it shows the right unicode
JD> characters.

JD> The logging that I included in the first message was converted from
JD> wchar_t* to char* and then the sqlite3_strerror() is expanded from char *
JD> to wchar_t * and still shows the right characters....

JD> I just cannot identify the unrecognized token... it's obviously not at
JD> character 0... (that's gotten by comparing the pzTail result of
JD> sqlite3_prepare_v2 )...

>> --
>> D. Richard Hipp
>> d...@sqlite.org
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>

--
Best regards,
 Teg                            mailto:t...@djii.com
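
One note on the quoted handler: it always writes three bytes, whereas UTF-8 as described on the linked Wikipedia page uses two bytes for code points from U+0080 through U+07FF (a range that includes the Greek block) and four bytes for anything above U+FFFF, which arrives as a surrogate pair in UTF-16. Below is a minimal, portable sketch of a UTF-16 to UTF-8 conversion that covers all four lengths. The names utf16_to_utf8 and append_utf8 are made up for illustration and are not part of SQLite or the Windows API; substituting U+FFFD for an unpaired surrogate is an assumption about the error handling you would want.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: appends the UTF-8 encoding of one code point to `out`.
static void append_utf8(std::string& out, std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);  // the caller only passes valid code points
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx + three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical converter: turns `len` UTF-16 code units into a UTF-8 string.
// Assumption: an unpaired surrogate is replaced with U+FFFD.
std::string utf16_to_utf8(const char16_t* src, std::size_t len)
{
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        std::uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {            // high surrogate
            if (i + 1 < len && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
                // Combine the pair into one code point above U+FFFF.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;                           // no low surrogate follows
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {     // stray low surrogate
            cp = 0xFFFD;
        }
        append_utf8(out, cp);
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the same loop could presumably be fed a wchar_t buffer, and its result compared byte for byte with the WideCharToMultiByte(CP_UTF8, ...) output in the same way as the debug check at the top of this message.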