Re: [sqlite] Some clarification needed about Unicode

2009-10-30 Thread John Crenshaw
> http://codesnipers.com/?q=utf-8-versus-windows-unicode > > The author asserts that .NET is the only platform that offers full UTF-16 > support in the Windows API. The author is half mistaken, as was I. Michael Kaplan and Raymond Chen (big MS names many will recognize) clarified this. For Win2k, o

Re: [sqlite] Some clarification needed about Unicode

2009-10-30 Thread A.J.Millan
- Original Message - From: "John Crenshaw" To: "General Discussion of SQLite Database" Sent: Thursday, October 29, 2009 10:55 PM Subject: Re: [sqlite] Some clarification needed about Unicode >No, I mean which encoding. You can't give a UTF-16 string to an

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread John Crenshaw
> > No, I mean which encoding. You can't give a UTF-16 string to an API > > that only knows how to handle UCS-2 encoded data > > Well, most of the time, you can. Only in rare cases do you need to treat > surrogate pairs in a special way. One such case, relevant to this discussion, > is converting U

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Igor Tandetnik
John Crenshaw wrote: > No, I mean which encoding. You can't give a UTF-16 string to an API > that only knows how to handle UCS-2 encoded data Well, most of the time, you can. Only in rare cases do you need to treat surrogate pairs in a special way. One such case, relevant to this discussion, is
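To make the surrogate-pair remark above concrete, here is a minimal C sketch of what "treating surrogate pairs in a special way" can involve. It is not code from the thread: the function names `utf16_combine_surrogates` and `utf16_count_code_points` are hypothetical, and the sketch assumes the input is an array of 16-bit code units in native byte order. It only shows the standard UTF-16 arithmetic for joining a high and a low surrogate into one code point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine a high surrogate (0xD800..0xDBFF) and a
   low surrogate (0xDC00..0xDFFF) into one code point above U+FFFF.
   The caller is assumed to have checked both ranges already. */
static uint32_t utf16_combine_surrogates(uint16_t high, uint16_t low)
{
    return 0x10000u
         + (((uint32_t)high - 0xD800u) << 10)
         + ((uint32_t)low - 0xDC00u);
}

/* Hypothetical helper: count code points in a UTF-16 buffer. A well-formed
   surrogate pair counts as one code point; an unpaired surrogate is
   counted as one (invalid) code point rather than rejected. */
static size_t utf16_count_code_points(const uint16_t *units, size_t count)
{
    size_t code_points = 0;
    size_t i = 0;

    while (i < count) {
        uint16_t u = units[i];
        int is_high = (u >= 0xD800 && u <= 0xDBFF);

        if (is_high && i + 1 < count &&
            units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            i += 2;   /* pair: two code units, one code point */
        } else {
            i += 1;   /* BMP character or unpaired surrogate */
        }
        code_points++;
    }
    return code_points;
}
```

Code that only copies, stores, or passes a string along never needs this logic, which fits the point that special handling is rarely required.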

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Jean-Christophe Deschamps
>Thanks for the link. That clarifies things a lot. So, for the OP, if you >are targeting Win2k, it would be a good idea to use UCS-2, not UTF-16, >with any wide API calls. XP and above should (according to Kaplan and >Chen) support UTF-16 for API calls. W2k is clearly something of the past. But

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread John Crenshaw
> Don't worry: we're all confused by MS wording! From what I understand, > having also tried to sort out the question myself, there is a > line drawn: before XP, the Unicode support included was nothing other than > UCS-2 (W2K). XP and post-XP systems include Unicode 5.1 and use UTF-16 > enc

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread John Crenshaw
says nothing. John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Igor Tandetnik Sent: Thursday, October 29, 2009 5:08 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] Some clarification needed about Unicode John Crenshaw

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Jean-Christophe Deschamps
Hi John, >Microsoft never seems to clearly identify whether the wide APIs should >be given UTF-16 or UCS-2. Their guide on internationalization would seem >to suggest that UCS-2 must be used, however, there is some reason to >believe that perhaps UTF-16 is handled correctly as well. Couldn't find

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Igor Tandetnik
John Crenshaw wrote: > 2. MultiByteToWideChar supports a "MB_COMPOSITE" flag, which appears > to > give UTF-16 output. MB_COMPOSITE has nothing to do with surrogate pairs, and everything to do with whether, say, Latin-1 character Á (A with acute) is converted to a single character U+00C1, or

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread John Crenshaw
's confused now.) John -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Jean-Christophe Deschamps Sent: Thursday, October 29, 2009 9:18 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Some clarification neede

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Jean-Christophe Deschamps
Hi, >Despite that, I'm aware that I have somewhat more than pure US-ASCII in >the >blob objects; in fact I'm near your situation because I used the Spanish >language and have 8-bit extended ASCII with some special >characters (accented characters and so on). > >So the question is Yes, I have upper

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread John Crenshaw
> there must exist zillions [working] wrappers to VC++. You would think. In fact, there are only a few, and most are not very good. I used the wrapper at Code Project as a base, then added handling for SQLITE_LOCKED, a date class, better blob handling, transaction support, and other useful enhance

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread A.J.Millan
-- From: "Jean-Christophe Deschamps" To: "General Discussion of SQLite Database" Sent: Thursday, October 29, 2009 3:04 PM Subject: Re: [sqlite] Some clarification needed about Unicode Hi, Please follow Igor's advice; he is right. >[1] Read the actual textual data wit

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Jean-Christophe Deschamps
Hi, Please follow Igor's advice; he is right. >[1] Read the actual textual data with sqlite3_column_blob() Which you can directly convert to TEXT if, as you say, you entered only 7-bit ASCII or UTF-8 compliant data. >[2] Assuming the system code page matches the one used when the data was >o

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Jean-Christophe Deschamps
>My main point is that you can't take the UTF-16 string and safely supply >it to APIs which want UCS-2 encoded text, such as Win32 APIs (including >things like SetWindowText()). Odds are that the only library you are >using which supports UTF-16 is SQLite. You should always be converting >the te

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread A.J.Millan
- Original Message - From: "Igor Tandetnik" To: Sent: Thursday, October 29, 2009 1:45 PM Subject: Re: [sqlite] Some clarification needed about Unicode > > The only Win32 API function that can handle UTF-8 strings is > MultiByteToWideChar (when called with CP_UTF

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Igor Tandetnik
A.J.Millan wrote: > Thanks for your answer; let me see if I understood correctly the process: > > [1] Read the actual textual data with sqlite3_column_blob() > > [2] Assuming the system code page matches the one used when the data was > originally inserted, convert with mbstowcs() > > [3] (Doubt

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Jean-Denis Muys
On 10/29/09 12:55 , "A.J.Millan" wrote: > Now, do you know about some library to convert to and from UTF-8 or UTF-16 to > UCS-2? > [4-1b] convert with WideCharToMultiByte(CP_UTF8) On 10/29/09 12:51 , "Igor Tandetnik" wrote: > You > can use WideCharToMultiByte(CP_UTF8) - I don't quite see why

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread A.J.Millan
To: Sent: Thursday, October 29, 2009 12:51 PM Subject: Re: [sqlite] Some clarification needed about Unicode > A.J.Millan wrote: >> Really, here you touched tangentially the core of my question. Besides >> all >> those great theories, at last I have UTF-8 encoded data in a dBas

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Igor Tandetnik
A.J.Millan wrote: > Now, do you know about some library to convert to and from UTF-8 or UTF-16 to > UCS-2? John's claims notwithstanding, you don't want or need UCS-2. It's a strict subset of UTF-16. Every valid UCS-2 string is also a UTF-16 string, but the converse is not true. UCS-2 is of histo
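The subset relationship described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function name `is_ucs2_compatible` is hypothetical, and the input is taken to be an array of 16-bit code units in native byte order. A UTF-16 string that contains no code units in the surrogate range 0xD800..0xDFFF is also a valid UCS-2 string, so no conversion step is needed in that case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if no code unit falls in the surrogate
   range 0xD800..0xDFFF, meaning the UTF-16 data is also valid UCS-2.
   Returns 0 as soon as any surrogate (paired or not) is found. */
static int is_ucs2_compatible(const uint16_t *units, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (units[i] >= 0xD800 && units[i] <= 0xDFFF) {
            return 0;
        }
    }
    return 1;
}
```

For text made up only of characters from the Basic Multilingual Plane, which covers accented Latin letters such as Á, the check succeeds and the same buffer serves as both UTF-16 and UCS-2.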

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Igor Tandetnik
John Crenshaw wrote: > Similarly, UTF-16 is NOT the same as UCS-2 (the wide "Unicode" chars > used by MS APIs) Win32 API does too support UTF-16. What makes you believe otherwise? > though it looks the same at low values. UTF-16 is a > multibyte character set, while UCS-2 is always 2 bytes per ch

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread A.J.Millan
John: From: "John Crenshaw" To: "General Discussion of SQLite Database" Sent: Thursday, October 29, 2009 11:46 AM Subject: Re: [sqlite] Some clarification needed about Unicode > > My main point is that you can't take the UTF-16 string and safely supply > i

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Igor Tandetnik
A.J.Millan wrote: > Really, here you touched tangentially the core of my question. Besides all > those great theories, at last I have UTF-8 encoded data in a dBase, and the > UCS-2 encoded data of the MS Win32 API (w_chars in my Cpp app). The > question is: What is the concrete way to and from tha

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Igor Tandetnik
John Crenshaw wrote: > My main point is that you can't take the UTF-16 string and safely supply > it to APIs which want UCS-2 encoded text, such as Win32 APIs (including > things like SetWindowText()). What makes you believe Win32 API, and SetWindowText in particular, does not support surrogate p

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread A.J.Millan
John: > 2. UTF-8 is NOT the same as ASCII for values greater than 127. > Similarly, UTF-16 is NOT the same as UCS-2 (the wide "Unicode" chars > used by MS APIs), though it looks the same at low values. UTF-16 is a > multibyte character set, while UCS-2 is always 2 bytes per character. > You have t

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread John Crenshaw
f Dan Kennedy Sent: Thursday, October 29, 2009 6:39 AM To: General Discussion of SQLite Database Subject: Re: [sqlite] Some clarification needed about Unicode On Oct 29, 2009, at 4:41 PM, Jean-Christophe Deschamps wrote: > >> [1] Supposing some textual data already inserted as UTF-8 (d

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Dan Kennedy
On Oct 29, 2009, at 4:41 PM, Jean-Christophe Deschamps wrote: > >> [1] Supposing some textual data already inserted as UTF-8 (default >> mode) in >> a dBase, and a connection opened with sqlite3_open(): Does >> sqlite3_column_text16 retrieve correct UTF-16 content? That is to >> say, does >> SQLi

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread A.J.Millan
From: "Jean-Denis Muys" Sent: Thursday, October 29, 2009 11:10 AM Subject: Re: [sqlite] Some clarification needed about Unicode > This may be seen as nit picking, but when discussing character encodings > and > representations, the issues can become so subtle and confusin

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Jean-Denis Muys
On 10/29/09 10:51 , "John Crenshaw" wrote: > 2. UTF-8 is NOT the same as ASCII for values greater than 127. ASCII only uses 7 bits values, so no larger representation can be "the same as ASCII for values greater than 127". This may be seen as nit picking, but when discussing character encoding
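The 7-bit point above can be illustrated with a small C sketch. It is an assumption-laden example rather than anything posted in the thread: the function name `is_seven_bit_ascii` is hypothetical. It relies on the fact that UTF-8 encodes U+0000..U+007F as the same single bytes as ASCII, so a buffer with no byte above 0x7F reads identically as ASCII and as UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte is in 0x00..0x7F, in
   which case the buffer is plain ASCII and is also valid UTF-8 with
   exactly the same meaning. Returns 0 if any byte has the high bit set. */
static int is_seven_bit_ascii(const unsigned char *bytes, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (bytes[i] > 0x7F) {
            return 0;
        }
    }
    return 1;
}
```

A byte above 0x7F means the data is either multi-byte UTF-8 or some 8-bit code page, and this check alone cannot tell which of the two it is.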

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread John Crenshaw
hn -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of A.J.Millan Sent: Thursday, October 29, 2009 5:14 AM To: sqlite-users@sqlite.org Subject: [sqlite] Some clarification needed about Unicode Hi list: After some years using this wonde

Re: [sqlite] Some clarification needed about Unicode

2009-10-29 Thread Jean-Christophe Deschamps
>[1] Supposing some textual data already inserted as UTF-8 (default >mode) in >a dBase, and a connection opened with sqlite3_open(): Does >sqlite3_column_text16 retrieve correct UTF-16 content? That is to say, does >SQLite do the conversion internally? > >[2] Assuming the previous -or a UTF-16 content

[sqlite] Some clarification needed about Unicode

2009-10-29 Thread A.J.Millan
Hi list: After some years using this wonderful tool, I embraced the internationalization of an application, and despite some reading in this list and my own tests (not conclusive), I still have some obscure corners. [1] Supposing some textual data already inserted as UTF-8 (default mode) in a