Re: [sqlite] Latin-1 characters cannot be supported for Unicode
Use the HEX() function.

-Original Message-
From: sqlite-users-boun...@mailinglists.sqlite.org On Behalf Of Wang, Wei
Sent: Wednesday, June 15, 2016 11:52 AM
To: SQLite mailing list
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

There is some unintelligible text in my database. Is there any way to see its byte sequence?

Best Regards,
Wang Wei

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
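To make the HEX() suggestion concrete, here is a minimal sketch using Python's standard sqlite3 module. The table name `items` and the simulated bad row are assumptions made for illustration; they are not taken from the poster's database. It stores one correctly encoded value and one raw Latin-1 byte, then asks SQLite for the stored bytes of each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")  # hypothetical table

# Correct case: Python passes str values to SQLite as UTF-8.
conn.execute("INSERT INTO items VALUES (?)", ("Ç",))

# Simulated mistake: the single Latin-1 byte C7 is stored as if it were text.
conn.execute("INSERT INTO items VALUES (CAST(? AS TEXT))", (b"\xc7",))

# hex() returns the stored bytes as hexadecimal digits without decoding them.
stored = [row[0] for row in
          conn.execute("SELECT hex(name) FROM items ORDER BY rowid")]
print(stored)
```

A two-byte result for Ç suggests UTF-8 was stored; a one-byte result suggests the bytes went in as Latin-1 or a similar ANSI code page.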
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
There is some unintelligible text in my database. Is there any way to see its byte sequence?

Best Regards,
Wang Wei

-Original Message-
From: sqlite-users-boun...@mailinglists.sqlite.org On Behalf Of Hick Gunter
Sent: Wednesday, June 15, 2016 4:21 PM
To: 'SQLite mailing list'
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

The answer is very simple: Do not use ANSI/ISO encoding with SQLite. SQLite expects Unicode.
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
The answer is very simple: Do not use ANSI/ISO encoding with SQLite. SQLite expects Unicode.

-Original Message-
From: sqlite-users-boun...@mailinglists.sqlite.org On Behalf Of Wang, Wei
Sent: Wednesday, June 15, 2016 4:44 AM
To: SQLite mailing list
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

Under the ANSI encoding environment, I created a table named TEST_PRODUÇÃO in the database. Then I opened this database with sqlite-tool. I ran the SQL statement to list all the tables and found that the newly created table was shown as TEST_PRODU??O. Also, this table could not be queried using the table name TEST_PRODUÇÃO. It seems that this issue was caused by an encoding mismatch.

Best Regards,
Wang Wei

___
Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
On 15 Jun 2016, at 3:44am, Wang, Wei wrote:

> Under the ANSI encoding environment, I created a table named TEST_PRODUÇÃO in
> the database.

All strings handled by SQLite, including the strings that make up SQL commands like "CREATE TABLE ...", are Unicode strings. If you are constructing an ANSI string and passing that to sqlite3_exec() or sqlite3_prepare(), then you are doing the wrong thing. You must convert to Unicode (UTF-8 for these two functions) before passing the string to any sqlite3 API call.

Simon.
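The conversion step described above can be sketched in Python. This assumes the "ANSI" code page in question is Windows-1252, which the thread does not state, and the byte string below is constructed for illustration only. Python's sqlite3 module hands str values to SQLite as UTF-8, so decoding the ANSI bytes to str first is the whole fix in this setting.

```python
import sqlite3

# The statement as an ANSI (assumed Windows-1252) program would hold it:
# 0xC7 is Ç and 0xC3 is Ã in that code page.
ansi_sql = b'CREATE TABLE "TEST_PRODU\xc7\xc3O" (id INTEGER)'

# Convert to Unicode before the statement reaches SQLite.
unicode_sql = ansi_sql.decode("cp1252")

conn = sqlite3.connect(":memory:")
conn.execute(unicode_sql)

table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
row_count = conn.execute('SELECT count(*) FROM "TEST_PRODUÇÃO"').fetchone()[0]
print(table_names, row_count)
```

In a C program the equivalent step would be an explicit code page to UTF-8 conversion before calling sqlite3_exec() or sqlite3_prepare(); which conversion routine to use depends on the platform.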
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
Under the ANSI encoding environment, I created a table named TEST_PRODUÇÃO in the database. Then I opened this database with sqlite-tool. I ran the SQL statement to list all the tables and found that the newly created table was shown as TEST_PRODU??O. Also, this table could not be queried using the table name TEST_PRODUÇÃO. It seems that this issue was caused by an encoding mismatch.

Best Regards,
Wang Wei

-Original Message-
From: sqlite-users-boun...@mailinglists.sqlite.org On Behalf Of Chris Brody
Sent: Wednesday, June 08, 2016 4:20 PM
To: SQLite mailing list
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

Hi Wei Wang,

Did you populate the database from the sqlite3 CLI tool, your own C program, or from another language? Do you see this when you create a database from scratch, if you use a database created by another program, or in both cases?
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
Hi Wei Wang,

Did you populate the database from the sqlite3 CLI tool, your own C program, or from another language?

Do you see this when you create a database from scratch, if you use a database created by another program, or in both cases?

If you populated the database from the sqlite3 CLI tool, can you post the commands you used to populate the database?

If you populated the database from your own C program, can you post a simple test program that populates the database?

If you populated the database from another language, can you post a test snippet that shows how you populated the database along with a pointer to which library you are using?

What kind of system, CPU, and operating system(s) do you see this behavior on?

It should be no problem for sqlite3 to deal with the Latin-1 characters you are using if you do it right. The trick is that sqlite3 is designed to deal with both UTF-8 and UTF-16 (LE or BE). SQLite stores which encoding is used in the database. The API allows you to use both UTF-8 and UTF-16 encoding, regardless of which encoding is actually used to store the data.

I think this is documented properly on sqlite.org, and I found an excellent writeup (though 5 years old) at: http://www.mimec.org/node/297

I also like the Unicode link from Igor.

Chris

On Wed, Jun 8, 2016 at 3:49 AM, Wang, Wei wrote:
> Thanks for your reply! But I found the Latin-1 encoded characters are listed
> in the Unicode chart. http://unicode.org/charts/PDF/U0080.pdf
>
> Best Regards,
> Wang Wei
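The point that the storage encoding is independent of the encoding used at the API can be illustrated with a small Python sketch. The table name `items` is hypothetical. The database is switched to UTF-16LE storage with `PRAGMA encoding` (which only takes effect on a database that has no tables yet), while Python's sqlite3 module keeps passing and receiving UTF-8 text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Must run before any table is created; afterwards the encoding is fixed.
conn.execute("PRAGMA encoding = 'UTF-16le'")
conn.execute("CREATE TABLE items (name TEXT)")  # hypothetical table

# The value is bound as UTF-8 by the sqlite3 module.
conn.execute("INSERT INTO items VALUES (?)", ("Ç",))

db_encoding = conn.execute("PRAGMA encoding").fetchone()[0]

# CAST(... AS BLOB) exposes the bytes in the database's own encoding,
# while the plain column comes back as an ordinary Python str.
name, raw = conn.execute(
    "SELECT name, CAST(name AS BLOB) FROM items").fetchone()
print(db_encoding, name, raw.hex())
```

SQLite converts between the two encodings as needed, so the application sees the same character either way; only the stored bytes differ.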
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
Yes, I missed the trailing 00.

-Original Message-
From: sqlite-users-boun...@mailinglists.sqlite.org On Behalf Of Jean-Christophe Deschamps
Sent: Wednesday, June 08, 2016 9:37 AM
To: SQLite mailing list
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

At 09:22 08/06/2016, you wrote:
> A 3 Byte Sequence 0xFFFEC4 when converting 0xC4 to UTF-8 in UltraEdit

This 3-byte sequence is neither UTF-8 nor UTF-16, even if the BOM would make us believe it is UTF-16LE. UTF-16 implies 16-bit encoding units, so an odd byte length is impossible. You probably meant FF FE C4 00 for UTF-16LE.

___
Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation.
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
At 09:22 08/06/2016, you wrote:
> A 3 Byte Sequence 0xFFFEC4 when converting 0xC4 to UTF-8 in UltraEdit

This 3-byte sequence is neither UTF-8 nor UTF-16, even if the BOM would make us believe it is UTF-16LE. UTF-16 implies 16-bit encoding units, so an odd byte length is impossible. You probably meant FF FE C4 00 for UTF-16LE.
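The byte counts in this exchange can be checked with Python's built-in codecs. This is only an illustration of the UTF-16 rule discussed above, not a reconstruction of what UltraEdit wrote to disk.

```python
import codecs

# "Ä" (U+00C4) as UTF-16LE with a byte order mark: two bytes of BOM
# followed by one 16-bit code unit.
with_bom = codecs.BOM_UTF16_LE + "Ä".encode("utf-16-le")
print(with_bom.hex())

# The three-byte sequence FF FE C4 ends in the middle of a code unit,
# so a UTF-16 decoder rejects it.
try:
    b"\xff\xfe\xc4".decode("utf-16")
    odd_length_decodes = True
except UnicodeDecodeError:
    odd_length_decodes = False
print(odd_length_decodes)
```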
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
That the same character is found in both encodings is no surprise. You need to look at the actual sequence of bytes. Comparing a file containing just the "capital A with diaeresis" yields

  A 1 Byte sequence 0xC4 in ANSI
  A 2 Byte sequence 0xC384 in en_US.UTF8 on a RH5 linux system
  A 3 Byte Sequence 0xFFFEC4 when converting 0xC4 to UTF-8 in UltraEdit

If you store the single byte 0xC4 then SQLite will retrieve the single byte 0xC4. If you change the representation layer to expect 0xFFFEC4 or 0xC384 then you will be disappointed.

If you put a cat into a box labeled "cat" and then change the label to "dog", will that change what is inside? If you sell the box, will the buyer not complain?

-Original Message-
From: sqlite-users-boun...@mailinglists.sqlite.org On Behalf Of Wang, Wei
Sent: Wednesday, June 08, 2016 3:49 AM
To: SQLite mailing list
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

Thanks for your reply! But I found the Latin-1 encoded characters are listed in the Unicode chart. http://unicode.org/charts/PDF/U0080.pdf

Best Regards,
Wang Wei

___
Gunter Hick
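The box-and-label point can be sketched in Python: the stored bytes stay the same, and only the encoding the reader assumes changes. The variable names are illustrative, and Windows-1252 is used as a stand-in for "ANSI".

```python
# One byte, C4: "Ä" as an ANSI (Latin-1) program would store it.
stored = "Ä".encode("latin-1")

# Read with the label it was written under: the character comes back.
as_latin1 = stored.decode("latin-1")

# Read with a UTF-8 label: the same byte is not valid UTF-8 on its own.
try:
    as_utf8 = stored.decode("utf-8")
except UnicodeDecodeError:
    as_utf8 = None

# The reverse mismatch: UTF-8 bytes read under a Windows-1252 label
# decode without error but show up as two wrong characters.
mojibake = "Ä".encode("utf-8").decode("cp1252")

print(stored.hex(), as_latin1, as_utf8, mojibake)
```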
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
On 6/7/2016 9:49 PM, Wang, Wei wrote:
> Thanks for your reply! But I found the Latin-1 encoded characters are listed
> in the Unicode chart. http://unicode.org/charts/PDF/U0080.pdf

All the characters available in the Latin-1 codepage are indeed also available in Unicode. However, the same character is represented by a different sequence of bytes when encoded in an ANSI codepage, in UTF-8, and in UTF-16. For example, character Ç (aka U+00C7) is represented by a single byte C7 in ANSI Latin-1 encoding, by two bytes C3 87 in UTF-8, and by two bytes C7 00 in UTF-16LE.

I suggest you read http://www.joelonsoftware.com/articles/Unicode.html .
--
Igor Tandetnik
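The three byte sequences quoted for Ç can be reproduced with Python's standard codecs. This is a small check of the figures above; the dictionary name is illustrative.

```python
# Encode the same character, U+00C7, under the three encodings discussed.
encoded = {}
for label, codec in [("Latin-1", "latin-1"),
                     ("UTF-8", "utf-8"),
                     ("UTF-16LE", "utf-16-le")]:
    encoded[label] = "Ç".encode(codec).hex().upper()

for label, hex_bytes in encoded.items():
    print(f"{label:9} {hex_bytes}")
```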
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
Thanks for your reply! But I found that the Latin-1 encoded characters are listed in the Unicode chart: http://unicode.org/charts/PDF/U0080.pdf

Best Regards,
Wang Wei

-Original Message-
From: sqlite-users-boun...@mailinglists.sqlite.org On Behalf Of Igor Tandetnik
Sent: Tuesday, June 07, 2016 10:20 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

A byte sequence containing Latin-1-encoded characters Ç or Ã is not in fact a valid byte sequence in any Unicode encoding - neither UTF-8 nor UTF-16 nor any other. If you want Unicode data in your database, then store Unicode data, and not ANSI, in your database.
--
Igor Tandetnik
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
As already stated, this is not a problem of SQLite.

SQLite assumes all input to be correctly encoded in Unicode (UTF-8 or UTF-16), the precise flavor of which may be set (once, between creating a db file and the first insert) by a pragma.

If you insert ISO (Latin) encoded strings, SQLite will faithfully reproduce the exact sequence of bytes presented on insert. As long as you use the same encoding to display the results, everything seems to be OK, even though the byte sequence stored is technically wrong.

If you insist on interpreting these using a different encoding and without explicitly converting, then you will experience problems with characters that encode differently.

-Original Message-
From: sqlite-users-boun...@mailinglists.sqlite.org On Behalf Of Wang, Wei
Sent: Tuesday, June 07, 2016 9:43 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Latin-1 characters cannot be supported for Unicode

Hi,

I ran into a problem that may be caused by the encoding of SQLite. I inserted an item that includes some Latin-1 characters, like Ç and Ã, into a table. Then I opened the database with SQLite Developer. After I set the encoding to ANSI, the display and the query result for that table were OK. However, after I set the encoding to Unicode, these Latin-1 characters could not be displayed correctly and could not be queried. Please see the attached pictures for the details.

Best Regards,
Wang Wei

___
Gunter Hick
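The "explicitly converting" step mentioned above could look roughly like the following Python sketch. Everything specific here is an assumption for illustration: the table `items`, its columns, the simulated bad row, and the guess that the mis-stored bytes are Latin-1. It reads the raw bytes, and re-stores any value that is not valid UTF-8 after decoding it from Latin-1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

# Simulate the problem: Latin-1 bytes stored as if they were text.
conn.execute("INSERT INTO items (name) VALUES (CAST(? AS TEXT))",
             ("PRODUÇÃO".encode("latin-1"),))

# Fetch raw bytes so Python does not try to decode them as UTF-8.
rows = conn.execute("SELECT id, CAST(name AS BLOB) FROM items").fetchall()
for row_id, raw in rows:
    try:
        raw.decode("utf-8")  # already valid UTF-8: leave the row alone
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Latin-1.
        conn.execute("UPDATE items SET name = ? WHERE id = ?",
                     (raw.decode("latin-1"), row_id))

repaired_name, repaired_hex = conn.execute(
    "SELECT name, hex(name) FROM items").fetchone()
print(repaired_name, repaired_hex)
```

The validity check is only a heuristic: a Latin-1 byte sequence that happens to form valid UTF-8 would be skipped, so a real migration should be run on a copy of the database and the results reviewed.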
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
On 7 Jun 2016, at 8:43am, Wang, Wei wrote:

> Then I opened the database with SQLite Developer.

SQLite Developer is not supported by the team which wrote SQLite. It's just a program which uses SQLite. If it allows you to pick character encoding then it is not correctly showing you the contents of your database so you should not necessarily trust what you see.

If you want to see what's really in your database please use the SQLite shell tool, which was written by the team which wrote SQLite and is understood to be 100% correct.

Simon.
Re: [sqlite] Latin-1 characters cannot be supported for Unicode
On 6/7/2016 3:43 AM, Wang, Wei wrote:
> I ran into a problem that may be caused by the encoding of SQLite. I inserted
> an item that includes some Latin-1 characters, like Ç and Ã, into a table.
> Then I opened the database with SQLite Developer. After I set the encoding to
> ANSI, the display and the query result for that table were OK. However, after
> I set the encoding to Unicode, these Latin-1 characters could not be displayed
> correctly and could not be queried. Please see the attached pictures for the
> details.

A byte sequence containing Latin-1-encoded characters Ç or Ã is not in fact a valid byte sequence in any Unicode encoding - neither UTF-8 nor UTF-16 nor any other. If you want Unicode data in your database, then store Unicode data, and not ANSI, in your database.
--
Igor Tandetnik
[sqlite] Latin-1 characters cannot be supported for Unicode
Hi,

I ran into a problem that may be caused by the encoding of SQLite. I inserted an item that includes some Latin-1 characters, like Ç and Ã, into a table. Then I opened the database with SQLite Developer. After I set the encoding to ANSI, the display and the query result for that table were OK. However, after I set the encoding to Unicode, these Latin-1 characters could not be displayed correctly and could not be queried. Please see the attached pictures for the details.

Best Regards,
Wang Wei