RE: BLOB data gets encoded as utf8!
Thank you for your reply, Janusz. I appreciate your help. I have tried making that call before the INSERT statement, but to no avail. The table collation is set to utf8 - default collation, and all columns are set to Table default. I am thinking that this problem might be due to me sending the BLOB data in a unicode string to the server, i.e: INSERT INTO TableName (Rawfield) VALUES ('%s'); - whereas the %s string is a unicode string That string is encoded by escaping mysql characters according to http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html (0x00, 0x22, 0x27, 0x5C). Then each byte (and each escaped byte) is followed by a 0x00 before being sent to the server, just so that I can send it as a unicode string via ODBC. I.e. the binary data FA 5C 93 A4 would be expanded into FA 00 5C 00 5C 00 93 00 A4 00 Otherwise I can't send the data in a unicode INSERT statement via ODBC from C++ / MFC. Do you think that could be the issue? If so, do you have a suggestion on how to do it better? Thank you again for your help! Sincerely, Andreas Iwanowski -Original Message- From: Janusz Paśkiewicz [mailto:ad...@multipasko.pl] Sent: Wednesday, February 09, 2011 7:45 AM To: Andreas Iwanowski Subject: BLOB data gets encoded as utf8! (Anyone?) Before inserting BLOB data use: mysql_query(SET CHARACTER SET latin1); after that you can set it back to: mysql_query(SET CHARACTER SET utf8); This is due to: http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html If the column has a binary data type (BINARY http://dev.mysql.com/doc/refman/5.1/en/binary-varbinary.html , VARBINARY http://dev.mysql.com/doc/refman/5.1/en/binary-varbinary.html , BLOB http://dev.mysql.com/doc/refman/5.1/en/blob.html ), all the values that it contains must be encoded using a single character set (the character set you're converting the column to). If you use a binary column to store information in multiple character sets, MySQL has no way to know which values use which character set and cannot convert the data properly. Kind Regards, Janusz Paskiewicz www.freelancer4u.net -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: BLOB data gets encoded as utf8!
I can't help but wonder, if you send a string, does that mean you're putting text in a blob ? Blobs are binary, and thus don't get encoded in the sense of UTF8 vs Unicode. For a string, you may want a TEXT type column. On the other hand, if you're indeed trying to insert binary data, it is not the best of ideas to insert it into an SQL string, where it will be subject to UTF8 or whatever interpretation. For large objects, and generally with repeating statements too, it's best to use bind variables to send your data. That also removes the need to worry about escaping, sql insertion and the like. 2011/2/10 Andreas Iwanowski namez...@afim.info Thank you for your reply, Janusz. I appreciate your help. I have tried making that call before the INSERT statement, but to no avail. The table collation is set to utf8 - default collation, and all columns are set to Table default. I am thinking that this problem might be due to me sending the BLOB data in a unicode string to the server, i.e: INSERT INTO TableName (Rawfield) VALUES ('%s'); - whereas the %s string is a unicode string That string is encoded by escaping mysql characters according to http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html (0x00, 0x22, 0x27, 0x5C). Then each byte (and each escaped byte) is followed by a 0x00 before being sent to the server, just so that I can send it as a unicode string via ODBC. I.e. the binary data FA 5C 93 A4 would be expanded into FA 00 5C 00 5C 00 93 00 A4 00 Otherwise I can't send the data in a unicode INSERT statement via ODBC from C++ / MFC. Do you think that could be the issue? If so, do you have a suggestion on how to do it better? Thank you again for your help! Sincerely, Andreas Iwanowski -Original Message- From: Janusz Paśkiewicz [mailto:ad...@multipasko.pl] Sent: Wednesday, February 09, 2011 7:45 AM To: Andreas Iwanowski Subject: BLOB data gets encoded as utf8! (Anyone?) Before inserting BLOB data use: mysql_query(SET CHARACTER SET latin1); after that you can set it back to: mysql_query(SET CHARACTER SET utf8); This is due to: http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html If the column has a binary data type (BINARY http://dev.mysql.com/doc/refman/5.1/en/binary-varbinary.html , VARBINARY http://dev.mysql.com/doc/refman/5.1/en/binary-varbinary.html , BLOB http://dev.mysql.com/doc/refman/5.1/en/blob.html ), all the values that it contains must be encoded using a single character set (the character set you're converting the column to). If you use a binary column to store information in multiple character sets, MySQL has no way to know which values use which character set and cannot convert the data properly. Kind Regards, Janusz Paskiewicz www.freelancer4u.net -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=vegiv...@tuxera.be -- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel
Re: BLOB data gets encoded as utf8! (Anyone?)
I think that probably the mistake is in the way you are inserting the blob. If i understand right, you are inserting the blob with a traditional insert statement ??, in my experience i always use prepared statements to insert blobs, in that way the text goes exactly as the source without any utf-8 conversion. Besides this, using the insert can truncate your blob / statement, because some drivers/connectors limit the size of the sql statement, so if you are uploading a huge file the statement will not work (I dont know if this happens with the mysql driver or if there is a limit), so i strongly recommend using prepared statements for blobs with any database. Hope this helps. Carlos Proal On 8/28/2010 1:04 AM, Andreas Iwanowski wrote: Has no one encountered this before? There has got to be a solution, and I still haven't found it... So if you have any input please let me know! -Original Message- From: Andreas Iwanowski [mailto:namez...@afim.info] Sent: Tuesday, August 24, 2010 2:48 PM To: mysql@lists.mysql.com Subject: BLOB data gets encoded as utf8! Hello everyone! I am using an MFC unicode project that uses ODBC to access a MySQL 5.1.50 database via the MySQL ODBC 5.1.6 driver. character_set_connection is set to utf8 (Which I believe is the default for the driver) One of the tables contains two LONGBLOB columns, and the table default charset is utf-8 (since the application is unicode). However, when inserting into the LONGBLOB columns via an INSERT statement, the data gets corrupted/modified because is incorrectly UTF-8 encoded. My insert statement (simplified) does this: INSERT INTO _table_ (Desc, Data) VALUES ('...', '_blobdata_'); I have also tried the _binary introducer (http://dev.mysql.com/doc/refman/5.1/en/string-syntax.html) INSERT INTO _table_ (Desc, Data) VALUES ('...', _binary'_blobdata_'); I'm escaping 0x00, 0x22, 0x27, and 0x5C, which are the only four that MySQL requires to be escaped per documentation for BLOB fields, and according to the MySQL docs columns of any BLOB type use no character set. Here is the problem: I found that, for example, 0xFF gets modified to 0xC3BF, which is in fact the UTF8 encoding for ASCII/Binary 0xFF. Needless to say the data becomes useless. I did, however, determine that the data is already corrupted in the myodbc.log file that the driver outputs if the option is set, so there is likely a problem with the driver settings or with the statement itself. I cannot be the first one to encounter this issue, so maybe if you have an idea (or a solution would be even better! :) then please let me know. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=namez...@afim.info -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: BLOB data gets encoded as utf8! (Anyone?)
Is the code you use to get the data out in the same charset as the code you use to put the data in ? Both should ideally also match your database setting. Have you tried explicitly setting the connection to UTF8 ? Just swinging in the dark, here, really. On Sat, Aug 28, 2010 at 8:04 AM, Andreas Iwanowski namez...@afim.infowrote: Has no one encountered this before? There has got to be a solution, and I still haven't found it... So if you have any input please let me know! -Original Message- From: Andreas Iwanowski [mailto:namez...@afim.info] Sent: Tuesday, August 24, 2010 2:48 PM To: mysql@lists.mysql.com Subject: BLOB data gets encoded as utf8! Hello everyone! I am using an MFC unicode project that uses ODBC to access a MySQL 5.1.50 database via the MySQL ODBC 5.1.6 driver. character_set_connection is set to utf8 (Which I believe is the default for the driver) One of the tables contains two LONGBLOB columns, and the table default charset is utf-8 (since the application is unicode). However, when inserting into the LONGBLOB columns via an INSERT statement, the data gets corrupted/modified because is incorrectly UTF-8 encoded. My insert statement (simplified) does this: INSERT INTO _table_ (Desc, Data) VALUES ('...', '_blobdata_'); I have also tried the _binary introducer (http://dev.mysql.com/doc/refman/5.1/en/string-syntax.html) INSERT INTO _table_ (Desc, Data) VALUES ('...', _binary'_blobdata_'); I'm escaping 0x00, 0x22, 0x27, and 0x5C, which are the only four that MySQL requires to be escaped per documentation for BLOB fields, and according to the MySQL docs columns of any BLOB type use no character set. Here is the problem: I found that, for example, 0xFF gets modified to 0xC3BF, which is in fact the UTF8 encoding for ASCII/Binary 0xFF. Needless to say the data becomes useless. I did, however, determine that the data is already corrupted in the myodbc.log file that the driver outputs if the option is set, so there is likely a problem with the driver settings or with the statement itself. I cannot be the first one to encounter this issue, so maybe if you have an idea (or a solution would be even better! :) then please let me know. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=namez...@afim.info -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=vegiv...@tuxera.be -- Bier met grenadyn Is als mosterd by den wyn Sy die't drinkt, is eene kwezel Hy die't drinkt, is ras een ezel
BLOB data gets encoded as utf8! (Anyone?)
Has no one encountered this before? There has got to be a solution, and I still haven't found it... So if you have any input please let me know! -Original Message- From: Andreas Iwanowski [mailto:namez...@afim.info] Sent: Tuesday, August 24, 2010 2:48 PM To: mysql@lists.mysql.com Subject: BLOB data gets encoded as utf8! Hello everyone! I am using an MFC unicode project that uses ODBC to access a MySQL 5.1.50 database via the MySQL ODBC 5.1.6 driver. character_set_connection is set to utf8 (Which I believe is the default for the driver) One of the tables contains two LONGBLOB columns, and the table default charset is utf-8 (since the application is unicode). However, when inserting into the LONGBLOB columns via an INSERT statement, the data gets corrupted/modified because is incorrectly UTF-8 encoded. My insert statement (simplified) does this: INSERT INTO _table_ (Desc, Data) VALUES ('...', '_blobdata_'); I have also tried the _binary introducer (http://dev.mysql.com/doc/refman/5.1/en/string-syntax.html) INSERT INTO _table_ (Desc, Data) VALUES ('...', _binary'_blobdata_'); I'm escaping 0x00, 0x22, 0x27, and 0x5C, which are the only four that MySQL requires to be escaped per documentation for BLOB fields, and according to the MySQL docs columns of any BLOB type use no character set. Here is the problem: I found that, for example, 0xFF gets modified to 0xC3BF, which is in fact the UTF8 encoding for ASCII/Binary 0xFF. Needless to say the data becomes useless. I did, however, determine that the data is already corrupted in the myodbc.log file that the driver outputs if the option is set, so there is likely a problem with the driver settings or with the statement itself. I cannot be the first one to encounter this issue, so maybe if you have an idea (or a solution would be even better! :) then please let me know. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=namez...@afim.info -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
BLOB data gets encoded as utf8!
Hello everyone! I am using an MFC unicode project that uses ODBC to access a MySQL 5.1.50 database via the MySQL ODBC 5.1.6 driver. One of the tables contains two LONGBLOB columns, and the table default charset is utf-8 (since the application is unicode). However, when inserting into the LONGBLOB columns via an INSERT statement, the data gets corrupted/modified because is incorrectly UTF-8 encoded. My insert statement (simplified) does this: INSERT INTO _table_ (Desc, Data) VALUES ('...', '_blobdata_'); I have also tried: -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
BLOB data gets encoded as utf8! (First post incomplete)
Hello everyone! I am using an MFC unicode project that uses ODBC to access a MySQL 5.1.50 database via the MySQL ODBC 5.1.6 driver. character_set_connection is set to utf8 (Which I believe is the default for the driver) One of the tables contains two LONGBLOB columns, and the table default charset is utf-8 (since the application is unicode). However, when inserting into the LONGBLOB columns via an INSERT statement, the data gets corrupted/modified because is incorrectly UTF-8 encoded. My insert statement (simplified) does this: INSERT INTO _table_ (Desc, Data) VALUES ('...', '_blobdata_'); I have also tried the _binary introducer (http://dev.mysql.com/doc/refman/5.1/en/string-syntax.html) INSERT INTO _table_ (Desc, Data) VALUES ('...', _binary'_blobdata_'); I'm escaping 0x00, 0x22, 0x27, and 0x5C, which are the only four that MySQL requires to be escaped per documentation for BLOB fields, and according to the MySQL docs columns of any BLOB type use no character set. Here is the problem: I found that, for example, 0xFF gets modified to 0xC3BF, which is in fact the UTF8 encoding for ASCII/Binary 0xFF. Needless to say the data becomes useless. I did, however, determine that the data is already corrupted in the myodbc.log file that the driver outputs if the option is set, so there is likely a problem with the driver settings or with the statement itself. I cannot be the first one to encounter this issue, so maybe if you have an idea (or a solution would be even better! :) then please let me know. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org