On Jan 22, 4:49 am, Gaurav Veda <vedagau...@gmail.com> wrote:
> Hi,
>
> I am trying to put some webpages into a mysql database using python
> (after some processing on the text). If I use Python 2.4.2, it works
> without a fuss. However, on Python 2.5, I get the following error:
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position
> 4357: ordinal not in range(128)
>
> Before sending the (insert) query to the mysql server, I do the
> following which I think should've taken care of this problem:
> sqlStr = sqlStr.replace('\\', '\\\\')
>
> (where sqlStr is the query).
>
> Any suggestions?

The 0xc2 strongly suggests that you are feeding the beast data encoded in UTF-8 while giving it no reason to believe that the data is anything other than ASCII. Curiously, the first errant byte is a long way (over 4KB) into your data. Consider doing print repr(data) to see what you've actually got there.

I'm a little skeptical about the "2.4 works, 2.5 doesn't" notion -- different versions of MySQL, or of the Python MySQL interface, perhaps? Show at the very least the full traceback that you get. Try to write a short script that demonstrates the problem with 2.5 and no problem with 2.4, so that (a) it is apparent what you are doing and (b) the problem can be reproduced if necessary by someone with access to MySQL.

You might like to explain why you think that doubling backslashes in your SQL is a good idea, and amplify "some processing on the text".

HTH,
John
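
To illustrate the diagnosis above, here is a minimal sketch of how a 0xc2 byte ends up in front of the ASCII codec. It is written for Python 3, where byte strings and text are separate types; on the 2.x versions in this thread the same decode is attempted implicitly when a byte string is mixed with a unicode object. The sample text and the names page_text, data and first_non_ascii are made up for the example and are not taken from the original script.

```python
# Hypothetical page text: a non-breaking space (U+00A0) is encoded in
# UTF-8 as the two bytes 0xc2 0xa0, so 0xc2 is the first non-ASCII byte.
page_text = "price:\u00a010 EUR"
data = page_text.encode("utf-8")

# repr() shows the raw bytes, including the \xc2 escape.
print(repr(data))


def first_non_ascii(raw):
    """Return the offset of the first byte >= 0x80, or None if all ASCII."""
    for offset, byte in enumerate(raw):
        if byte >= 0x80:
            return offset
    return None


print(first_non_ascii(data))

error_message = None
try:
    # Treating UTF-8 bytes as ASCII fails at the first byte >= 0x80.
    data.decode("ascii")
except UnicodeDecodeError as exc:
    error_message = str(exc)
    print(error_message)

# Naming the actual encoding makes the decode succeed.
text = data.decode("utf-8")
```

Running the same offset check over the real data would show what sits at position 4357, which in turn hints at where the non-ASCII text came from.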