On Thu, 06 Jul 2006 19:16:53 +0200, Stefan Behnel wrote:

>> Is there a correct way to handle text input from a <FORM> when the page is
>> utf-8 and that input is going to be used in SQL statements? I've tried
>> things like (with no success):
>> sql = u"select * from blah where col='%s'" % input
>
> What about " ... % unicode(input, "UTF-8")" ?
>
I guess it's similar. I've had partial success with input.decode('utf-8')
before DB usage, and then output.encode('utf-8') for output. But although
this stores and displays newly added utf-8 texts correctly, it causes other
problems when displaying the existing texts. I think they're suffering from
a double encoding issue. It seems rather strange that the encode/decode
appears to be required now, and not before. Is this how it should be done?

> You didn't tell us what database you are using, which encoding your
> database uses, which Python-DB interface library you deploy, and lots of
> other things that might be helpful to solve your problem.

That would be MySQLdb with latin1, but I've tried various methods to make it
utf-8 (lots of guidance online). But this was only after I discovered the
breakage with the newer Python, i.e. it has worked for years on both
machines and various Python versions. I omitted that info because I can
paste the SQL into mysql's shell and it does the expected thing with no
errors, so I assumed the DB itself isn't the cause. I guess it could be a
new MySQLdb issue causing the breakage.

I feel I can see part of the light, but if I'm close to what I think is
needed, it's not practical to change everything to handle encode/decode
site wide, especially as some of the data gets moved to Oracle for other
applications (most is written in Perl).

I'm thinking I need to do this now. Is this the norm?

    get user input from web
    text.decode('utf-8')
    store or use as search in DB
    text.encode('utf-8')
    display page etc

The encode/decode stages have never been required before :-(

> Stefan

--
http://mail.python.org/mailman/listinfo/python-list
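
For reference, here is a minimal sketch of the decode-on-input, encode-on-output flow discussed above, and of how a double encoding can arise. It rests on several assumptions that are not in the original post: it is written for Python 3, where bytes/str play the roles of Python 2's str/unicode; the stdlib sqlite3 module stands in for MySQLdb so the example is self-contained; the sample value is made up; and the value is passed as a query parameter rather than interpolated with %, which is a swapped-in technique and not what the original code did.

```python
import sqlite3

# Hypothetical sample: the bytes a browser would send for "cafe" with an
# acute accent on the "e" when the page is served as UTF-8.
raw_form_bytes = "caf\u00e9".encode("utf-8")

# Decode exactly once, at the input boundary: bytes -> text.
text = raw_form_bytes.decode("utf-8")

# sqlite3 is only a stand-in for MySQLdb here. It uses "?" placeholders;
# MySQLdb uses "%s" placeholders with the same (sql, params) calling style.
conn = sqlite3.connect(":memory:")
conn.execute("create table blah (col text)")
conn.execute("insert into blah (col) values (?)", (text,))
rows = conn.execute("select col from blah where col = ?", (text,)).fetchall()

# Encode exactly once, at the output boundary: text -> bytes for the page.
page_bytes = rows[0][0].encode("utf-8")

# How a double encoding can arise: UTF-8 bytes are misread as latin1 and
# then encoded as UTF-8 a second time.
double_encoded = raw_form_bytes.decode("latin-1").encode("utf-8")

# Reversing those two steps recovers the original bytes, provided the
# value really was double encoded in exactly this way.
repaired = double_encoded.decode("utf-8").encode("latin-1")
```

If the existing rows were stored as raw UTF-8 bytes in latin1 columns, they may need a one-off repair like the last line before a decode/encode scheme displays them correctly. With MySQLdb, the connection's charset setting is also worth checking, since it decides how text is converted between Python and the server.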