Hey, it is solved. Using iconv to convert my DB didn't work - there were always some characters missing.
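A possible reason for the missing characters, for anyone hitting the same thing (a guess, not verified against the actual dump): MySQL's latin1 is cp1252 plus five extra bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) that standard cp1252 leaves undefined. The UTF-8 encodings of two Hebrew letters, alef (0xD7 0x90) and final mem (0xD7 0x9D), use exactly those bytes, so a strict cp1252 conversion has nothing to map them to. The Python 3 sketch below reproduces the effect; the helper names are made up for illustration and are not part of MySQL, iconv or SA.

```python
# Sketch only: simulates UTF-8 text that was read back through a MySQL
# "latin1" connection, then repaired with a strict cp1252 conversion.
# The helper names are hypothetical, not part of MySQL, iconv or SQLAlchemy.

def mysql_latin1_decode(raw):
    """Decode bytes the way MySQL's latin1 does: cp1252, with the five
    bytes cp1252 leaves undefined mapped to the same-numbered code points."""
    chars = []
    for byte in raw:
        try:
            chars.append(bytes([byte]).decode("cp1252"))
        except UnicodeDecodeError:
            chars.append(chr(byte))
    return "".join(chars)


def mysql_latin1_encode(text):
    """Inverse of mysql_latin1_decode (for text produced by that function)."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            out.append(ord(ch))  # raises ValueError if ch is outside 0..255
    return bytes(out)


word = "\u05e9\u05dc\u05d5\u05dd"  # Hebrew "shalom", ends with final mem (U+05DD)

# What a UTF-8 dump of the latin1 data would contain: mojibake.
garbled = mysql_latin1_decode(word.encode("utf-8"))

# Strict cp1252 cannot encode U+009D, so the final letter is lost.
lossy = garbled.encode("cp1252", errors="ignore").decode("utf-8", errors="ignore")

# Using MySQL's own latin1 mapping restores every letter.
exact = mysql_latin1_encode(garbled).decode("utf-8")

print(repr(lossy), repr(exact))
```

If this is what happened, any text containing alef or final mem would come out with holes, which matches "always some characters missing".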
I found this script: http://hg.pyobject.ru/webapp/sandbox/file/a2a33c1ab7d1/movedb/movedb.py
It was meant to help move a DB from SQLite to MySQL using SA, but it did the job great. I just created the new DB with utf8 collation and passed movedb.py the right SA parameters.

Thank you!
Yo'av

2009/10/14 Yo'av Moshe <bje...@gmail.com>

> Hey, thanks.
> I tried to add the encoding parameter with the "latin1" value, but it
> messed up everything and all of the content was shown wrong.
>
> I decided to try to convert my whole DB into UTF-8, but I found out that
> I'm not sure how SA converts the gibberish in my DB into Hebrew. After a lot
> of trying different encodings, I built a program that tells me what
> conversion is done to my Hebrew strings, so I can revert them back to Hebrew
> and then insert them as UTF-8. Apparently I need to use iconv to convert my
> SQL dump file from utf8 to cp1252, and then I can just insert the SQL file
> as a UTF-8 file.
>
> I'll try to convert everything in the next few days and will let you know.
> Anyhow, the program, called "Memir", is released here -
> http://github.com/bjesus/memir . It's a PyGTK application that helps you
> test different encodings quickly, and trace conversions.
>
> Thank you,
> Yo'av.
>
> 2009/10/13 Michael Bayer <mike...@zzzcomputing.com>
>
>>
>> On Oct 12, 2009, at 7:22 PM, Yo'av Moshe wrote:
>>
>> Hey,
>> Yes, I'm using MySQL 5.
>>
>> I understand that the problem is probably happening because of some data I
>> have in my DB, but it seems odd to me since everything I have in this DB
>> was created using SA. Can't it read the data it has written?
>>
>> My MySQL connection is specified with "charset=latin1&unicode=0". My
>> website is shown right, and if I set it to charset=utf8 like the wiki says,
>> everything is garbled. The charset is latin1 because that is my MySQL
>> tables' encoding.
>>
>> Maybe if I had used utf8 when I created the tables it would be working now,
>> but it's too late and I just don't understand how come everything works except
>> for this search query, and how come SA created data it cannot read, and why
>> the hell it works the second time ... :(
>>
>>
>> so if your MySQL DB is all in latin1, then you'd have to use that
>> character set across the board, including the "encoding" parameter sent to
>> create_engine() - it defaults to utf-8, which is why you see that in your
>> error message.
>>
>> to dig deeper you'd have to really understand exactly what is present in
>> your tables. This would involve pulling out the row as a raw string and
>> just trying to decode it with different encodings to see what you have.
>>
>> I'm not sure that "latin1" encoding can handle Hebrew characters either
>> (maybe it can, I've never used "latin1" extensively), that's something you
>> might want to research as well.
>>
>> Yo'av
>>
>> 2009/10/11 Michael Bayer <mike...@zzzcomputing.com>
>>
>>>
>>> On Oct 11, 2009, at 2:29 PM, Yo'av Moshe wrote:
>>>
>>> No, the error is a UnicodeDecodeError (http://paste2.org/p/457059).
>>> I can't just "try" a different DB, switch to SQLite, etc. As I've said,
>>> my website is in production and I have a lot of users using it.
>>>
>>>
>>> the purpose of "trying" a different database is to narrow down the cause
>>> of the issue, not that you would switch the platform in use for production.
>>>
>>> One thing you should be aware of is that your program is failing due to
>>> the data coming back in your result set, not the data being bound to your
>>> SQL query. You likely have mis-encoded data present in your table which is
>>> matched by the criterion you're sending it. When the data is fetched, it
>>> cannot be decoded via utf-8.
>>>
>>> Also you haven't as yet told us what database you're using, but I'm
>>> guessing MySQL, in which case you should ensure that you are using the
>>> correct client encoding as well as the correct encoding in your schema.
>>> These are MySQL settings, not SQLAlchemy. client encoding can be specified
>>> with create_engine() (
>>> http://www.sqlalchemy.org/trac/wiki/DatabaseNotes#MySQL) or within
>>> my.cnf.
>>>
>>> Also, the problem is something that started lately, probably because of
>>> some content that a user has uploaded, so a new DB will work for sure, even
>>> if it's the same kind. But I need it to work with my DB, or at least
>>> understand what caused it so I can make sure it never happens again.
>>>
>>> I'll check my DBAPI, although I'm pretty sure it's the latest one that
>>> is shipped with CentOS5.
>>>
>>> Thank you,
>>> Yo'av
>>>
>>> 2009/10/10 Michael Bayer <mike...@zzzcomputing.com>
>>>
>>>>
>>>> On Oct 10, 2009, at 3:43 AM, Yo'av Moshe wrote:
>>>>
>>>> Any ideas?
>>>> I still don't understand why the query is failing even when I'm using a
>>>> unicode object.
>>>>
>>>>
>>>> what's the error? "EOF in multi-line statement"? that's not a
>>>> SQLAlchemy error message. what happens when you try SQLA 0.5.6 (perhaps
>>>> there was some quirk regarding encoding that was fixed)? a different /
>>>> latest version of your DBAPI (perhaps your DBAPI is misunderstanding a
>>>> character as a newline)? try SQLite with the same statement? (what
>>>> database are you using?)
>>>>
>>>> Yo'av
>>>>
>>>> 2009/10/8 Yo'av Moshe <bje...@gmail.com>
>>>>
>>>>> Thanks, I didn't know about that awful IPython bug...
>>>>>
>>>>> I checked, and apparently my website is already doing the SA query with
>>>>> a unicode object and not with a string one, so I think that it's not the
>>>>> u'' thing (it's true that I forgot it in my console testing, though).
>>>>> What you showed about IPython explains why it didn't give me any result
>>>>> when running in IPython with the unicode object - since it wasn't really a
>>>>> unicode object.
>>>>>
>>>>> So again - I *am* querying SA with a unicode object, and still, it
>>>>> fails the first time and works the second time.
>>>>>
>>>>> Yo'av.
>>>>>
>>>>> 2009/10/7 Wolodja Wentland <wentl...@cl.uni-heidelberg.de>
>>>>>
>>>>>> On Wed, Oct 07, 2009 at 07:55 -0700, Yo'av Moshe wrote:
>>>>>> > See what I mean here (it's me running the same query twice in
>>>>>> > IPython): http://paste2.org/p/457059
>>>>>> >
>>>>>> > What can cause this behavior?! I can't think of anything! I guess that
>>>>>> > one of my users has uploaded some article with some invalid utf8 code,
>>>>>> > but should that kill the query? and how come it doesn't kill the
>>>>>> > second one? and what can I do to avoid it?
>>>>>>
>>>>>> In addition to the bug Mike pointed out to you I want to introduce you
>>>>>> to my favourite bug this year:
>>>>>>
>>>>>> https://bugs.launchpad.net/ipython/+bug/339642
>>>>>>
>>>>>> If you run into unicode issues with IPython it is wise to check the
>>>>>> behaviour in the plain 'python' interpreter first, before debugging
>>>>>> your own code.
>>>>>>
>>>>>> kind regards
>>>>>>
>>>>>> Wolodja Wentland
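
For the archive: Mike's suggestion earlier in the thread (pull the value out as a raw string and try decoding it with different encodings) can be sketched roughly like this in Python 3. The function name and the list of candidate encodings are assumptions made for illustration, and `raw` stands for bytes already fetched from the table; how you get undecoded bytes depends on your driver settings.

```python
def try_decodings(raw, encodings=("utf-8", "cp1255", "iso8859_8", "cp1252", "latin-1")):
    """Return {encoding: decoded text, or None if the bytes are invalid in it}."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Example input: Hebrew "shalom" as it would be stored in cp1255 (Windows Hebrew).
raw = "\u05e9\u05dc\u05d5\u05dd".encode("cp1255")

for name, text in try_decodings(raw).items():
    print(name, repr(text))
```

A decoding that succeeds is not necessarily the right one (latin-1 accepts any byte string), so the output still has to be judged by eye.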