Hey, it is solved.
Using iconv to convert my DB didn't work - there were always some characters
missing.
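
One possible explanation for the missing characters, offered as an assumption and not verified against this database: if the Hebrew text went in as UTF-8 bytes over a latin1 connection, the dump holds every byte re-encoded as a cp1252 character. Undoing that needs a byte-for-byte mapping, but the strict cp1252 codec leaves five bytes undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), and 0x90 and 0x9D are the second UTF-8 byte of alef and final mem. A minimal Python 3 sketch of a lenient repair follows; `CP1252_UNDEFINED`, `to_bytes_cp1252_lenient`, `repair` and `garble` are hypothetical names, not part of movedb.py or SA.

```python
# Bytes that Python's strict cp1252 codec leaves undefined. The assumption
# here is that MySQL's "latin1" passed them through as the C1 control code
# points U+0081, U+008D, U+008F, U+0090 and U+009D.
CP1252_UNDEFINED = {0x81, 0x8D, 0x8F, 0x90, 0x9D}


def to_bytes_cp1252_lenient(text):
    """Encode text as cp1252, mapping the five undefined code points to raw bytes."""
    out = bytearray()
    for ch in text:
        if ord(ch) in CP1252_UNDEFINED:
            out.append(ord(ch))
        else:
            out.extend(ch.encode("cp1252"))
    return bytes(out)


def repair(mojibake):
    """Undo one round of UTF-8 -> cp1252 -> UTF-8 double encoding."""
    return to_bytes_cp1252_lenient(mojibake).decode("utf-8")


def garble(text):
    """Simulate the damage: read UTF-8 bytes as if they were latin1/cp1252."""
    return "".join(
        chr(b) if b in CP1252_UNDEFINED else bytes([b]).decode("cp1252")
        for b in text.encode("utf-8")
    )


if __name__ == "__main__":
    # "shalom" in Hebrew; the last letter (final mem) ends in byte 0x9D.
    original = "\u05e9\u05dc\u05d5\u05dd"
    broken = garble(original)
    print(repr(broken))
    print(repair(broken) == original)
```

A plain `broken.encode("cp1252")` raises `UnicodeEncodeError` on the same input, which is the kind of gap that would make a straight utf8-to-cp1252 conversion lose characters.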

I found this script:
http://hg.pyobject.ru/webapp/sandbox/file/a2a33c1ab7d1/movedb/movedb.py
it was meant to help moving a DB from SQLite to MySQL using SA, but it did
the job great.
I just created the new DB with utf8 collation, and passed movedb.py the right
SA parameters.

Thank you!

Yo'av

2009/10/14 Yo'av Moshe <bje...@gmail.com>

> Hey, thanks.
> I tried to add the encoding parameter with the "latin1" value, but it
> messed up everything and all of the content was shown wrong.
>
> I decided to try to convert my whole DB into UTF-8, but I found out that
> I'm not sure how SA converts the gibberish in my DB into Hebrew. After a lot
> of trying different encodings, I built a program that will tell me what
> conversion is done to my Hebrew strings, so I can revert them back to Hebrew
> and then insert them as UTF-8. Apparently I need to use iconv to convert my
> sql dump file from utf8 to cp1252, and then I could just insert the sql file
> as a UTF-8 file.
>
> I'll try to convert everything in the next few days and will let you know.
> Anyhow, the program called "Memir" is released here -
> http://github.com/bjesus/memir . It's a PyGTK application that helps you
> test different encodings quickly, and trace conversions.
>
> Thank you,
> Yo'av.
>
> 2009/10/13 Michael Bayer <mike...@zzzcomputing.com>
>
>
>> On Oct 12, 2009, at 7:22 PM, Yo'av Moshe wrote:
>>
>> Hey,
>> Yes, I'm using MySQL 5.
>>
>> I understand that the problem is probably happening because of some data I
>> have in my DB, but it seems odd to me since everything I have in this DB
>> was created using SA. Can't it read the data it has written?
>>
>> My mysql connection is specified with "charset=latin1&unicode=0". My
>> website is shown right, and if I set it to charset=utf8 like the wiki says
>> everything is garbled. I use that charset because it is the encoding of my
>> MySQL tables.
>>
>> Maybe if I had used utf8 when I created the tables it would be working now,
>> but it's too late and I just don't understand how come everything works
>> except for this search query, and how come SA created data it cannot read,
>> and why the hell it works the second time ... :(
>>
>>
>> so if your MySQL DB is all in latin1, then you'd have to use that
>> character set across the board, including the "encoding" parameter sent to
>> create_engine() - it defaults to utf-8, which is why you see that in your
>> error message.
>>
>> to dig deeper you'd have to really understand exactly what is present in
>> your tables.   This would involve pulling out the row as a raw string and
>> just trying to decode it with different encodings to see what you have.
>>
>> I'm not sure that "latin1" encoding can handle hebrew characters either
>> (maybe it can, I've never used "latin1" extensively), that's something you
>> might want to research as well.
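
The "decode it with different encodings" step can be sketched as below. This is an illustration in Python 3 using only the standard codecs; `try_decodings` and `can_encode` are hypothetical helper names, the candidate list is an assumption, and the sample bytes stand in for a value fetched raw from a row. Two caveats: latin-1 maps every byte to some character, so a successful latin-1 decode proves nothing by itself, and ISO-8859-1 contains no Hebrew letters, so Hebrew text can never be encoded to it directly.

```python
def try_decodings(raw, encodings=("utf-8", "cp1255", "iso-8859-8", "cp1252", "latin-1")):
    """Return {encoding: decoded text, or None if the bytes are invalid in it}."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def can_encode(text, encoding):
    """Report whether every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    hebrew = "\u05e9\u05dc\u05d5\u05dd"  # "shalom"
    raw = hebrew.encode("utf-8")         # stand-in for bytes read from the table
    for name, text in try_decodings(raw).items():
        print(name, repr(text))
    print("latin-1 can hold Hebrew:", can_encode(hebrew, "latin-1"))
```

Comparing the printed candidates by eye against the text that is expected in the row is usually enough to tell which encoding, or which chain of encodings, produced the stored bytes.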
>>
>> Yo'av
>>
>> 2009/10/11 Michael Bayer <mike...@zzzcomputing.com>
>>
>>>
>>> On Oct 11, 2009, at 2:29 PM, Yo'av Moshe wrote:
>>>
>>> No, the error is an UnicodeDecodeError (http://paste2.org/p/457059).
>>> I can't just "try" a different DB, switch to SQLite, etc. As I've said,
>>> my website is on production and I have a lot of users using it.
>>>
>>>
>>> the purpose of "trying" a different database is to narrow down the cause
>>> of the issue, not that you would switch the platform in use for production.
>>>
>>> One thing you should be aware of is that your program is failing due to
>>> the data coming back in your result set, not the data being bound to your
>>> SQL query.   You likely have mis-encoded data present in your table which is
>>> matched by the criterion you're sending it.   When the data is fetched, it
>>> cannot be decoded via utf-8.
>>>
>>> Also you haven't as yet told us what database you're using, but I'm
>>> guessing MySQL, in which case you should ensure that you are using the
>>> correct client encoding as well as the correct encoding in your schema.
>>> These are MySQL settings, not SQLAlchemy.  client encoding can be specified
>>> with create_engine() (
>>> http://www.sqlalchemy.org/trac/wiki/DatabaseNotes#MySQL)  or within
>>> my.cnf.
>>>
>>>
>>>
>>>
>>> Also, the problem is something that started lately, probably because of
>>> some content that a user has uploaded, so a new DB will work for sure, even
>>> if it's the same kind. But, I need it to work with my DB, or at least
>>> understand what caused it so I can make sure it never happens again.
>>>
>>> I'll check my DBAPI, although I'm pretty sure it's the latest one that
>>> is shipped with CentOS5.
>>>
>>> Thank you,
>>> Yo'av
>>>
>>> 2009/10/10 Michael Bayer <mike...@zzzcomputing.com>
>>>
>>>>
>>>> On Oct 10, 2009, at 3:43 AM, Yo'av Moshe wrote:
>>>>
>>>> Any ideas?
>>>> I still don't understand why the query is failing even when I'm using a
>>>> unicode object.
>>>>
>>>>
>>>> what's the error?  "EOF in multi-line statement"?  that's not a
>>>> SQLAlchemy error message.   what happens when you try SQLA 0.5.6 (perhaps
>>>> there was some quirk regarding encoding that was fixed)?  a different /
>>>> latest version of your DBAPI (perhaps your DBAPI is misunderstanding a
>>>> character as a newline)?  try SQLite with the same statement?  (what
>>>> database are you using?)
>>>>
>>>>
>>>>
>>>>
>>>> Yo'av
>>>>
>>>> 2009/10/8 Yo'av Moshe <bje...@gmail.com>
>>>>
>>>>> Thanks, I didn't know about that awful IPython bug...
>>>>>
>>>>> I checked, and apparently my website is already doing the SA query with
>>>>> a unicode object and not with a string one, so I think that it's not the
>>>>> u'' thing (it's true that I forgot it in my console testing, though).
>>>>> What you showed about IPython explains why it didn't give me any result
>>>>> when running in IPython with the unicode object - since it wasn't really a
>>>>> unicode object.
>>>>>
>>>>> So again - I *am* querying SA with a unicode object, and still, it
>>>>> fails the first time and works the second time.
>>>>>
>>>>> Yo'av.
>>>>>
>>>>> 2009/10/7 Wolodja Wentland <wentl...@cl.uni-heidelberg.de>
>>>>>
>>>>>> On Wed, Oct 07, 2009 at 07:55 -0700, Yo'av Moshe wrote:
>>>>>> > See what I mean here (it's me running the same query twice in
>>>>>> > IPython): http://paste2.org/p/457059
>>>>>> >
>>>>>> > What can cause this behavior?! I can't think of anything! I guess that
>>>>>> > one of my users has uploaded some article with some invalid utf8 code,
>>>>>> > but should that kill the query? and how come it doesn't kill the
>>>>>> > second one? and what can I do to avoid it?
>>>>>>
>>>>>> In addition to the bug Mike pointed out to you I want to introduce you
>>>>>> to my favourite bug this year:
>>>>>>
>>>>>> https://bugs.launchpad.net/ipython/+bug/339642
>>>>>>
>>>>>> If you run into unicode issues with IPython it is wise to check the
>>>>>> behaviour in the plain 'python' interpreter before developing code
>>>>>> around this bug.
>>>>>>
>>>>>> kind regards
>>>>>>
>>>>>>    Wolodja Wentland
>>>>>>
>>>>>>


-- 
Yo'av Moshe

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---
