On Thu, Sep 3, 2020 at 9:55 AM chat...@gmail.com <chatz...@gmail.com> wrote:
>
> I am trying to query all items from a MySQL (charset: utf8) table that has a
> field containing Chinese and other special characters, and I am getting the
> error below:
>
> items = session.query(Item).all()
>
>   File "/root/.local/share/virtualenvs/server-WesSANjA/lib/python3.8/site-packages/MySQLdb/cursors.py", line 355, in _post_get_result
>     self._rows = self._fetch_row(0)
>   File "/root/.local/share/virtualenvs/server-WesSANjA/lib/python3.8/site-packages/MySQLdb/cursors.py", line 328, in _fetch_row
>     return self._result.fetch_row(size, self._fetch_type)
>   File "/usr/local/lib/python3.8/encodings/cp1252.py", line 15, in decode
>     return codecs.charmap_decode(input,errors,decoding_table)
>
> UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 30: character maps to <undefined>
>

You mention utf8, but the error suggests that the data is being
decoded as cp1252. Are you declaring an explicit charset when you
create your engine, as suggested here:

    https://docs.sqlalchemy.org/en/13/dialects/mysql.html#unicode
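
As a rough illustration (not taken from your application), the sketch below first reproduces the same kind of failure in isolation: UTF-8 bytes containing 0x9d cannot be decoded as cp1252. It then builds a connection URL that carries an explicit charset, which is what the linked documentation describes. The helper name `mysql_url`, the credentials, the host, and the choice of `utf8mb4` are all placeholders and assumptions; adjust them to your setup.

```python
from urllib.parse import quote_plus

# U+201D (right double quotation mark) is E2 80 9D in UTF-8. Byte 0x9d has
# no mapping in Python's cp1252 codec, so decoding with it fails.
data = "\u201d".encode("utf-8")
try:
    data.decode("cp1252")
    error = None
except UnicodeDecodeError as exc:
    error = str(exc)
print(error)


def mysql_url(user, password, host, database, charset="utf8mb4"):
    """Hypothetical helper: build a MySQL URL with an explicit charset."""
    return (
        f"mysql+mysqldb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}?charset={charset}"
    )


# Placeholder credentials, for illustration only.
url = mysql_url("scott", "tiger", "localhost", "test")
print(url)

# The URL would then be handed to SQLAlchemy as usual:
#   from sqlalchemy import create_engine
#   engine = create_engine(url)
```

With the charset in the URL, the driver is told which encoding to use for the connection instead of falling back to a default.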

What does this output:

    from sqlalchemy import text
    for row in dbsession.execute(text("show variables like 'character%'")).fetchall():
        print(row)

Warning: I've run an application for a long time where I didn't
specify the charset in the connection string. SQLAlchemy defaulted to
encoding strings as utf8 (because the dialect didn't support unicode
strings). However, my output from the above command looked something
like this:

    ('character_set_client', 'latin1')
    ('character_set_connection', 'latin1')
    ('character_set_database', 'utf8')
    ('character_set_filesystem', 'binary')
    ('character_set_results', 'latin1')
    ('character_set_server', 'latin1')
    ('character_set_system', 'utf8')
    ('character_sets_dir', '/usr/share/mysql/charsets/')

This meant that SQLAlchemy was sending utf-8 strings, but the database
was interpreting them as latin1. To make things worse, some of my
tables have a default charset of latin1, and the others are utf8. The
result is that the tables that are declared to hold latin1 data
actually hold utf8, and the tables that are declared to hold utf8
actually hold double-encoded utf8.
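
To make "double-encoded" concrete, here is a small pure-Python sketch of what happens to a string in that situation, and how the damage can be reversed. It uses Python's `latin-1` codec as a stand-in for MySQL's latin1 (an assumption: MySQL's latin1 is closer to cp1252, so the real behaviour can differ for some bytes), and the function names are made up for illustration.

```python
def double_encode(value: str) -> bytes:
    """Simulate utf8 bytes being misread as latin1 and stored as utf8."""
    utf8_bytes = value.encode("utf-8")      # what the client actually sends
    misread = utf8_bytes.decode("latin-1")  # server treats each byte as a latin1 char
    return misread.encode("utf-8")          # server converts that to utf8 for storage


def repair(stored: bytes) -> str:
    """Undo one round of double encoding."""
    return stored.decode("utf-8").encode("latin-1").decode("utf-8")


original = "中文"
utf8_bytes = original.encode("utf-8")
stored = double_encode(original)

print(len(utf8_bytes), len(stored))  # the stored form is longer than plain utf8
print(repair(stored) == original)
```

Every byte of the original UTF-8 sequence is 0x80 or above here, so each one becomes a two-byte UTF-8 sequence when re-encoded, which is why the stored value doubles in size.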

Simon
