Re: [sqlalchemy] SQLAlchemy: UnicodeEncodeError: 'ascii' codec can't encode characters (db engine encoding ignored?)

2020-05-06 Thread Simon King
It might help to display the stack trace when the encoding fails, so
we can see exactly where the error is coming from.

Simon

On Wed, May 6, 2020 at 9:01 AM Anno Nühm  wrote:
>
> I am currently engaged in evaluating SQLAlchemy for a new project. When 
> trying to execute queries containing non-ascii characters an exception is 
> raised.
>
> The SQL statement used for carrying out the evaluation:
>
> SELECT owner, table_name FROM all_tables  WHERE owner LIKE 'äöüßÄÖÜœ';
>
>
> Executing this statement in SQL*Plus, SQL Developer results--as expected--in 
> an empty list.
>
> In order to connect to an Oracle database the following code is being used:
>
> from sqlalchemy import create_engine, MetaData, Table, inspect, select
> import pandas as pd
> import keyring
>
> dbtype = 'Oracle'
> dbenv = 'LOCAL'
> dbname = 'MYDB'
> dbsys = '%s%s' % (dbtype, dbenv)
> dbusr = 'myusr'
> dbpwd = keyring.get_password(dbsys, dbusr)
> dbhost = 'mydbhost'
> dbport = 1521
> dbconstr = 'oracle+cx_oracle://%s:%s@%s:%s/%s' % (dbusr, dbpwd, dbhost, 
> dbport, dbname)
>
>
> To evaluate the database engine encoding:
>
> dbencs = ['UTF8', 'UTF-8', 'utf8', 'utf-8', 'latin1', 'ascii', None]
>
> for dbenc in dbencs:
> if dbenc is None:
> engine = create_engine(dbconstr)
> else:
> engine = create_engine(dbconstr, encoding=dbenc)
> con = engine.connect()
>
> try:
> df = pd.read_sql_query(u'SELECT owner, table_name FROM all_tables 
>  WHERE owner LIKE \'äöüßÄÖÜœ\'', con)
> print('SUCCESS: sql query with db encoding %s succeeded!' % dbenc)
> except Exception as e:
> print('ERROR: sql query with db encoding %s failed (%s)' % 
> (dbenc, e))
>
> con.close()
> engine.dispose()
>
>
> Regardless of the encoding specified when creating the db engine, every 
> single attempt to executed the query raises an exception
>
> ERROR: sql query with db encoding UTF8 failed ('ascii' codec can't encode 
> characters in position 60-66: ordinal not in range(128))
> ERROR: sql query with db encoding UTF-8 failed ('ascii' codec can't 
> encode characters in position 60-66: ordinal not in range(128))
> ERROR: sql query with db encoding utf8 failed ('ascii' codec can't encode 
> characters in position 60-66: ordinal not in range(128))
> ERROR: sql query with db encoding utf-8 failed ('ascii' codec can't 
> encode characters in position 60-66: ordinal not in range(128))
> ERROR: sql query with db encoding latin1 failed ('ascii' codec can't 
> encode characters in position 60-66: ordinal not in range(128))
> ERROR: sql query with db encoding ascii failed ('ascii' codec can't 
> encode characters in position 60-66: ordinal not in range(128))
> ERROR: sql query with db encoding None failed ('ascii' codec can't encode 
> characters in position 60-66: ordinal not in range(128))
>
>
> When connecting to the database directly with cx_Oracle (without SQLAlchemy)
>
> import cx_Oracle
> import pandas as pd
> import keyring
>
> dbtype = 'Oracle'
> dbenv = 'LOCAL'
> dbname = 'MYDB'
> dbsys = '%s%s' % (dbtype, dbenv)
> dbusr = 'myusr'
> dbpwd = keyring.get_password(dbsys, dbusr)
> dbhost = 'mydbhost'
> dbport = 1521
> dbconstr = '%s:%s/%s' % (dbhost, dbport, dbname)
>
>
> dbencs = ['UTF8', 'UTF-8', 'utf8', 'utf-8', 'latin1', 'ascii', None]
>
> for dbenc in dbencs:
> print('=' * 70)
> print('db encoding: %s' % dbenc)
> print('-' * 30)
>
> if dbenc is None:
> connection = cx_Oracle.connect(dbusr, dbpwd, dbconstr)
> else:
> connection = cx_Oracle.connect(dbusr, dbpwd, dbconstr, 
> encoding=dbenc)
> cursor = connection.cursor()
>
> try:
> r = cursor.execute("SELECT owner, table_name FROM all_tables  
> WHERE owner LIKE 'äöüßÄÖÜœ'")
> recs = list()
> for owner, table_name in cursor:
> recs.append({'owner': owner, 'table': table_name})
> df = pd.DataFrame(recs)
> print('SUCCESS: sql query with db encoding %s succeeded!' % dbenc)
> except Exception as e:
> print('ERROR: sql query with db encoding %s failed (%s)' % 
> (dbenc, e))
>
> cursor.close()
> connection.close()
>
>
> everything works as expected.
>
> SUCCESS: sql query with db encoding UTF8 succeeded!
> SUCCESS: sql query with db encoding UTF-8 succeeded!
> SUCCESS: sql query with db encoding utf8 succeeded!
> SUCCESS: sql query with db encoding utf-8 succeeded!
> SUCCESS: sql query with db encoding latin1 succeeded!
> ERROR: sql query with db encoding ascii failed ('ascii' codec can't 
> encode characters in position 60-66: ordinal not in range(128))
> ERROR: sql query with db encoding None failed ('ascii' codec can't encode 
> characters in position 

[sqlalchemy] SQLAlchemy: UnicodeEncodeError: 'ascii' codec can't encode characters (db engine encoding ignored?)

2020-05-06 Thread Anno Nühm
I am currently engaged in evaluating SQLAlchemy for a new project. When 
trying to execute queries containing non-ascii characters an exception is 
raised.

The SQL statement used for carrying out the evaluation:

SELECT owner, table_name FROM all_tables  WHERE owner LIKE 'äöüßÄÖÜœ';


Executing this statement in SQL*Plus, SQL Developer results--as 
expected--in an empty list.

In order to connect to an Oracle database the following code is being used:

from sqlalchemy import create_engine, MetaData, Table, inspect, select
import pandas as pd
import keyring

dbtype = 'Oracle'
dbenv = 'LOCAL'
dbname = 'MYDB'
dbsys = '%s%s' % (dbtype, dbenv)
dbusr = 'myusr'
dbpwd = keyring.get_password(dbsys, dbusr)
dbhost = 'mydbhost'
dbport = 1521
dbconstr = 'oracle+cx_oracle://%s:%s@%s:%s/%s' % (dbusr, dbpwd, dbhost, 
dbport, dbname)


To evaluate the database engine encoding: 

dbencs = ['UTF8', 'UTF-8', 'utf8', 'utf-8', 'latin1', 'ascii', None]

for dbenc in dbencs:
if dbenc is None:
engine = create_engine(dbconstr)
else:
engine = create_engine(dbconstr, encoding=dbenc)
con = engine.connect()

try:
df = pd.read_sql_query(u'SELECT owner, table_name FROM 
all_tables  WHERE owner LIKE \'äöüßÄÖÜœ\'', con)
print('SUCCESS: sql query with db encoding %s succeeded!' % 
dbenc)
except Exception as e:
print('ERROR: sql query with db encoding %s failed (%s)' % (
dbenc, e))

con.close()
engine.dispose()


Regardless of the encoding specified when creating the db engine, every 
single attempt to executed the query raises an exception

ERROR: sql query with db encoding UTF8 failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))
ERROR: sql query with db encoding UTF-8 failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))
ERROR: sql query with db encoding utf8 failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))
ERROR: sql query with db encoding utf-8 failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))
ERROR: sql query with db encoding latin1 failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))
ERROR: sql query with db encoding ascii failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))
ERROR: sql query with db encoding None failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))


When connecting to the database directly with cx_Oracle (without SQLAlchemy)

import cx_Oracle
import pandas as pd
import keyring

dbtype = 'Oracle'
dbenv = 'LOCAL'
dbname = 'MYDB'
dbsys = '%s%s' % (dbtype, dbenv)
dbusr = 'myusr'
dbpwd = keyring.get_password(dbsys, dbusr)
dbhost = 'mydbhost'
dbport = 1521
dbconstr = '%s:%s/%s' % (dbhost, dbport, dbname)


dbencs = ['UTF8', 'UTF-8', 'utf8', 'utf-8', 'latin1', 'ascii', None]

for dbenc in dbencs:
print('=' * 70)
print('db encoding: %s' % dbenc)
print('-' * 30)

if dbenc is None:
connection = cx_Oracle.connect(dbusr, dbpwd, dbconstr)
else:
connection = cx_Oracle.connect(dbusr, dbpwd, dbconstr, encoding=
dbenc)
cursor = connection.cursor()

try:
r = cursor.execute("SELECT owner, table_name FROM all_tables 
 WHERE owner LIKE 'äöüßÄÖÜœ'")
recs = list()
for owner, table_name in cursor:
recs.append({'owner': owner, 'table': table_name})
df = pd.DataFrame(recs)
print('SUCCESS: sql query with db encoding %s succeeded!' % 
dbenc)
except Exception as e:
print('ERROR: sql query with db encoding %s failed (%s)' % (
dbenc, e))

cursor.close()
connection.close()


everything works as expected.

SUCCESS: sql query with db encoding UTF8 succeeded!
SUCCESS: sql query with db encoding UTF-8 succeeded!
SUCCESS: sql query with db encoding utf8 succeeded!
SUCCESS: sql query with db encoding utf-8 succeeded!
SUCCESS: sql query with db encoding latin1 succeeded!
ERROR: sql query with db encoding ascii failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))
ERROR: sql query with db encoding None failed ('ascii' codec can't 
encode characters in position 60-66: ordinal not in range(128))


What do I have to do differently in order to have SQLAlchemy approache 
rendering the identical results like the cx_Oracle one does?

My environment comprises of

   - Ubuntu linux 16.04LTS;
   - Python 3.8;
   - SQLAlchemy 1.3.16;
   - cx_Oracle 7.3.0;
   - psycopg2 2.8.5;
   - local Oracle 18c Instant client;
   - remote