[ https://issues.apache.org/jira/browse/AIRFLOW-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremiah Lowin resolved AIRFLOW-179.
------------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: Airflow 1.8)

Closed in https://github.com/apache/incubator-airflow/pull/1550

> DbApiHook string serialization fails when string contains non-ASCII characters
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-179
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-179
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hooks
>            Reporter: John Bodley
>            Assignee: John Bodley
>
> The DbApiHook.insert_rows(...) method tries to serialize all values to
> strings using the ASCII codec. This is problematic if a cell contains
> non-ASCII characters, e.g.
> >>> from airflow.hooks import DbApiHook
> >>> DbApiHook._serialize_cell('Nguyễn Tấn Dũng')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", line 196, in _serialize_cell
>     return "'" + str(cell).replace("'", "''") + "'"
>   File "/usr/local/lib/python2.7/dist-packages/future/types/newstr.py", line 102, in __new__
>     return super(newstr, cls).__new__(cls, value)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 4: ordinal not in range(128)
> Rather than manually serializing and escaping values to an ASCII string,
> one should serialize each value using the character set of the
> corresponding target database, using the connection to convert the
> object to a SQL string literal.
> Additionally, the escaping logic for single quotes (') within the
> _serialize_cell method seems wrong, i.e.
>     str(cell).replace("'", "''")
> would escape the string "you're" to be "'you''re'" as opposed to "'you\'re'".
> Note that an exception should still be thrown if the target encoding is not
> compatible with the source encoding.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
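
For readers of the archive, the suggestion in the description above (let the connection produce the SQL literal rather than escaping by hand) can be illustrated with DB-API parameter binding. This is only a minimal sketch under stated assumptions, not the change made in the linked pull request: it uses the standard-library sqlite3 driver as a stand-in for the target database, and insert_rows_bound and the people table are hypothetical names chosen for illustration. The "?" placeholder is sqlite3's style; other drivers may use a different one such as "%s".

```python
import sqlite3


def insert_rows_bound(conn, table, rows):
    """Hypothetical helper: insert rows via driver-side parameter binding.

    The values are never turned into SQL text by hand, so non-ASCII
    characters and embedded single quotes need no manual escaping.
    """
    cur = conn.cursor()
    for row in rows:
        # One placeholder per value; "?" is the sqlite3 (qmark) style.
        placeholders = ", ".join("?" for _ in row)
        # Only the table name is interpolated; it must come from trusted code.
        sql = "INSERT INTO {} VALUES ({})".format(table, placeholders)
        cur.execute(sql, tuple(row))
    conn.commit()


# In-memory database used purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, note TEXT)")
insert_rows_bound(conn, "people", [("Nguyễn Tấn Dũng", "you're")])
stored = conn.execute("SELECT name, note FROM people").fetchall()
```

With binding, the driver is responsible for encoding and quoting, so both the non-ASCII name and the apostrophe in "you're" should round-trip unchanged.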