[ 
https://issues.apache.org/jira/browse/AIRFLOW-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303076#comment-15303076
 ] 

Chris Riccomini commented on AIRFLOW-179:
-----------------------------------------

Derp, just saw https://github.com/apache/incubator-airflow/pull/1550

> DbApiHook string serialization fails when string contains non-ASCII characters
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-179
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-179
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hooks
>            Reporter: John Bodley
>            Assignee: John Bodley
>
> The DbApiHook.insert_rows(...) method tries to serialize all values to
> strings using the ASCII codec. This is problematic if the cell contains
> non-ASCII characters, e.g.
>     >>> from airflow.hooks import DbApiHook
>     >>> DbApiHook._serialize_cell('Nguyễn Tấn Dũng')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>       File "/usr/local/lib/python2.7/dist-packages/airflow/hooks/dbapi_hook.py", line 196, in _serialize_cell
>         return "'" + str(cell).replace("'", "''") + "'"
>       File "/usr/local/lib/python2.7/dist-packages/future/types/newstr.py", line 102, in __new__
>         return super(newstr, cls).__new__(cls, value)
>     UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 4: ordinal not in range(128)
> Rather than manually serializing and escaping values to an ASCII string,
> one should serialize each value using the character set of the target
> database, leveraging the connection to convert the object to a SQL string
> literal.
> Additionally, the escaping logic for single quotes (') within the
> _serialize_cell method seems wrong, i.e.
>     str(cell).replace("'", "''")
> would escape the string "you're" to "'you''re'" as opposed to "'you\'re'".
> Note an exception should still be thrown if the target encoding is not 
> compatible with the source encoding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
