Maximilian Roos created AIRFLOW-3921:
----------------------------------------

             Summary: Logging bytes fails in Python 2
                 Key: AIRFLOW-3921
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3921
             Project: Apache Airflow
          Issue Type: Bug
          Components: utils
    Affects Versions: 1.10.2
            Reporter: Maximilian Roos


We just upgraded to 1.10.2. Thanks for the cadence of releases.

 

We've hit one small but critical issue, though: when we log a Python 2 string 
(i.e. bytes) that contains non-ASCII characters, Airflow raises an error.

This is because Airflow uses a `\n` literal that is a unicode string (not bytes) here: 
[https://github.com/apache/airflow/blob/master/airflow/utils/log/logging_mixin.py#L102], 
because `from __future__ import unicode_literals` is placed here: 
[https://github.com/apache/airflow/blob/master/airflow/utils/log/logging_mixin.py#L23]

(I think this is why, and the repro below supports that, but I'm frequently 
hitting unicode issues, so please correct me if I'm mistaken)

 

The issue can be reproduced as follows:

{code:python}
# non-ascii character
In [16]: print(u"\u00E9")
é

# non-ascii encoded into bytes
In [11]: u"\u00E9aoeu".encode('utf-8')
Out[11]: '\xc3\xa9aoeu'
# works fine when compared with `b"\n"`
In [18]: u"\u00E9aoeu".encode('utf-8').endswith(b"\n")
Out[18]: False
# fails when compared with `u"\n"`
In [15]: '\xc3\xa9aoeu'.endswith(u"\n")
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-15-93bd1ca7fa67> in <module>()
----> 1 '\xc3\xa9aoeu'.endswith(u"\n")

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

{code}
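
The failure above comes from calling `endswith` on bytes with a unicode argument, which makes Python 2 try an implicit ASCII decode. A minimal sketch of a type-matched check follows. `ends_with_newline` is a hypothetical helper used only for illustration; it is not the code in `logging_mixin.py`, and whether the actual fix should look like this is an open question.

```python
def ends_with_newline(message):
    """Return True if message ends with a newline, for bytes or text."""
    # Pick a newline literal of the same type as the message, so that
    # bytes are never implicitly decoded (Python 2) or mixed with text
    # (Python 3, where that raises TypeError instead).
    newline = b"\n" if isinstance(message, bytes) else u"\n"
    return message.endswith(newline)


# Hypothetical sample data: non-ASCII text encoded as UTF-8 bytes.
raw = u"\u00E9aoeu".encode("utf-8")

print(ends_with_newline(raw))              # False
print(ends_with_newline(raw + b"\n"))      # True
print(ends_with_newline(u"\u00E9aoeu\n"))  # True
```

Because `b"\n"` stays a byte string even under `unicode_literals`, this works without removing the `__future__` import.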
 

I'm not sure there's any workaround short of something as drastic as removing 
the `from __future__ import unicode_literals`, or changing all our logging to 
emit unicode (which would break lots of other processes in Python 2). Is there 
any temporary workaround?
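
For reference, the "emit unicode" option could in principle be confined to the individual logging call sites rather than applied everywhere. The sketch below assumes the bytes are UTF-8; `to_text` is a hypothetical helper name, not an Airflow function, and this may not be practical for every codebase.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        # errors="replace" avoids raising on malformed input while logging.
        return value.decode(encoding, errors="replace")
    return value


# Hypothetical sample data: non-ASCII text encoded as UTF-8 bytes.
raw = u"\u00E9aoeu".encode("utf-8")

# Once decoded, comparing against a unicode newline is safe.
print(to_text(raw).endswith(u"\n"))  # False
```

Only the value handed to the logger is decoded here; the original bytes stay untouched for any other consumers.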

 

Thanks



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
