[ https://issues.apache.org/jira/browse/AIRFLOW-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adam Wentz updated AIRFLOW-1558:
--------------------------------
    Description:
When running under python3 the S3FileTransformOperator fails with the following error:

{noformat}
[2017-09-01 18:44:54,440] {models.py:1427} ERROR - write() argument must be str, not bytes
[2017-09-01 18:44:54,443] {base_task_runner.py:95} INFO - Subtask: Traceback (most recent call last):
[2017-09-01 18:44:54,444] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1384, in run
[2017-09-01 18:44:54,445] {base_task_runner.py:95} INFO - Subtask: result = task_copy.execute(context=context)
[2017-09-01 18:44:54,445] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/site-packages/airflow/operators/s3_file_transform_operator.py", line 87, in execute
[2017-09-01 18:44:54,446] {base_task_runner.py:95} INFO - Subtask: source_s3_key_object.get_contents_to_file(f_source)
[2017-09-01 18:44:54,446] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1662, in get_contents_to_file
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: response_headers=response_headers)
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1494, in get_file
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: query_args=None)
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1548, in _get_file_internal
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: fp.write(bytes)
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/tempfile.py", line 483, in func_wrapper
[2017-09-01 18:44:54,449] {base_task_runner.py:95} INFO - Subtask: return func(*args, **kwargs)
[2017-09-01 18:44:54,449] {base_task_runner.py:95} INFO - Subtask: TypeError: write() argument must be str, not bytes
[2017-09-01 18:44:54,450] {base_task_runner.py:95} INFO - Subtask: [2017-09-01 18:44:54,443] {models.py:1451} INFO - Marking task as FAILED.
{noformat}

The solution is to open the `NamedTemporaryFile`s with mode `wb` rather than `w`. I have an incoming PR for this.

  was:
When running under python3 the S3FileTransformOperator fails with the following error:

```
[2017-09-01 18:44:54,440] {models.py:1427} ERROR - write() argument must be str, not bytes
[2017-09-01 18:44:54,443] {base_task_runner.py:95} INFO - Subtask: Traceback (most recent call last):
[2017-09-01 18:44:54,444] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1384, in run
[2017-09-01 18:44:54,445] {base_task_runner.py:95} INFO - Subtask: result = task_copy.execute(context=context)
[2017-09-01 18:44:54,445] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/site-packages/airflow/operators/s3_file_transform_operator.py", line 87, in execute
[2017-09-01 18:44:54,446] {base_task_runner.py:95} INFO - Subtask: source_s3_key_object.get_contents_to_file(f_source)
[2017-09-01 18:44:54,446] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1662, in get_contents_to_file
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: response_headers=response_headers)
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1494, in get_file
[2017-09-01 18:44:54,447] {base_task_runner.py:95} INFO - Subtask: query_args=None)
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto/s3/key.py", line 1548, in _get_file_internal
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: fp.write(bytes)
[2017-09-01 18:44:54,448] {base_task_runner.py:95} INFO - Subtask: File "/usr/local/lib/python3.6/tempfile.py", line 483, in func_wrapper
[2017-09-01 18:44:54,449] {base_task_runner.py:95} INFO - Subtask: return func(*args, **kwargs)
[2017-09-01 18:44:54,449] {base_task_runner.py:95} INFO - Subtask: TypeError: write() argument must be str, not bytes
[2017-09-01 18:44:54,450] {base_task_runner.py:95} INFO - Subtask: [2017-09-01 18:44:54,443] {models.py:1451} INFO - Marking task as FAILED.
```

The solution is to open the `NamedTemporaryFile`s with mode `wb` rather than `w`. I have an incoming PR for this.


> S3FileTransformOperator fails in Python 3 due to file mode
> ----------------------------------------------------------
>
>                 Key: AIRFLOW-1558
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1558
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: operators
>    Affects Versions: Airflow 1.8
>         Environment: python3
>            Reporter: Adam Wentz
>            Priority: Minor
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
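
The failure and the proposed fix can be reproduced outside Airflow with a small sketch. It is not the operator's code or the incoming PR: `PAYLOAD` and `try_write` are hypothetical names, and the payload only stands in for the bytes that boto passes to `fp.write()`. Under Python 3, a `NamedTemporaryFile` opened with mode `w` is a text file and rejects bytes, while mode `wb` accepts them.

```python
import tempfile

# Hypothetical stand-in for the bytes boto hands to fp.write().
PAYLOAD = b"id,value\n1,42\n"


def try_write(mode):
    """Write PAYLOAD to a NamedTemporaryFile opened with `mode`.

    Returns the number of bytes written, or the TypeError message
    if the file object refuses bytes (text mode under Python 3).
    """
    with tempfile.NamedTemporaryFile(mode) as tmp:
        try:
            tmp.write(PAYLOAD)
            tmp.flush()
            return tmp.tell()
        except TypeError as exc:
            return str(exc)


text_result = try_write("w")     # text mode: the write is rejected
binary_result = try_write("wb")  # binary mode: the write succeeds

print("mode 'w' :", text_result)
print("mode 'wb':", binary_result)
```

Opening the temporary files in binary mode should be safe under Python 2 as well, since `str` and `bytes` are the same type there.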