[jira] [Commented] (AIRFLOW-3419) S3_hook.select_key is broken on Python3
[ https://issues.apache.org/jira/browse/AIRFLOW-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719217#comment-16719217 ]

Ash Berlin-Taylor commented on AIRFLOW-3419:

My memory is that Python2 doesn't have a distinction between bytes and str, so I don't see how this can affect Python2?

> S3_hook.select_key is broken on Python3
> ---
>
> Key: AIRFLOW-3419
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3419
> Project: Apache Airflow
> Issue Type: Bug
> Components: boto3, hooks
> Affects Versions: 1.10.1
> Reporter: Maria Rebelka
> Priority: Major
>
> Hello,
> Using select_key throws an error:
> {quote}text = S3Hook('aws_conn').select_key(key='my_key',
>     bucket_name='my_bucket',
>     expression='SELECT * FROM S3Object s',
>     expression_type='SQL',
>     input_serialization={'JSON': {'Type': 'DOCUMENT'}},
>     output_serialization={'JSON': {}}){quote}
> Traceback (most recent call last):
> {quote} File "db.py", line 31, in
>     output_serialization={'JSON': {}})
>   File "/usr/local/lib/python3.4/site-packages/airflow/hooks/S3_hook.py", line 262, in select_key
>     for event in response['Payload']
> TypeError: sequence item 0: expected str instance, bytes found{quote}
> Seems that the problem is in this line:
> S3_hook.py, line 262: return ''.join(event['Records']['Payload']
> which probably should be return ''.join(event['Records']['Payload'].decode('utf-8')
> From example in Amazon blog:
> https://aws.amazon.com/blogs/aws/s3-glacier-select/

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
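For reference, the TypeError in the quoted traceback can be reproduced on Python 3 without AWS access. The sketch below is not the Airflow source: `fake_response`, `join_without_decode` and `join_with_decode` are hypothetical names, and the response shape (an iterable of events under 'Payload', with bytes under event['Records']['Payload']) is an assumption taken from the lines quoted in the report.

```python
# Hypothetical stand-in for the select response; the shape is assumed from
# the quoted S3_hook.py lines, not copied from boto3 or Airflow.
fake_response = {
    'Payload': [
        {'Records': {'Payload': b'{"a": 1}\n'}},
        {'Stats': {}},
        {'Records': {'Payload': b'{"a": 2}\n'}},
        {'End': {}},
    ]
}


def join_without_decode(response):
    # Mirrors the reported line: joins bytes chunks with a str separator.
    return ''.join(event['Records']['Payload']
                   for event in response['Payload']
                   if 'Records' in event)


def join_with_decode(response):
    # Mirrors the reporter's suggested change: decode each chunk first.
    return ''.join(event['Records']['Payload'].decode('utf-8')
                   for event in response['Payload']
                   if 'Records' in event)


error_message = None
try:
    join_without_decode(fake_response)
except TypeError as exc:
    # On Python 3, str.join rejects bytes items.
    error_message = str(exc)

text = join_with_decode(fake_response)
print(error_message)
print(text)
```

On Python 2 the first function would not raise, because a plain string literal there is already a byte string.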
[jira] [Commented] (AIRFLOW-3419) S3_hook.select_key is broken on Python3
[ https://issues.apache.org/jira/browse/AIRFLOW-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719209#comment-16719209 ]

jack commented on AIRFLOW-3419:
---

I'm not sure this is the reason... the decode is the same for Python 3 and Python 2.7
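Both comments ask whether Python 2 is affected. One way to sidestep the question is to join the raw chunks as bytes and decode once at the end; that expression behaves the same on Python 2.7 (where it returns a unicode object) and on Python 3 (where it returns str). This is only a sketch, not the fix adopted in Airflow: `join_payload` is a hypothetical helper, and the idea that a multi-byte character could be split across two chunks is an assumption used to show why decoding once is the more defensive choice.

```python
def join_payload(chunks):
    """Join raw byte chunks first, then decode the whole buffer once."""
    return b''.join(chunks).decode('utf-8')


# 'é' is two bytes in UTF-8 (0xC3 0xA9). Splitting it across two chunks is
# a made-up scenario for illustration.
chunks = [b'caf\xc3', b'\xa9\n']

text = join_payload(chunks)

# Decoding chunk by chunk fails on this input, because the first chunk
# ends in the middle of a character.
try:
    ''.join(chunk.decode('utf-8') for chunk in chunks)
    per_chunk_ok = True
except UnicodeDecodeError:
    per_chunk_ok = False

print(repr(text), per_chunk_ok)
```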