[ https://issues.apache.org/jira/browse/BEAM-5953?focusedWorklogId=188046&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-188046 ]
ASF GitHub Bot logged work on BEAM-5953:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jan/19 08:49
            Start Date: 22/Jan/19 08:49
    Worklog Time Spent: 10m
      Work Description: robertwb commented on pull request #7521: [BEAM-5953] Fix py3 type error in bundle_processor
URL: https://github.com/apache/beam/pull/7521#discussion_r248998087

 ##########
 File path: sdks/python/apache_beam/runners/worker/operation_specs.py
 ##########
 @@ -354,7 +354,10 @@ def get_coder_from_spec(coder_spec):
    # We pass coders in the form "<coder_name>$<pickled_data>" to make the job
    # description JSON more readable.
 -  return coders.coders.deserialize_coder(coder_spec['@type'])
 +  coder = coder_spec['@type']
 +  if not isinstance(coder, bytes):

 Review comment:
   Here coder_spec is a cloud object dictionary, typically parsed from JSON, hence the unicode. We can unconditionally encode this for deserialization, but it's quite possible that utf-8 would not be the "right" encoding for pickle in this case.

   Due to issues with passing arbitrary bytes through cloud protos, we actually base64-encode our serialized data in internal.pickler.loads/dumps, including here. As such, it should be safe to encode this with 'ascii', which would throw an error if there happen to be any higher code points. There should not be any, but if some creep in, it would be better to have an explicit error here than a harder-to-decipher one later.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 188046)
    Time Spent: 3h 50m  (was: 3h 40m)

> Support DataflowRunner on Python 3
> ----------------------------------
>
>                 Key: BEAM-5953
>                 URL: https://issues.apache.org/jira/browse/BEAM-5953
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Mark Liu
>            Assignee: Mark Liu
>            Priority: Major
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
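
Illustrative sketch for the review comment on pull request #7521: the snippet below shows, under stated assumptions, why encoding a base64 payload with 'ascii' is safe and why it fails early on unexpected characters. It uses only the Python standard library. The helpers dumps, loads and to_ascii_bytes are hypothetical stand-ins written for this example; they are not Beam's internal.pickler functions, and the isinstance guard is kept only so the helper also accepts values that are already bytes.

```python
import base64
import pickle


def dumps(obj):
    """Stand-in serializer (assumption): pickle, then base64, so output is ASCII-only bytes."""
    return base64.b64encode(pickle.dumps(obj))


def loads(encoded):
    """Stand-in deserializer (assumption): reverse of dumps, expects bytes."""
    return pickle.loads(base64.b64decode(encoded))


def to_ascii_bytes(value):
    """Return value as bytes, raising UnicodeEncodeError on non-ASCII text."""
    if isinstance(value, bytes):
        return value
    # The 'ascii' codec rejects any code point above 127 instead of
    # silently producing multi-byte sequences as 'utf-8' would.
    return value.encode('ascii')


# Simulate a spec that was parsed from JSON: the payload arrives as text.
coder_spec = {'@type': dumps({'name': 'example'}).decode('ascii')}
restored = loads(to_ascii_bytes(coder_spec['@type']))
```

Because base64 output only ever contains ASCII characters, the strict codec never rejects a well-formed payload. A stray non-ASCII character fails at the encode step with a clear UnicodeEncodeError rather than surfacing later as an opaque unpickling failure.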