[ https://issues.apache.org/jira/browse/BEAM-5313?focusedWorklogId=189146&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-189146 ]

ASF GitHub Bot logged work on BEAM-5313:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Jan/19 19:13
            Start Date: 23/Jan/19 19:13
    Worklog Time Spent: 10m
      Work Description: RobbeSneyders commented on pull request #7583: [BEAM-5313] Python 3 port examples package
URL: https://github.com/apache/beam/pull/7583#discussion_r250334435

 ##########
 File path: sdks/python/apache_beam/examples/complete/tfidf_test.py
 ##########
 @@ -85,8 +85,8 @@ def test_basics(self):
       with open_shards(os.path.join(
           temp_folder, 'result-*-of-*')) as result_file:
         for line in result_file:
 -          match = re.search(EXPECTED_LINE_RE, line)
 -          logging.info('Result line: %s', line)
 +          match = re.search(EXPECTED_LINE_RE, line.decode('utf-8'))

 Review comment:
   I can't find any occurrence where it's used with binary data, so we could do it.
   But we would probably still want to expose an 'encoding' parameter so we can
   add tests with different encodings in the future.

   We are also encoding the data in the same file when writing:
   https://github.com/apache/beam/blob/11f5efa2bbdf79ac86d321932d6637575904c089/sdks/python/apache_beam/examples/complete/tfidf_test.py#L49-L52

   So I like how it's explicit. It is currently the responsibility of the tests
   to encode/decode correctly.

   What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 189146)
    Time Spent: 1h 20m  (was: 1h 10m)

> Finish Python 3 porting for examples module
> -------------------------------------------
>
>                 Key: BEAM-5313
>                 URL: https://issues.apache.org/jira/browse/BEAM-5313
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Robbe
>            Assignee: Robbe
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
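
Editorial illustration of the review comment above: a minimal sketch of what an optional 'encoding' parameter on a shard-reading helper could look like. The helper name `read_shard_lines`, the regular expression `LINE_RE`, and the sample data are hypothetical and are not taken from the Beam code base or from `open_shards`. The sketch only shows why `re.search` with a text pattern needs decoded lines on Python 3, and how an explicit parameter keeps the encode/decode responsibility visible to the test.

```python
import glob
import os
import re
import tempfile

# Hypothetical text pattern, standing in for EXPECTED_LINE_RE in the test.
LINE_RE = r'(\w+) (\d+)'


def read_shard_lines(pattern, encoding=None):
    """Yield lines from every file matching `pattern` (hypothetical helper).

    With encoding=None the lines are yielded as bytes; otherwise each
    line is decoded with the given codec before being yielded.
    """
    for path in sorted(glob.glob(pattern)):
        with open(path, 'rb') as shard:
            for line in shard:
                yield line if encoding is None else line.decode(encoding)


with tempfile.TemporaryDirectory() as temp_folder:
    # The writer encodes explicitly, mirroring what the review comment
    # says the test already does when it creates its input.
    shard_path = os.path.join(temp_folder, 'result-00000-of-00001')
    with open(shard_path, 'wb') as f:
        f.write('h\u00e9llo 1\n'.encode('utf-8'))

    pattern = os.path.join(temp_folder, 'result-*-of-*')
    raw_lines = list(read_shard_lines(pattern))
    text_lines = list(read_shard_lines(pattern, encoding='utf-8'))

# A str pattern matches the decoded line.
text_match = re.search(LINE_RE, text_lines[0])

# On Python 3 the same str pattern cannot be applied to bytes.
try:
    re.search(LINE_RE, raw_lines[0])
    bytes_search_failed = False
except TypeError:
    bytes_search_failed = True
```

Defaulting `encoding` to None keeps any binary callers working unchanged, while tests that need text opt in explicitly; whether that default suits Beam's own helper is a question for the pull request discussion.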