[ https://issues.apache.org/jira/browse/BEAM-6748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787140#comment-16787140 ]
Valentyn Tymofieiev commented on BEAM-6748:
-------------------------------------------

The block size is hardcoded in the avro library and differs between Python 2 and Python 3:

[https://github.com/apache/avro/blob/f173ae8d690b5f90e8cc5899b654762a9d11e17d/lang/py/src/avro/datafile.py#L39]
[https://github.com/apache/avro/blob/f173ae8d690b5f90e8cc5899b654762a9d11e17d/lang/py3/avro/datafile.py#L57]

Something similar probably happens in the fastavro library.

> Splitting logic in Avro IO tests behaves unexpectedly in Python 3
> -----------------------------------------------------------------
>
>                 Key: BEAM-6748
>                 URL: https://issues.apache.org/jira/browse/BEAM-6748
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Valentyn Tymofieiev
>            Assignee: Valentyn Tymofieiev
>            Priority: Major
>
> *apache_beam.io.avroio_test.TestAvro.test_split_points*
> *apache_beam.io.avroio_test.TestFastAvro.test_split_points*
> fail with:
>
> {code:java}
> Traceback (most recent call last):
>  File "/home/robbe/workspace/beam/sdks/python/apache_beam/io/avroio_test.py", line 308, in test_split_points
>  self.assertEquals(split_points_report[-10:], [(2, 1)] * 10)
> AssertionError: Lists differ: [(10, 1), (10, 1), (10, 1), (10, 1), (10, 1[42 chars], 1)] != [(2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2[32 chars], 1)]
> First differing element 0:
> (10, 1)
> (2, 1)
> + [(2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1)]
> - [(10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1)]
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
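
To illustrate why a different hardcoded block size would change the split-point counts in the test, here is a minimal sketch of a simplified writer model. It assumes a block is flushed once its buffered bytes reach the sync interval. The function name `count_blocks`, the record sizes, and the two interval values are hypothetical and are not taken from the linked avro sources.

```python
def count_blocks(record_sizes, sync_interval):
    """Count blocks written by a simplified model of a block-based file writer.

    Assumption: a block is flushed as soon as the bytes buffered for it
    reach sync_interval; any leftover records form one final block.
    """
    blocks = 0
    buffered = 0
    for size in record_sizes:
        buffered += size
        if buffered >= sync_interval:
            blocks += 1
            buffered = 0
    if buffered > 0:
        blocks += 1  # trailing partial block
    return blocks


# Hypothetical input: 1600 records of 100 bytes each (160,000 bytes total).
records = [100] * 1600

# Hypothetical sync intervals, chosen only to show the effect.
blocks_small_interval = count_blocks(records, 16000)
blocks_large_interval = count_blocks(records, 64000)

print(blocks_small_interval, blocks_large_interval)
```

In this model, the same input yields more blocks when the interval is smaller. A test that asserts a fixed number of remaining split points therefore depends on the library's block size constant.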