Joris Van den Bossche created ARROW-8860:
--------------------------------------------

             Summary: [C++] Compressed Feather file with struct array roundtrips incorrectly
                 Key: ARROW-8860
                 URL: https://issues.apache.org/jira/browse/ARROW-8860
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Joris Van den Bossche


When a table with a struct-typed column is written to a Feather file with compression (which is the default), the column is read back with garbage values:

{code:python}
>>> import pyarrow as pa
>>> from pyarrow import feather
>>> table = pa.table({'col': pa.StructArray.from_arrays([[0,1,2], [1,2,3]], names=["f1", "f2"])})
>>> table.column("col")
<pyarrow.lib.ChunkedArray object at 0x7f0b0c4d7458>
[
  -- is_valid: all not null
  -- child 0 type: int64
    [
      0,
      1,
      2
    ]
  -- child 1 type: int64
    [
      1,
      2,
      3
    ]
]

# roundtrip through feather
>>> feather.write_feather(table, "test_struct.feather")
>>> table2 = feather.read_table("test_struct.feather")
>>> table2.column("col")
<pyarrow.lib.ChunkedArray object at 0x7f0b0c4d7728>
[
  -- is_valid: all not null
  -- child 0 type: int64
    [
      24,
      1261641627085906436,
      1369095386551025664
    ]
  -- child 1 type: int64
    [
      24,
      1405756815161762308,
      281479842103296
    ]
]
{code}

When the file is written without compression, the column is read back correctly:

{code:python}
>>> feather.write_feather(table, "test_struct.feather", compression="uncompressed")
>>> table2 = feather.read_table("test_struct.feather")
>>> table2.column("col")
<pyarrow.lib.ChunkedArray object at 0x7f0b0e466778>
[
  -- is_valid: all not null
  -- child 0 type: int64
    [
      0,
      1,
      2
    ]
  -- child 1 type: int64
    [
      1,
      2,
      3
    ]
]
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)