[ https://issues.apache.org/jira/browse/ARROW-6520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927496#comment-16927496 ]
Joris Van den Bossche commented on ARROW-6520:
----------------------------------------------

I can reproduce this on 0.14.1, but no longer on master, so it may already have been fixed. I will open a PR adding a test to check that it passes on CI as well (a sketch of such a regression test is included below).

> [Python] Segmentation fault on writing tables with fixed size binary fields
> ----------------------------------------------------------------------------
>
>                  Key: ARROW-6520
>                  URL: https://issues.apache.org/jira/browse/ARROW-6520
>              Project: Apache Arrow
>           Issue Type: Bug
>           Components: Python
>     Affects Versions: 0.14.1
>          Environment: python(3.7.3), pyarrow(0.14.1), arrow-cpp(0.14.1), parquet-cpp(1.5.1), Arch Linux x86_64
>             Reporter: Furkan Tektas
>             Priority: Critical
>               Labels: newbie
>              Fix For: 0.15.0
>
> I'm not sure whether this should be reported to Parquet or here.
> When I try to write a pyarrow table containing a fixed-size binary field (which holds 16-byte UUID4 values) to a Parquet file, a segmentation fault occurs.
> Here is a minimal example to reproduce:
> {code:python}
> import pyarrow as pa
> from pyarrow import parquet as pq
>
> data = {"col": pa.array([b"1234" for _ in range(10)])}
> fields = [("col", pa.binary(4))]
> schema = pa.schema(fields)
> table = pa.table(data, schema)
> pq.write_table(table, "test.parquet")
> {code}
> which crashes the interpreter:
> {noformat}
> segmentation fault (core dumped)  ipython
> {noformat}
> Yet it works if I don't specify the size of the binary field:
> {code:python}
> import pyarrow as pa
> from pyarrow import parquet as pq
>
> data = {"col": pa.array([b"1234" for _ in range(10)])}
> fields = [("col", pa.binary())]
> schema = pa.schema(fields)
> table = pa.table(data, schema)
> pq.write_table(table, "test.parquet")
> {code}
> Thanks,
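The comment above mentions adding a test to cover this case. A minimal sketch of what such a regression test could look like, assuming pytest and its tmp_path fixture; the test name and structure here are illustrative, not the actual test from the PR:

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq


def test_fixed_size_binary_roundtrip(tmp_path):
    # Build a table with an explicit fixed-size binary column (4 bytes per value).
    arr = pa.array([b"1234" for _ in range(10)], type=pa.binary(4))
    table = pa.table({"col": arr})

    # Writing this table segfaulted on 0.14.1; a successful write plus a
    # read-back equality check guards against regressions.
    path = str(tmp_path / "test.parquet")
    pq.write_table(table, path)
    result = pq.read_table(path)
    assert result.equals(table)
{code}

On 0.14.1 this would crash the process outright rather than fail an assertion, which is why a plain write-and-read roundtrip (rather than, say, pytest.raises) is the natural shape for the test.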