[ https://issues.apache.org/jira/browse/ARROW-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897133#comment-16897133 ]

Antoine Pitrou commented on ARROW-6038:
---------------------------------------

Ok, the issue here is that you are creating a Table column whose chunks have 
different types. The second array is inferred to have type "null". Arrow 
should prevent you from doing that instead of crashing.

However, comparing types can be a bit expensive (if e.g. they are nested 
types). [~wesmckinn] what do you think?
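
For reference, a minimal sketch of the type-inference mismatch being described (hypothetical data, not the attached segfault_ex.py): an empty Python list gives pyarrow nothing to infer a type from, so the resulting array comes out as type "null" rather than matching the other batches.

{code:python}
import pyarrow as pa

# Non-empty data: the element type is inferred as int64.
full = pa.array([1, 2, 3])

# Empty list: nothing to infer from, so the type is "null".
empty = pa.array([])

print(full.type)   # int64
print(empty.type)  # null

# On affected versions, Table.from_batches() accepted batches whose
# columns disagreed like this, producing a corrupted table. As a
# workaround, the caller can force the type explicitly:
empty_ok = pa.array([], type=pa.int64())
print(empty_ok.type)  # int64
{code}

Validating that all chunks share the column's type would catch this at Table-construction time, at the cost of the type comparisons mentioned above.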

> [Python] pyarrow.Table.from_batches produces corrupted table if any of the 
> batches were empty
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-6038
>                 URL: https://issues.apache.org/jira/browse/ARROW-6038
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 0.13.0, 0.14.0, 0.14.1
>            Reporter: Piotr Bajger
>            Priority: Minor
>              Labels: windows
>         Attachments: segfault_ex.py
>
>
> When creating a Table from a list/iterator of batches that contains an 
> "empty" RecordBatch, a Table is produced, but running any pyarrow built-in 
> function on it (such as unique()) occasionally results in a segfault.
> The MWE is attached: [^segfault_ex.py]
>  # The segfaults happen randomly, around 30% of the time.
>  # Commenting out line 10 in the MWE eliminates the segfaults.
>  # The segfault is triggered via the unique() function, but the behaviour is 
> unlikely to be specific to that function; the problem appears to lie in 
> Table creation.
> I'm on Windows 10, using Python 3.6 and pyarrow 0.14.0 installed through pip 
> (problem also occurs with 0.13.0 from conda-forge).



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
