[jira] [Commented] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

Leif Walsh (JIRA) Mon, 24 Jul 2017 07:55:15 -0700

    [ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098524#comment-16098524 ]


Leif Walsh commented on SPARK-21187:
------------------------------------

Also, in case you're unfamiliar: {{object}} columns are rather slow in pandas, 
because doing anything with them means going through the Python interpreter.  
It's generally better, when possible, to make sure your columns have primitive 
dtypes so that you can use vectorized operations on them.  For that reason, 
modeling a struct as a hierarchical index would probably be much faster to 
consume.
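
To illustrate the difference, here is a minimal sketch. The struct layout 
(a "point" with fields "x" and "y") and all variable names are made up for 
the example, and reading "hierarchical index" as MultiIndex columns is my 
assumption about one way this could be modeled, not a description of what 
the Arrow converters do.

```python
import pandas as pd

# Hypothetical struct column "point" with an int field "x" and a float field "y".
rows = [{"x": 1, "y": 0.5}, {"x": 2, "y": 1.5}, {"x": 3, "y": 2.5}]

# Option 1: keep each struct as a Python dict in a single column.
# The column dtype is object, so reading a field runs Python code per row.
as_object = pd.DataFrame({"point": rows})
x_from_object = as_object["point"].apply(lambda d: d["x"])

# Option 2: flatten the struct into hierarchical (MultiIndex) columns.
# Each field keeps a primitive dtype, so operations on it stay vectorized.
fields = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})
as_hier = pd.concat({"point": fields}, axis=1)
x_from_hier = as_hier[("point", "x")]

print(as_object.dtypes)   # "point" is reported as object
print(as_hier.dtypes)     # ("point", "x") is integer, ("point", "y") is float
print(x_from_hier.sum())  # vectorized sum over a primitive column
```

Selecting {{as_hier["point"]}} still gives back the whole struct as a 
sub-frame, so the grouping is not lost by flattening.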

> Complete support for remaining Spark data types in Arrow Converters
> -------------------------------------------------------------------
>
>                 Key: SPARK-21187
>                 URL: https://issues.apache.org/jira/browse/SPARK-21187
>             Project: Spark
>          Issue Type: Umbrella
>          Components: PySpark, SQL
>    Affects Versions: 2.3.0
>            Reporter: Bryan Cutler
>
> This is to track adding the remaining type support in Arrow Converters.  
> Currently, only primitive data types are supported.
> Remaining types:
> * *Date*
> * *Timestamp*
> * *Complex*: Struct, Array, Map
> * *Decimal*



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
