[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586612#comment-16586612 ]
Bryan Cutler commented on SPARK-23874: -------------------------------------- For the Python fixes, yes the user would have to upgrade pyarrow. In Spark, there is no upgrade for pyarrow, just bumping up the minimum version which determines what version is installed with pyspark, which is still currently 0.8.0. Since we upgraded the Java library, those fixes are included. > Upgrade apache/arrow to 0.10.0 > ------------------------------ > > Key: SPARK-23874 > URL: https://issues.apache.org/jira/browse/SPARK-23874 > Project: Spark > Issue Type: Improvement > Components: PySpark > Affects Versions: 2.3.0 > Reporter: Xiao Li > Assignee: Bryan Cutler > Priority: Major > Fix For: 2.4.0 > > > Version 0.10.0 will allow for the following improvements and bug fixes: > * Allow for adding BinaryType support ARROW-2141 > * Bug fix related to array serialization ARROW-1973 > * Python2 str will be made into an Arrow string instead of bytes ARROW-2101 > * Python bytearrays are supported in as input to pyarrow ARROW-2141 > * Java has common interface for reset to cleanup complex vectors in Spark > ArrowWriter ARROW-1962 > * Cleanup pyarrow type equality checks ARROW-2423 > * ArrowStreamWriter should not hold references to ArrowBlocks ARROW-2632, > ARROW-2645 > * Improved low level handling of messages for RecordBatch ARROW-2704 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org