[ https://issues.apache.org/jira/browse/SPARK-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li resolved SPARK-20960.
-----------------------------
       Resolution: Fixed
         Assignee: Wenchen Fan
    Fix Version/s: 2.3.0

> make ColumnVector public
> ------------------------
>
>                 Key: SPARK-20960
>                 URL: https://issues.apache.org/jira/browse/SPARK-20960
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Wenchen Fan
>            Assignee: Wenchen Fan
>             Fix For: 2.3.0
>
>
> ColumnVector is an internal interface in Spark SQL. It is currently used only
> by the vectorized Parquet reader to represent the in-memory columnar format.
> In Spark 2.3 we want to make ColumnVector public, so that we can provide a
> more efficient way to exchange data between Spark and external systems. For
> example, we can use ColumnVector to build the columnar read API in the data
> source framework, to build a more efficient UDF API, etc.
> We also want to introduce a new ColumnVector implementation based on Apache
> Arrow (basically just a wrapper over Arrow), so that external systems (like
> Python Pandas DataFrame) can build ColumnVector very easily.