[ 
https://issues.apache.org/jira/browse/SPARK-42774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Kornfield updated SPARK-42774:
------------------------------------
    Priority: Minor  (was: Major)

> Expose VectorTypes API for DataSourceV2 Batch Scans
> ---------------------------------------------------
>
>                 Key: SPARK-42774
>                 URL: https://issues.apache.org/jira/browse/SPARK-42774
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.2
>            Reporter: Micah Kornfield
>            Priority: Minor
>
> SparkPlan's vectorTypes attribute can be used to [specialize 
> codegen|https://github.com/apache/spark/blob/5556cfc59aa97a3ad4ea0baacebe19859ec0bcb7/sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala#L151];
>  however, 
> [BatchScanExecBase|https://github.com/apache/spark/blob/6b6bb6fa20f40aeedea2fb87008e9cce76c54e28/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExecBase.scala]
>  does not override it, so DSv2 sources do not get any benefit of concrete 
> class dispatch.
> This proposes adding an override to BatchScanExecBase which delegates to a 
> new default method on 
> [PartitionReaderFactory|https://github.com/apache/spark/blob/f1d42bb68d6d69d9a32f91a390270f9ec33c3207/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/PartitionReaderFactory.java]
>  to expose vectorTypes:
> {{
> default Optional<Iterable<String>> getVectorTypes() {
>     return Optional.empty();
> }
> }}
>  
>  
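A minimal, self-contained Java sketch of the delegation proposed above. The interface here is a simplified stand-in for Spark's PartitionReaderFactory, not the real one; DefaultFactory, ArrowBackedFactory, and the static vectorTypes helper are hypothetical names used only for illustration. In Spark itself the override would live on the scan node and would presumably have to convert the Java Optional into the Scala Option[Seq[String]] that SparkPlan.vectorTypes returns.

```java
import java.util.Collections;
import java.util.Optional;

public class VectorTypesSketch {

    // Simplified stand-in for Spark's PartitionReaderFactory (assumption: the
    // real interface also declares reader-creation methods, omitted here).
    interface PartitionReaderFactory {
        // Proposed default: sources that do not report vector classes keep
        // today's behavior, because the scan sees an empty Optional.
        default Optional<Iterable<String>> getVectorTypes() {
            return Optional.empty();
        }
    }

    // Hypothetical source that does not override the new method.
    static final class DefaultFactory implements PartitionReaderFactory {}

    // Hypothetical columnar source that reports one concrete class name
    // per output column.
    static final class ArrowBackedFactory implements PartitionReaderFactory {
        private final int numColumns;

        ArrowBackedFactory(int numColumns) {
            this.numColumns = numColumns;
        }

        @Override
        public Optional<Iterable<String>> getVectorTypes() {
            Iterable<String> types = Collections.nCopies(
                numColumns, "org.apache.spark.sql.vectorized.ArrowColumnVector");
            return Optional.of(types);
        }
    }

    // Stand-in for the override on the scan node: forward whatever the
    // factory reports, so codegen can specialize when it is present.
    static Optional<Iterable<String>> vectorTypes(PartitionReaderFactory factory) {
        return factory.getVectorTypes();
    }

    public static void main(String[] args) {
        System.out.println("default factory reports types: "
            + vectorTypes(new DefaultFactory()).isPresent());

        Optional<Iterable<String>> reported = vectorTypes(new ArrowBackedFactory(2));
        if (reported.isPresent()) {
            for (String className : reported.get()) {
                System.out.println("column vector class: " + className);
            }
        }
    }
}
```

Because the method is a default, existing PartitionReaderFactory implementations would compile unchanged; only sources that know their concrete ColumnVector classes need to override it.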



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
