[ 
https://issues.apache.org/jira/browse/SPARK-47773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Jia updated SPARK-47773:
---------------------------
    Description: This SPIP proposes integrating Gluten's physical plan 
conversion, validation, and fallback framework into Apache Spark. The goal is 
to make Spark more flexible and robust in executing physical plans and to 
leverage Gluten's performance optimizations. Spark currently has no official 
support for executing its physical plans across platforms. Gluten's mechanism, 
which uses the Substrait standard, can convert and optimize Spark's physical 
plans, improving portability, interoperability, and execution efficiency.

The design introduces the TransformSupport interface and its specialized 
variants, LeafTransformSupport, UnaryTransformSupport, and 
BinaryTransformSupport, which streamline the conversion of the different 
operator types into a common Substrait-based format; a sketch of such a 
hierarchy is shown below.
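
For illustration only, here is a minimal Scala sketch of what such a trait 
hierarchy could look like. Only the trait names (TransformSupport, 
LeafTransformSupport, UnaryTransformSupport, BinaryTransformSupport) come from 
this SPIP; the method names (doTransform, transformChildren), the SubstraitRel 
stand-in, and the example transformers are hypothetical placeholders, not 
Gluten's actual API.

{code:scala}
// Simplified stand-in for a serialized Substrait relation (placeholder type).
final case class SubstraitRel(op: String, inputs: Seq[SubstraitRel] = Nil)

// Base contract: a physical operator that can emit a Substrait relation.
trait TransformSupport {
  /** Children that must also be transformable for this node to be offloaded. */
  def transformChildren: Seq[TransformSupport]
  /** Convert this operator (and its children) into a Substrait relation. */
  def doTransform(): SubstraitRel
}

// Specialized variants mirroring operator arity, as named in the SPIP.
trait LeafTransformSupport extends TransformSupport {
  final def transformChildren: Seq[TransformSupport] = Nil
}

trait UnaryTransformSupport extends TransformSupport {
  def child: TransformSupport
  final def transformChildren: Seq[TransformSupport] = Seq(child)
}

trait BinaryTransformSupport extends TransformSupport {
  def left: TransformSupport
  def right: TransformSupport
  final def transformChildren: Seq[TransformSupport] = Seq(left, right)
}

// Example: a scan and a filter expressed through the hierarchy.
final case class ScanTransformer(table: String) extends LeafTransformSupport {
  def doTransform(): SubstraitRel = SubstraitRel(s"read:$table")
}

final case class FilterTransformer(condition: String, child: TransformSupport)
    extends UnaryTransformSupport {
  def doTransform(): SubstraitRel =
    SubstraitRel(s"filter:$condition", Seq(child.doTransform()))
}
{code}
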
The validation phase assesses the Substrait plan against the native backend to 
ensure compatibility. Where validation does not succeed, Spark falls back to 
its native operators, inserting the transformations required to adapt data 
formats (for example, between columnar and row layouts), as sketched below.
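
The following sketch, building on the types above, shows how a 
validate-then-fallback step could be structured. ValidationResult, 
NativeBackend, NativeNode, SparkNode, and ColumnarToRow are illustrative 
placeholders, not actual Spark or Gluten classes.

{code:scala}
// Placeholder plan representation for the sketch.
sealed trait PlanNode
final case class NativeNode(rel: SubstraitRel) extends PlanNode             // offloaded to the native backend
final case class SparkNode(name: String, children: Seq[PlanNode]) extends PlanNode // vanilla Spark operator
final case class ColumnarToRow(child: PlanNode) extends PlanNode            // data-format transition

final case class ValidationResult(ok: Boolean, reason: Option[String] = None)

trait NativeBackend {
  /** Ask the backend whether it can execute this Substrait relation. */
  def validate(rel: SubstraitRel): ValidationResult
}

object FallbackRule {
  /** Offload the operator when the backend accepts its Substrait form;
   *  otherwise keep the Spark operator and adapt any columnar (native)
   *  children back to rows before feeding them in. */
  def apply(backend: NativeBackend,
            candidate: TransformSupport,
            sparkOperator: Seq[PlanNode] => SparkNode,
            children: Seq[PlanNode]): PlanNode = {
    val rel = candidate.doTransform()
    if (backend.validate(rel).ok) {
      NativeNode(rel)
    } else {
      // Fallback: insert a format transition in front of each native child.
      val adapted = children.map {
        case n: NativeNode => ColumnarToRow(n)
        case other         => other
      }
      sparkOperator(adapted)
    }
  }
}
{code}
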
The proposal treats the plan transformation phase as the foundational step; 
the validation and fallback procedures will be taken up once that initial 
phase is established.
The integration of Gluten into Spark has already shown significant performance 
improvements with the ClickHouse and Velox backends and has been successfully 
deployed in production by several customers.

> Enhancing the Flexibility of Spark's Physical Plan to Enable Execution on 
> Various Native Engines
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-47773
>                 URL: https://issues.apache.org/jira/browse/SPARK-47773
>             Project: Spark
>          Issue Type: Epic
>          Components: SQL
>    Affects Versions: 3.5.1
>            Reporter: Ke Jia
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
