[ https://issues.apache.org/jira/browse/SPARK-47773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ke Jia updated SPARK-47773:
---------------------------
Description:

This SPIP proposes integrating Gluten's physical plan conversion, validation, and fallback framework into Apache Spark. The goal is to make Spark more flexible and robust in executing physical plans and to take advantage of Gluten's performance optimizations.

Spark currently has no official cross-platform execution support for physical plans. Gluten's mechanism, which is built on the Substrait standard, converts and optimizes Spark's physical plans, improving portability, interoperability, and execution efficiency.

The design introduces the TransformSupport interface and its specialized variants: LeafTransformSupport, UnaryTransformSupport, and BinaryTransformSupport. These streamline the conversion of different operator types into a common Substrait-based format.

In the validation phase, the Substrait plan is checked against the native backend to ensure compatibility. If validation fails, Spark's own operators are used instead, with the transformations needed to adapt the data format between the two execution paths.

The proposal treats plan transformation as the foundational step. Validation and fallback will be considered once that first phase is in place.

Gluten's integration with Spark has already shown significant performance improvements with the ClickHouse and Velox backends, and it has been deployed in production by several customers.
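The conversion, validation, and fallback flow described above can be pictured with a small, self-contained Scala sketch. It is only an illustration under stated assumptions: the trait names come from the description, but every member, the `PlanNode`, `Rel`, `Scan`, `Filter`, `Join`, and `PythonUdf` types, and the `PlanConversion` helper are hypothetical stand-ins rather than Spark's or Gluten's actual API. A real Substrait plan is a protobuf message, not a case class. For brevity the sketch falls back for the whole plan at once and does not model the data format transitions.

```scala
// Hypothetical stand-in for a physical plan operator (not Spark's SparkPlan).
trait PlanNode {
  def name: String
  def children: Seq[PlanNode]
}

// Toy "Substrait-like" relation; an assumption made for illustration only.
final case class Rel(op: String, inputs: Seq[Rel])

// Operators that know how to convert themselves into the common format.
trait TransformSupport extends PlanNode {
  def toRel(childRels: Seq[Rel]): Rel = Rel(name, childRels)
}

// Variants named in the description, distinguished here by child count.
trait LeafTransformSupport extends TransformSupport {
  final def children: Seq[PlanNode] = Nil
}
trait UnaryTransformSupport extends TransformSupport {
  def child: PlanNode
  final def children: Seq[PlanNode] = Seq(child)
}
trait BinaryTransformSupport extends TransformSupport {
  def left: PlanNode
  def right: PlanNode
  final def children: Seq[PlanNode] = Seq(left, right)
}

// Example operators (hypothetical).
final case class Scan(table: String) extends LeafTransformSupport { val name = "scan" }
final case class Filter(child: PlanNode) extends UnaryTransformSupport { val name = "filter" }
final case class Join(left: PlanNode, right: PlanNode) extends BinaryTransformSupport { val name = "join" }

// An operator with no conversion support at all.
final case class PythonUdf(child: PlanNode) extends PlanNode {
  val name = "python_udf"
  def children: Seq[PlanNode] = Seq(child)
}

sealed trait Decision
final case class Offload(rel: Rel) extends Decision        // run on the native backend
final case class Fallback(plan: PlanNode) extends Decision // run on Spark's own operators

object PlanConversion {
  // Phase 1: convert the tree; None if any operator lacks TransformSupport.
  def convert(node: PlanNode): Option[Rel] = node match {
    case t: TransformSupport =>
      val childRels = t.children.map(convert)
      if (childRels.forall(_.isDefined)) Some(t.toRel(childRels.flatten)) else None
    case _ => None
  }

  // Phase 2: the backend is modeled as the set of operator names it accepts.
  def validate(rel: Rel, supportedOps: Set[String]): Boolean =
    supportedOps.contains(rel.op) && rel.inputs.forall(validate(_, supportedOps))

  // Phase 3: offload only when both conversion and validation succeed.
  def decide(plan: PlanNode, supportedOps: Set[String]): Decision =
    convert(plan).filter(validate(_, supportedOps)) match {
      case Some(rel) => Offload(rel)
      case None      => Fallback(plan)
    }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val plan = Join(Filter(Scan("a")), Scan("b"))
    println(PlanConversion.decide(plan, Set("scan", "filter", "join")))
    println(PlanConversion.decide(plan, Set("scan", "filter")))
  }
}
```

In this sketch the first call in `Demo` yields an `Offload` and the second a `Fallback`, because the modeled backend does not accept `join`. Keeping conversion separate from validation mirrors the phasing in the proposal, where plan transformation is established first and validation and fallback follow.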
> Enhancing the Flexibility of Spark's Physical Plan to Enable Execution on Various Native Engines
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-47773
>                 URL: https://issues.apache.org/jira/browse/SPARK-47773
>             Project: Spark
>          Issue Type: Epic
>          Components: SQL
>    Affects Versions: 3.5.1
>            Reporter: Ke Jia
>            Priority: Major