[ https://issues.apache.org/jira/browse/SPARK-48660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856122#comment-17856122 ]

Wei Guo commented on SPARK-48660:
---------------------------------

I am working on this, and thank you for the recommendation, [~yangjie01].

> The result of explain is incorrect for CreateTableAsSelect
> ----------------------------------------------------------
>
>                 Key: SPARK-48660
>                 URL: https://issues.apache.org/jira/browse/SPARK-48660
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.5.0, 3.5.1, 4.0.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> How to reproduce:
> {code:sql}
> CREATE TABLE order_history_version_audit_rno (
>   eventid STRING,
>   id STRING,
>   referenceid STRING,
>   type STRING,
>   referencetype STRING,
>   sellerid BIGINT,
>   buyerid BIGINT,
>   producerid STRING,
>   versionid INT,
>   changedocuments ARRAY<STRUCT<clientId: STRING, type: STRING, timestamp: BIGINT, changeDetails: STRING>>,
>   dt STRING,
>   hr STRING)
> USING parquet
> PARTITIONED BY (dt, hr);
> explain cost
> CREATE TABLE order_history_version_audit_rno
> USING parquet
> PARTITIONED BY (dt)
> CLUSTERED BY (id) INTO 1000 buckets
> AS SELECT * FROM order_history_version_audit_rno
> WHERE dt >= '2023-11-29';
> {code}
> {noformat}
> spark-sql (default)> explain cost
>                    > CREATE TABLE order_history_version_audit_rno
>                    > USING parquet
>                    > PARTITIONED BY (dt)
>                    > CLUSTERED BY (id) INTO 1000 buckets
>                    > AS SELECT * FROM order_history_version_audit_rno
>                    > WHERE dt >= '2023-11-29';
> == Optimized Logical Plan ==
> CreateDataSourceTableAsSelectCommand `spark_catalog`.`default`.`order_history_version_audit_rno`, ErrorIfExists, [eventid, id, referenceid, type, referencetype, sellerid, buyerid, producerid, versionid, changedocuments, hr, dt]
>    +- Project [eventid#5, id#6, referenceid#7, type#8, referencetype#9, sellerid#10L, buyerid#11L, producerid#12, versionid#13, changedocuments#14, hr#16, dt#15]
>       +- Project [eventid#5, id#6, referenceid#7, type#8, referencetype#9, sellerid#10L, buyerid#11L, producerid#12, versionid#13, changedocuments#14, dt#15, hr#16]
>          +- Filter (dt#15 >= 2023-11-29)
>             +- SubqueryAlias spark_catalog.default.order_history_version_audit_rno
>                +- Relation spark_catalog.default.order_history_version_audit_rno[eventid#5,id#6,referenceid#7,type#8,referencetype#9,sellerid#10L,buyerid#11L,producerid#12,versionid#13,changedocuments#14,dt#15,hr#16] parquet
> == Physical Plan ==
> Execute CreateDataSourceTableAsSelectCommand
>    +- CreateDataSourceTableAsSelectCommand `spark_catalog`.`default`.`order_history_version_audit_rno`, ErrorIfExists, [eventid, id, referenceid, type, referencetype, sellerid, buyerid, producerid, versionid, changedocuments, hr, dt]
>          +- Project [eventid#5, id#6, referenceid#7, type#8, referencetype#9, sellerid#10L, buyerid#11L, producerid#12, versionid#13, changedocuments#14, hr#16, dt#15]
>             +- Project [eventid#5, id#6, referenceid#7, type#8, referencetype#9, sellerid#10L, buyerid#11L, producerid#12, versionid#13, changedocuments#14, dt#15, hr#16]
>                +- Filter (dt#15 >= 2023-11-29)
>                   +- SubqueryAlias spark_catalog.default.order_history_version_audit_rno
>                      +- Relation spark_catalog.default.order_history_version_audit_rno[eventid#5,id#6,referenceid#7,type#8,referencetype#9,sellerid#10L,buyerid#11L,producerid#12,versionid#13,changedocuments#14,dt#15,hr#16] parquet
> {noformat}
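
What looks wrong in the output above: although the statement was run with {{explain cost}}, the plan printed under "Optimized Logical Plan" carries no Statistics annotations, and it still contains a SubqueryAlias and two back-to-back Project nodes, which the optimizer normally eliminates. This suggests the child plan shown for the CTAS command is the analyzed plan rather than the optimized one. A minimal cross-check in spark-shell, assuming the table from the repro above already exists ({{df}} is just an illustrative name):
{code:scala}
// Optimize the inner SELECT on its own and compare its plan with the one
// embedded in the EXPLAIN COST output of the CTAS statement above.
val df = spark.sql(
  "SELECT * FROM order_history_version_audit_rno WHERE dt >= '2023-11-29'")

// A genuinely optimized plan has no SubqueryAlias and no duplicate Project.
println(df.queryExecution.optimizedPlan.treeString)

// The optimized plan also exposes the statistics that `explain cost`
// is expected to print alongside the Optimized Logical Plan.
println(df.queryExecution.optimizedPlan.stats)
{code}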


