[ https://issues.apache.org/jira/browse/SPARK-34137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280219#comment-17280219 ]
Apache Spark commented on SPARK-34137:
--------------------------------------

User 'AngersZhuuuu' has created a pull request for this issue:
https://github.com/apache/spark/pull/31485

> The tree string does not contain statistics for nested scalar sub queries
> -------------------------------------------------------------------------
>
>                 Key: SPARK-34137
>                 URL: https://issues.apache.org/jira/browse/SPARK-34137
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> How to reproduce:
> {code:scala}
> spark.sql("create table t1 using parquet as select id as a, id as b from range(1000)")
> spark.sql("create table t2 using parquet as select id as c, id as d from range(2000)")
> spark.sql("ANALYZE TABLE t1 COMPUTE STATISTICS FOR ALL COLUMNS")
> spark.sql("ANALYZE TABLE t2 COMPUTE STATISTICS FOR ALL COLUMNS")
> spark.sql("set spark.sql.cbo.enabled=true")
> spark.sql(
>   """
>     |WITH max_store_sales AS
>     |  (SELECT max(csales) tpcds_cmax
>     |  FROM (SELECT
>     |    sum(b) csales
>     |  FROM t1 WHERE a < 100 ) x),
>     |best_ss_customer AS
>     |  (SELECT
>     |    c
>     |  FROM t2
>     |  WHERE d > (SELECT * FROM max_store_sales))
>     |
>     |SELECT c FROM best_ss_customer
>     |""".stripMargin).explain("cost")
> {code}
> Output:
> {noformat}
> == Optimized Logical Plan ==
> Project [c#4263L], Statistics(sizeInBytes=31.3 KiB, rowCount=2.00E+3)
> +- Filter (isnotnull(d#4264L) AND (d#4264L > scalar-subquery#4262 [])), Statistics(sizeInBytes=46.9 KiB, rowCount=2.00E+3)
>    :  +- Aggregate [max(csales#4260L) AS tpcds_cmax#4261L]
>    :     +- Aggregate [sum(b#4266L) AS csales#4260L]
>    :        +- Project [b#4266L]
>    :           +- Filter ((a#4265L < 100) AND isnotnull(a#4265L))
>    :              +- Relation default.t1[a#4265L,b#4266L] parquet, Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
>    +- Relation default.t2[c#4263L,d#4264L] parquet, Statistics(sizeInBytes=46.9 KiB, rowCount=2.00E+3)
> {noformat}
> Another case is TPC-DS q23a.
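In the plan output quoted above, the nodes under the scalar subquery (the two Aggregate nodes, the Project and the inner Filter) carry no Statistics(...) suffix. A minimal sketch of how such nodes could be picked out of a tree string follows. It uses plain string handling only; the object MissingStats and the method nodesWithoutStats are hypothetical names for illustration, not part of Spark, and the sketch assumes each plan node occupies a single unwrapped line.

```scala
// Hypothetical helper, not a Spark API: scans the text printed by
// explain("cost") and returns the plan nodes that have no Statistics(...).
object MissingStats {
  // Leading tree-drawing characters: spaces and ':' followed by "+- " or ":- ".
  private val treePrefix = """^[\s:]*(?:[+:]- )?""".r

  def nodesWithoutStats(treeString: String): Seq[String] =
    treeString.split("\n").toSeq
      // Skip blank lines and section headers such as "== Optimized Logical Plan ==".
      .filter(line => line.trim.nonEmpty && !line.trim.startsWith("=="))
      // Keep only nodes whose line lacks a statistics suffix.
      .filterNot(_.contains("Statistics("))
      // Drop the tree-drawing prefix so only the node description remains.
      .map(line => treePrefix.replaceFirstIn(line, ""))
}
```

Applied to the optimized plan in the report, this would list the four nodes between the outer Filter and the t1 Relation, which is the gap the issue title describes.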