[ https://issues.apache.org/jira/browse/SPARK-33954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257159#comment-17257159 ]
Apache Spark commented on SPARK-33954:
--------------------------------------

User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/30987

> Some operator missing rowCount when enable CBO
> ----------------------------------------------
>
>                 Key: SPARK-33954
>                 URL: https://issues.apache.org/jira/browse/SPARK-33954
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> Some operators are missing rowCount in their statistics when CBO is enabled, for example:
> {code:scala}
> spark.range(1000).selectExpr("id as a", "id as b").write.saveAsTable("t1")
> spark.sql("ANALYZE TABLE t1 COMPUTE STATISTICS FOR ALL COLUMNS")
> spark.sql("set spark.sql.cbo.enabled=true")
> spark.sql("set spark.sql.cbo.planStats.enabled=true")
> spark.sql("select * from (select * from t1 distribute by a limit 100) distribute by b").explain("cost")
> {code}
> Current:
> {noformat}
> == Optimized Logical Plan ==
> RepartitionByExpression [b#2129L], Statistics(sizeInBytes=2.3 KiB)
> +- GlobalLimit 100, Statistics(sizeInBytes=2.3 KiB, rowCount=100)
>    +- LocalLimit 100, Statistics(sizeInBytes=23.4 KiB)
>       +- RepartitionByExpression [a#2128L], Statistics(sizeInBytes=23.4 KiB)
>          +- Relation[a#2128L,b#2129L] parquet, Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
> {noformat}
> Expected:
> {noformat}
> == Optimized Logical Plan ==
> RepartitionByExpression [b#2129L], Statistics(sizeInBytes=2.3 KiB, rowCount=100)
> +- GlobalLimit 100, Statistics(sizeInBytes=2.3 KiB, rowCount=100)
>    +- LocalLimit 100, Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
>       +- RepartitionByExpression [a#2128L], Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
>          +- Relation[a#2128L,b#2129L] parquet, Statistics(sizeInBytes=23.4 KiB, rowCount=1.00E+3)
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
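
The reproduction quoted in the issue can also be checked node by node rather than by reading the explain("cost") output. The following is only a sketch, not part of the issue or the pull request: it assumes a local SparkSession, the object name RowCountCheck and the "<missing>" placeholder are made up for illustration, and mode("overwrite") is added so the script can be rerun. It walks the optimized logical plan and prints each operator's rowCount, so operators that report no row count stand out.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper object; the name is not from the issue.
object RowCountCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is acceptable for the check.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SPARK-33954-rowcount-check")
      .getOrCreate()

    // Same setup as the reproduction in the issue; overwrite mode is an
    // addition so that repeated runs do not fail on an existing table.
    spark.range(1000).selectExpr("id as a", "id as b")
      .write.mode("overwrite").saveAsTable("t1")
    spark.sql("ANALYZE TABLE t1 COMPUTE STATISTICS FOR ALL COLUMNS")
    spark.conf.set("spark.sql.cbo.enabled", "true")
    spark.conf.set("spark.sql.cbo.planStats.enabled", "true")

    val df = spark.sql(
      "select * from (select * from t1 distribute by a limit 100) distribute by b")

    // Visit every operator of the optimized plan and print its estimated
    // row count, or a placeholder when the estimate is absent.
    df.queryExecution.optimizedPlan.foreach { node =>
      val rows = node.stats.rowCount.map(_.toString).getOrElse("<missing>")
      println(s"${node.nodeName}: rowCount=$rows")
    }

    spark.stop()
  }
}
```

Per the "Current" plan in the issue, an affected build would be expected to show the placeholder for the RepartitionByExpression and LocalLimit operators; the exact output depends on the Spark version in use.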