[ https://issues.apache.org/jira/browse/SPARK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000776#comment-17000776 ]
Zhenhua Wang commented on SPARK-25185:
--------------------------------------

[~lishuming] Yes, you can run ANALYZE on an external table.

> CBO rowcount statistics doesn't work for partitioned parquet external table
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-25185
>                 URL: https://issues.apache.org/jira/browse/SPARK-25185
>             Project: Spark
>          Issue Type: Sub-task
>      Components: Spark Core, SQL
>    Affects Versions: 2.2.1, 2.3.0
>        Environment: Tried on Ubuntu, FreeBSD, and Windows, running spark-shell in local mode, reading data from the local file system
>            Reporter: Amit
>            Priority: Major
>
> Created dummy partitioned data with a string-typed partition column (col1=a and col1=b).
> Added CSV data -> read it through Spark -> created a partitioned external table -> ran MSCK REPAIR TABLE to load the partitions. Ran ANALYZE on all columns, including the partition column.
> ~println(spark.sql("select * from test_p where e='1a'").queryExecution.toStringWithStats)~
> ~val op = spark.sql("select * from test_p where e='1a'").queryExecution.optimizedPlan~
> // e is the partition column
> ~val stat = op.stats(spark.sessionState.conf)~
> ~print(stat.rowCount)~
>
> Created the table the same way with Parquet: the row count comes up correctly for CSV, but for Parquet it shows as None.
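For anyone trying to reproduce this, below is a minimal sketch of the Parquet path, assuming a Hive-enabled spark-shell with CBO turned on. The table name test_p and partition column e come from the snippet in the description; the location /tmp/test_p and the sample data are hypothetical stand-ins.

{code:scala}
// Minimal reproduction sketch, assuming spark-shell with Hive support
// and CBO enabled. Path /tmp/test_p is hypothetical; test_p and e
// follow the snippet in the issue description.
spark.sql("SET spark.sql.cbo.enabled=true")

// Write partitioned Parquet data to an external location.
spark.range(10)
  .selectExpr("id", "CASE WHEN id < 5 THEN '1a' ELSE '2b' END AS e")
  .write.partitionBy("e").parquet("/tmp/test_p")

// Create an external table over that location and load its partitions.
spark.sql("""CREATE EXTERNAL TABLE test_p (id BIGINT)
             PARTITIONED BY (e STRING)
             STORED AS PARQUET
             LOCATION '/tmp/test_p'""")
spark.sql("MSCK REPAIR TABLE test_p")

// ANALYZE works on external tables too (per the comment above); the
// reporter analyzed all columns including the partition column.
spark.sql("ANALYZE TABLE test_p COMPUTE STATISTICS")
spark.sql("ANALYZE TABLE test_p COMPUTE STATISTICS FOR COLUMNS id, e")

// Inspect the optimized plan's row-count estimate. On the 2.2.x branch
// stats takes the conf, as in the description; on newer branches it is
// just op.stats.
val op = spark.sql("SELECT * FROM test_p WHERE e = '1a'").queryExecution.optimizedPlan
println(op.stats(spark.sessionState.conf).rowCount)  // None for Parquet per the report
{code}

Running the same steps with CSV instead of Parquet should print Some(...), matching the behavior described in the report.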