[ https://issues.apache.org/jira/browse/SPARK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000775#comment-17000775 ]
Zhenhua Wang edited comment on SPARK-25185 at 12/20/19 9:58 AM:
----------------------------------------------------------------
Hi, [~imamitsehgal] and [~raoyvn], could you please check whether [SPARK-30269|https://issues.apache.org/jira/browse/SPARK-30269] solves your problem?

was (Author: zhenhuawang):
Hi, [~imamitsehgal] and [~raoyvn], could you please check whether [SPARK-30269](https://issues.apache.org/jira/browse/SPARK-30269) solves your problem?

> CBO row-count statistics don't work for a partitioned Parquet external table
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-25185
>                 URL: https://issues.apache.org/jira/browse/SPARK-25185
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core, SQL
>    Affects Versions: 2.2.1, 2.3.0
>        Environment: Tried on Ubuntu, FreeBSD, and Windows, running spark-shell in local mode and reading data from the local file system
>            Reporter: Amit
>            Priority: Major
>
> Created dummy partitioned data with a string partition column (col1=a and col1=b): added CSV data, read it through Spark, created a partitioned external table, and ran MSCK REPAIR TABLE to load the partitions. Ran ANALYZE on all columns, including the partition column.
>
> {code}
> // e is the partition column
> println(spark.sql("select * from test_p where e='1a'").queryExecution.toStringWithStats)
> val op = spark.sql("select * from test_p where e='1a'").queryExecution.optimizedPlan
> val stat = op.stats(spark.sessionState.conf)
> print(stat.rowCount)
> {code}
>
> Creating the same table in Parquet: the row count comes up correctly for CSV, but for Parquet it shows as None.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
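The reported steps can be sketched end-to-end in spark-shell. This is a minimal sketch, not the reporter's exact script: the table name {{test_p}} and partition column {{e}} follow the report, while the sample rows, the {{/tmp/test_p}} location, and the CBO setting are assumptions. It requires a running Spark session (2.2.x/2.3.x, where {{stats}} still takes the conf argument as in the report).

{code}
// Assumed setup: spark-shell with a SparkSession bound to `spark`.
import spark.implicits._

// CBO must be enabled for row-count estimates to be used (assumption: default is off).
spark.sql("SET spark.sql.cbo.enabled=true")

// Build a small partitioned Parquet dataset; path /tmp/test_p is hypothetical.
Seq(("x", "1a"), ("y", "1b")).toDF("c", "e")
  .write.partitionBy("e").parquet("/tmp/test_p")

// Create the external table over it and load the partitions, as in the report.
spark.sql("""
  CREATE EXTERNAL TABLE test_p (c STRING)
  PARTITIONED BY (e STRING)
  STORED AS PARQUET
  LOCATION '/tmp/test_p'
""")
spark.sql("MSCK REPAIR TABLE test_p")
spark.sql("ANALYZE TABLE test_p COMPUTE STATISTICS FOR COLUMNS c, e")

// Inspect the optimizer's row-count estimate for a partition filter.
val op = spark.sql("SELECT * FROM test_p WHERE e = '1a'").queryExecution.optimizedPlan
println(op.stats(spark.sessionState.conf).rowCount)
// Per the report: a CSV table yields a row count here, the Parquet table yields None.
{code}

Repeating the same steps with {{STORED AS TEXTFILE}}/CSV data is what the reporter used as the working baseline for comparison.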