Daniel created SPARK-33326:
------------------------------

             Summary: Partition Parameters are not updated even after ANALYZE TABLE command
                 Key: SPARK-33326
                 URL: https://issues.apache.org/jira/browse/SPARK-33326
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.1
            Reporter: Daniel

Here are the reproduction steps:
{code:java}
scala> spark.sql("CREATE TABLE t (a string,b string) PARTITIONED BY (p string) STORED AS PARQUET")
Hive Session ID = d44e21ee-2d5c-48ab-91bf-26cb25775486
res0: org.apache.spark.sql.DataFrame = []

scala> spark.sql("INSERT INTO t PARTITION(p='p1') VALUES ('aaa', 'bbb')")
res1: org.apache.spark.sql.DataFrame = []

scala> spark.sql("INSERT INTO t PARTITION(p='p1') VALUES ('ccc', 'ddd')")
res2: org.apache.spark.sql.DataFrame = []

scala> spark.sql("ANALYZE TABLE t PARTITION(p='p1') COMPUTE STATISTICS")
res3: org.apache.spark.sql.DataFrame = []

scala> spark.sql("DESCRIBE FORMATTED t PARTITION (p='p1')").show(50, false)
...
|Partition Parameters |{rawDataSize=0, numFiles=1, numFilesErasureCoded=0, transient_lastDdlTime=1604404640, totalSize=532, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"a":"true","b":"true"}}, numRows=0}| |
...
|Partition Statistics |1064 bytes, 2 rows | |
...
{code}
I would expect ANALYZE TABLE to update the Partition Parameters as well. After the command they still show numRows=0, rawDataSize=0 and totalSize=532, while Partition Statistics reports 1064 bytes, 2 rows.
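
The mismatch in the DESCRIBE FORMATTED output above can also be checked mechanically. The following is a minimal sketch, not part of the original report: it parses the two strings as they are printed above and compares them. The object name `PartitionStatsCheck`, the case class `BasicStats`, and both helper functions are hypothetical names introduced for illustration, and the regular expressions assume the exact text layout shown in the reproduction.

```scala
object PartitionStatsCheck {
  // Basic statistics as (size in bytes, row count). Hypothetical helper type.
  final case class BasicStats(sizeInBytes: Long, rowCount: Long)

  // Patterns assume the text layout printed by DESCRIBE FORMATTED in the report.
  private val TotalSize = """\btotalSize=(\d+)""".r
  private val NumRows   = """\bnumRows=(\d+)""".r
  private val StatsLine = """(\d+) bytes, (\d+) rows""".r.unanchored

  // Reads totalSize and numRows from the "Partition Parameters" value.
  def fromParameters(params: String): Option[BasicStats] =
    for {
      size <- TotalSize.findFirstMatchIn(params).map(_.group(1).toLong)
      rows <- NumRows.findFirstMatchIn(params).map(_.group(1).toLong)
    } yield BasicStats(size, rows)

  // Reads size and row count from the "Partition Statistics" value.
  def fromStatistics(stats: String): Option[BasicStats] = stats match {
    case StatsLine(size, rows) => Some(BasicStats(size.toLong, rows.toLong))
    case _                     => None
  }

  def main(args: Array[String]): Unit = {
    // Values copied verbatim from the reproduction output.
    val params =
      """{rawDataSize=0, numFiles=1, numFilesErasureCoded=0, transient_lastDdlTime=1604404640, totalSize=532, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"a":"true","b":"true"}}, numRows=0}"""
    val stats = "1064 bytes, 2 rows"

    val fromParams = fromParameters(params)
    val fromStats  = fromStatistics(stats)
    println(s"parameters: $fromParams")
    println(s"statistics: $fromStats")
    println(s"consistent: ${fromParams == fromStats}")
  }
}
```

In spark-shell, the two input strings could be taken from the second column of the DESCRIBE FORMATTED result, in the rows whose first column is "Partition Parameters" and "Partition Statistics"; that extraction step is left out here so the sketch runs without a Spark session.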