Maxim Gekk created SPARK-34084:
----------------------------------

Summary: ALTER TABLE .. ADD PARTITION does not update table stats
Key: SPARK-34084
URL: https://issues.apache.org/jira/browse/SPARK-34084
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.0.2, 3.2.0, 3.1.1
Environment: strong text
Reporter: Maxim Gekk

The example below illustrates the issue:
{code:sql}
spark-sql> create table tbl (col0 int, part int) partitioned by (part);
spark-sql> insert into tbl partition (part = 0) select 0;
spark-sql> set spark.sql.statistics.size.autoUpdate.enabled=true;
spark-sql> alter table tbl add partition (part = 1);
{code}
After that, the table has no stats (the output below contains no Statistics row):
{code:sql}
spark-sql> describe table extended tbl;
col0	int	NULL
part	int	NULL
# Partition Information
# col_name	data_type	comment
part	int	NULL

# Detailed Table Information
Database	default
Table	tbl
Owner	maximgekk
Created Time	Tue Jan 12 12:00:03 MSK 2021
Last Access	UNKNOWN
Created By	Spark 3.2.0-SNAPSHOT
Type	MANAGED
Provider	hive
Table Properties	[transient_lastDdlTime=1610442003]
Location	file:/Users/maximgekk/proj/fix-stats-in-add-partition/spark-warehouse/tbl
Serde Library	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat	org.apache.hadoop.mapred.TextInputFormat
OutputFormat	org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Storage Properties	[serialization.format=1]
Partition Provider	Catalog
{code}
*As we can see, there are no stats.* In contrast, ALTER TABLE .. DROP PARTITION does update stats:
{code:sql}
spark-sql> alter table tbl drop partition (part = 1);
spark-sql> describe table extended tbl;
col0	int	NULL
part	int	NULL
# Partition Information
# col_name	data_type	comment
part	int	NULL

# Detailed Table Information
...
Statistics	2 bytes
{code}
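
Until ADD PARTITION updates stats itself, one possible way to get the table size recorded may be to recompute the table-level statistics explicitly. This is only a sketch, assuming the same table tbl as in the example above. It was not run as part of this report, so no output is shown. ANALYZE TABLE .. COMPUTE STATISTICS NOSCAN is an existing Spark SQL command that collects only the size in bytes, without scanning the data.
{code:sql}
-- Assumption: run right after "alter table tbl add partition (part = 1)".
-- NOSCAN asks Spark to record only the table size in bytes.
spark-sql> analyze table tbl compute statistics noscan;
-- Check whether a Statistics row now appears under "# Detailed Table Information".
spark-sql> describe table extended tbl;
{code}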