Maxim Gekk created SPARK-34084:
----------------------------------

             Summary: ALTER TABLE .. ADD PARTITION does not update table stats
                 Key: SPARK-34084
                 URL: https://issues.apache.org/jira/browse/SPARK-34084
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.2, 3.2.0, 3.1.1
            Reporter: Maxim Gekk


The example below illustrates the issue:
{code:sql}
spark-sql> create table tbl (col0 int, part int) partitioned by (part);
spark-sql> insert into tbl partition (part = 0) select 0;
spark-sql> set spark.sql.statistics.size.autoUpdate.enabled=true;
spark-sql> alter table tbl add partition (part = 1);
{code}
There are no stats:
{code:sql}
spark-sql> describe table extended tbl;
col0    int     NULL
part    int     NULL
# Partition Information
# col_name      data_type       comment
part    int     NULL

# Detailed Table Information
Database        default
Table   tbl
Owner   maximgekk
Created Time    Tue Jan 12 12:00:03 MSK 2021
Last Access     UNKNOWN
Created By      Spark 3.2.0-SNAPSHOT
Type    MANAGED
Provider        hive
Table Properties        [transient_lastDdlTime=1610442003]
Location        file:/Users/maximgekk/proj/fix-stats-in-add-partition/spark-warehouse/tbl
Serde Library   org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat     org.apache.hadoop.mapred.TextInputFormat
OutputFormat    org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Storage Properties      [serialization.format=1]
Partition Provider      Catalog
{code}

*As we can see, there are no stats.* By contrast, ALTER TABLE .. DROP PARTITION 
does update the stats:
{code:sql}
spark-sql> alter table tbl drop partition (part = 1);
spark-sql> describe table extended tbl;
col0    int     NULL
part    int     NULL
# Partition Information
# col_name      data_type       comment
part    int     NULL

# Detailed Table Information
...
Statistics      2 bytes
{code}
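
Until ADD PARTITION refreshes the stats itself, one possible workaround is to recompute the table size by hand. This is only a sketch, not part of the original report: it assumes the same table `tbl` as above and uses the standard ANALYZE TABLE command with NOSCAN, which collects only the size in bytes without scanning the data:
{code:sql}
-- Re-add the partition; per this report, this alone leaves the table without stats
spark-sql> alter table tbl add partition (part = 1);
-- Assumed workaround: recompute the size-only stats manually
spark-sql> analyze table tbl compute statistics noscan;
-- The Statistics row should now be listed under "Detailed Table Information"
spark-sql> describe table extended tbl;
{code}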


