[ 
https://issues.apache.org/jira/browse/IMPALA-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201660#comment-17201660
 ] 

Tim Armstrong commented on IMPALA-9779:
---------------------------------------

I noticed that the drop stats code paths don't actually remove the row counts 
from the partitions in all cases. So reloading the partitions is actually 
what prevents them from getting stale.

> Unnecessarily reloading file metadata in some DDLs
> --------------------------------------------------
>
>                 Key: IMPALA-9779
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9779
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>    Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 
> 2.11.0, Impala 3.0, Impala 2.12.0, Impala 3.1.0, Impala 3.2.0, Impala 3.3.0, 
> Impala 3.4.0
>            Reporter: Quanlong Huang
>            Assignee: Tim Armstrong
>            Priority: Critical
>
> Some DDLs do not modify the actual table data, so we don't need to reload 
> file metadata for them. These DDLs include:
> * Compute (incremental) stats
> * Drop stats
> * Alter table set row format
> * Alter table set file format
> The code paths of these DDLs all call CatalogOpExecutor.bulkAlterPartitions(). 
> The related partitions are marked as "dirty" anyway, and dirty partitions are 
> dropped and reloaded at the end of 
> CatalogOpExecutor.alterTable(TAlterTableParams, TDdlExecResponse). See 
> HdfsTable.updatePartitionsFromHms() for details.
> We can consider not marking the related partitions as "dirty" in these DDLs.
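
To make the cost described in the issue concrete, here is a minimal toy sketch of a dirty flag driving a reload. It is not Impala's code: the class DirtyPartitionSketch, the Partition type, and the methods setFileFormat, reloadDirty, and makePartitions are all hypothetical names invented for this illustration, and the counter only stands in for the real file metadata load.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these names are Impala's real classes or methods. */
class DirtyPartitionSketch {
    /** Hypothetical stand-in for a cached partition. */
    static class Partition {
        final String name;
        String fileFormat = "TEXT";
        boolean dirty = false;
        int fileMetadataLoads = 1; // loaded once when first cached

        Partition(String name) { this.name = name; }
    }

    /** Applies a metadata-only change, optionally marking each partition dirty. */
    static void setFileFormat(List<Partition> parts, String format, boolean markDirty) {
        for (Partition p : parts) {
            p.fileFormat = format;
            if (markDirty) p.dirty = true;
        }
    }

    /** Reloads file metadata for dirty partitions and returns how many were reloaded. */
    static int reloadDirty(List<Partition> parts) {
        int reloaded = 0;
        for (Partition p : parts) {
            if (p.dirty) {
                p.fileMetadataLoads++; // stands in for the expensive reload step
                p.dirty = false;
                reloaded++;
            }
        }
        return reloaded;
    }

    /** Builds n toy partitions named p=0, p=1, ... */
    static List<Partition> makePartitions(int n) {
        List<Partition> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) parts.add(new Partition("p=" + i));
        return parts;
    }

    public static void main(String[] args) {
        List<Partition> a = makePartitions(3);
        setFileFormat(a, "PARQUET", true);
        System.out.println("marked dirty, reloaded: " + reloadDirty(a));

        List<Partition> b = makePartitions(3);
        setFileFormat(b, "PARQUET", false);
        System.out.println("not marked, reloaded: " + reloadDirty(b));
    }
}
```

In this toy model the format change reaches the cached partitions either way, and only the dirty flag decides whether file metadata is loaded again. As the comment on drop stats points out, skipping the reload is only safe when the DDL itself updates every cached field it affects, since otherwise the reload is what keeps the cache consistent.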



