[ 
https://issues.apache.org/jira/browse/IMPALA-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Brown deleted IMPALA-6853:
----------------------------------


> COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-6853
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6853
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Alexander Behm
>            Assignee: Tianyi Wang
>            Priority: Critical
>              Labels: compute-stats, perfomance
>
> COMPUTE STATS, and possibly other DDL operations, perform the equivalent 
> of a REFRESH after writing to the Hive Metastore. This reload is unnecessary 
> and can be very expensive, so it should be avoided.
> The behavior can be confirmed from the catalogd logs:
> {code}
> compute stats functional_parquet.alltypes;
> +-------------------------------------------+
> | summary                                   |
> +-------------------------------------------+
> | Updated 24 partition(s) and 11 column(s). |
> +-------------------------------------------+
> Relevant catalogd.INFO snippet:
> I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table 
> metadata for: functional_parquet.alltypes
> I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.274348 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=10: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.277053 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=11: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.282152 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=12: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.285684 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=2: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.288921 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=3: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.292757 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=4: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.303673 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=5: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.308387 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=6: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.311506 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=7: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.314600 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=8: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.317709 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=9: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.317873 27295 HdfsTable.java:1273] Incrementally loaded table 
> metadata for: functional_parquet.alltypes
> {code}
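
To check the same thing on a larger table without reading the log by eye, the refresh entries can be counted. This is a minimal sketch, assuming each log entry sits on a single line in catalogd.INFO (the wrapping above comes from the email) and that the message text matches the lines quoted above. The class and method names are hypothetical and not part of Impala.

```java
import java.util.List;

final class RefreshLogCounter {
    // Message text copied from the HdfsTable log lines quoted above.
    private static final String MARKER = "Refreshed file metadata for ";

    // Counts log entries reporting a file-metadata refresh for exactly this table.
    // The trailing " Path:" keeps "db.tbl" from also matching "db.tbl_other".
    static long countRefreshes(List<String> logLines, String tableName) {
        String needle = MARKER + tableName + " Path:";
        return logLines.stream().filter(line -> line.contains(needle)).count();
    }
}
```

Reading the file with `java.nio.file.Files.readAllLines` and passing the result to `countRefreshes` gives the number of refresh entries for a table. For the log above, that should be one entry per partition of functional_parquet.alltypes.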
> The relevant code starts from CatalogOpExecutor:
> {code}
> boolean reloadMetadata = true;
> catalog_.getLock().writeLock().unlock();
> if (tbl instanceof KuduTable && altersKuduTable(params.getAlter_type())) {
>   alterKuduTable(params, response, (KuduTable) tbl, newCatalogVersion);
>   return;
> }
> switch (params.getAlter_type()) {
> ...
>         case UPDATE_STATS:
>           Preconditions.checkState(params.isSetUpdate_stats_params());
>           Reference<Long> numUpdatedColumns = new Reference<>(0L);
>           alterTableUpdateStats(tbl, params.getUpdate_stats_params(),
>               numUpdatedPartitions, numUpdatedColumns);
>           reloadTableSchema = true;
>           addSummary(response, "Updated " + numUpdatedPartitions.getRef() +
>               " partition(s) and " + numUpdatedColumns.getRef() + " column(s).");
>           break;
> ....
> }
> if (reloadMetadata) {  // <-- REFRESH here
>   loadTableMetadata(tbl, newCatalogVersion, reloadFileMetadata,
>       reloadTableSchema, null);
>   addTableToCatalogUpdate(tbl, response.result);
> }
> {code}
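
One possible direction, shown only as a standalone sketch and not as the actual fix: decide per operation whether file metadata needs reloading, and leave it off for UPDATE_STATS, since the stats are written to the Metastore and no data files change. Everything here is hypothetical (the class, the `ReloadPlan` type, the `FILE_CHANGING_OP` value, the defaults). Only the flag names `reloadFileMetadata` and `reloadTableSchema` are taken from the snippet above.

```java
final class ReloadFlagsSketch {
    // Hypothetical subset of alter types, for illustration only.
    enum AlterType { UPDATE_STATS, FILE_CHANGING_OP }

    // Which parts of the cached table metadata to reload after the Metastore write.
    static final class ReloadPlan {
        final boolean reloadFileMetadata;
        final boolean reloadTableSchema;

        ReloadPlan(boolean reloadFileMetadata, boolean reloadTableSchema) {
            this.reloadFileMetadata = reloadFileMetadata;
            this.reloadTableSchema = reloadTableSchema;
        }
    }

    static ReloadPlan planFor(AlterType type) {
        boolean reloadFileMetadata = false;  // assumed default for this sketch
        boolean reloadTableSchema = false;
        switch (type) {
            case UPDATE_STATS:
                // Stats live in the Metastore; pick them up without
                // rescanning every partition directory.
                reloadTableSchema = true;
                break;
            case FILE_CHANGING_OP:
                // Stand-in for any operation that really adds or removes files.
                reloadFileMetadata = true;
                break;
        }
        return new ReloadPlan(reloadFileMetadata, reloadTableSchema);
    }
}
```

With a plan like this, the per-partition "Refreshed file metadata" entries shown in the log would only be expected for operations that set `reloadFileMetadata`.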



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
