Michael Brown created IMPALA-7910:
-------------------------------------

             Summary: COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
                 Key: IMPALA-7910
                 URL: https://issues.apache.org/jira/browse/IMPALA-7910
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
   Affects Versions: Impala 2.12.0, Impala 2.11.0, Impala 2.9.0
            Reporter: Michael Brown
            Assignee: Tianyi Wang

COMPUTE STATS, and possibly other DDL operations, unnecessarily do the equivalent of a REFRESH after writing to the Hive Metastore. This extra operation can be very expensive, so it should be avoided.

The behavior can be confirmed from the catalogd logs:

{code}
compute stats functional_parquet.alltypes;
+-------------------------------------------+
| summary                                   |
+-------------------------------------------+
| Updated 24 partition(s) and 11 column(s). |
+-------------------------------------------+

Relevant catalogd.INFO snippet
I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table metadata for: functional_parquet.alltypes
I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.274348 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=10: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.277053 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=11: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.282152 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=12: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.285684 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=2: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.288921 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=3: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.292757 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=4: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.303673 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=5: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.308387 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=6: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.311506 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=7: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.314600 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=8: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.317709 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=9: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
I0413 14:40:24.317873 27295 HdfsTable.java:1273] Incrementally loaded table metadata for: functional_parquet.alltypes
{code}

The relevant code starts from CatalogOpExecutor:

{code}
boolean reloadMetadata = true;
catalog_.getLock().writeLock().unlock();

if (tbl instanceof KuduTable && altersKuduTable(params.getAlter_type())) {
  alterKuduTable(params, response, (KuduTable) tbl, newCatalogVersion);
  return;
}
switch (params.getAlter_type()) {
  ...
  case UPDATE_STATS:
    Preconditions.checkState(params.isSetUpdate_stats_params());
    Reference<Long> numUpdatedColumns = new Reference<>(0L);
    alterTableUpdateStats(tbl, params.getUpdate_stats_params(),
        numUpdatedPartitions, numUpdatedColumns);
    reloadTableSchema = true;
    addSummary(response, "Updated " + numUpdatedPartitions.getRef() +
        " partition(s) and " + numUpdatedColumns.getRef() + " column(s).");
    break;
  ....
}

if (reloadMetadata) {  // <-- REFRESH here
  loadTableMetadata(tbl, newCatalogVersion, reloadFileMetadata,
      reloadTableSchema, null);
  addTableToCatalogUpdate(tbl, response.result);
}
{code}
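
To check for the same behavior on another table, the log confirmation described in this report can be automated by counting the "Refreshed file metadata for" lines that follow a COMPUTE STATS. The following is a minimal sketch, not part of Impala: the class name RefreshLogCounter and the method countRefreshes are hypothetical, and the only thing taken from the report is the log message text.

```java
import java.util.Arrays;
import java.util.List;

public class RefreshLogCounter {
  // Message text as it appears in the catalogd.INFO excerpt of this report.
  private static final String MARKER = "Refreshed file metadata for ";

  /**
   * Counts log lines that report a file-metadata refresh for the given table.
   * The trailing space keeps "db.tbl" from also matching "db.tbl_other".
   */
  static long countRefreshes(List<String> logLines, String tableName) {
    String needle = MARKER + tableName + " ";
    return logLines.stream().filter(line -> line.contains(needle)).count();
  }

  public static void main(String[] args) {
    // Four lines copied from the report; two of them are per-partition refreshes.
    List<String> sample = Arrays.asList(
        "I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table metadata for: functional_parquet.alltypes",
        "I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0",
        "I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0",
        "I0413 14:40:24.317873 27295 HdfsTable.java:1273] Incrementally loaded table metadata for: functional_parquet.alltypes");
    System.out.println(countRefreshes(sample, "functional_parquet.alltypes")); // prints 2
  }
}
```

Against a real catalogd.INFO file, the lines could be read with java.nio.file.Files.readAllLines and passed to the same method. A count equal to the number of partitions right after a COMPUTE STATS, as with the 24 partitions in the excerpt above, would suggest that the full reload took place.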