[ https://issues.apache.org/jira/browse/IMPALA-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Brown deleted IMPALA-6853:
----------------------------------

> COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-6853
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6853
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Alexander Behm
>            Assignee: Tianyi Wang
>            Priority: Critical
>              Labels: compute-stats, perfomance
>
> COMPUTE STATS and possibly other DDL operations unnecessarily do the
> equivalent of a REFRESH after writing to the Hive Metastore. This unnecessary
> operation can be very expensive, so it should be avoided.
> The behavior can be confirmed from the catalogd logs:
> {code}
> compute stats functional_parquet.alltypes;
> +-------------------------------------------+
> | summary                                   |
> +-------------------------------------------+
> | Updated 24 partition(s) and 11 column(s). |
> +-------------------------------------------+
>
> Relevant catalogd.INFO snippet:
> I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table metadata for: functional_parquet.alltypes
> I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.274348 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=10: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.277053 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=11: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.282152 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=12: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.285684 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=2: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.288921 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=3: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.292757 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=4: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.303673 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=5: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.308387 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=6: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.311506 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=7: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.314600 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=8: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.317709 27295 HdfsTable.java:555] Refreshed file metadata for functional_parquet.alltypes Path: hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=9: Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.317873 27295 HdfsTable.java:1273] Incrementally loaded table metadata for: functional_parquet.alltypes
> {code}
> The relevant code starts from CatalogOpExecutor:
> {code}
> boolean reloadMetadata = true;
> catalog_.getLock().writeLock().unlock();
> if (tbl instanceof KuduTable && altersKuduTable(params.getAlter_type())) {
>   alterKuduTable(params, response, (KuduTable) tbl, newCatalogVersion);
>   return;
> }
> switch (params.getAlter_type()) {
>   ...
>   case UPDATE_STATS:
>     Preconditions.checkState(params.isSetUpdate_stats_params());
>     Reference<Long> numUpdatedColumns = new Reference<>(0L);
>     alterTableUpdateStats(tbl, params.getUpdate_stats_params(),
>         numUpdatedPartitions, numUpdatedColumns);
>     reloadTableSchema = true;
>     addSummary(response, "Updated " + numUpdatedPartitions.getRef() +
>         " partition(s) and " + numUpdatedColumns.getRef() + " column(s).");
>     break;
>   ....
> }
> if (reloadMetadata) { <-- REFRESH here
>   loadTableMetadata(tbl, newCatalogVersion, reloadFileMetadata,
>       reloadTableSchema, null);
>   addTableToCatalogUpdate(tbl, response.result);
> }
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
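
The log check described in the issue can be automated. The following is a minimal sketch, not part of Impala: the class name RefreshLogCounter and the method countRefreshes are hypothetical, and it assumes each catalogd.INFO entry sits on a single line, as it does in the log file itself (the mailer wraps them). It counts the "Refreshed file metadata" entries that HdfsTable logs for a given table, so a non-zero count right after COMPUTE STATS would point to the reload discussed above.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: counts per-partition file-metadata refreshes for one table. */
public class RefreshLogCounter {
  // Matches the "Refreshed file metadata" entries quoted in the issue and
  // captures the table name (group 1) and the partition path (group 2).
  private static final Pattern REFRESHED = Pattern.compile(
      "Refreshed file metadata for (\\S+) Path: (\\S+): Loaded files: (\\d+)");

  /** Returns how many refresh entries in logLines belong to tableName. */
  public static int countRefreshes(List<String> logLines, String tableName) {
    int count = 0;
    for (String line : logLines) {
      Matcher m = REFRESHED.matcher(line);
      if (m.find() && m.group(1).equals(tableName)) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // Two refresh entries taken from the issue, framed by the start/end markers.
    List<String> sample = List.of(
        "I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table "
            + "metadata for: functional_parquet.alltypes",
        "I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for "
            + "functional_parquet.alltypes Path: "
            + "hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: "
            + "Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0",
        "I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for "
            + "functional_parquet.alltypes Path: "
            + "hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: "
            + "Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0",
        "I0413 14:40:24.317873 27295 HdfsTable.java:1273] Incrementally loaded table "
            + "metadata for: functional_parquet.alltypes");
    // Prints 2 for this sample.
    System.out.println(countRefreshes(sample, "functional_parquet.alltypes"));
  }
}
```

Run against the full snippet from the issue, the same method would be expected to report one entry per partition of functional_parquet.alltypes.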