Hello Bharath Vissapragada, Tianyi Wang, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11280

to look at the new patch set (#7).

Change subject: IMPALA-7469. Invalidate LocalCatalog cache based on topic 
updates
......................................................................

IMPALA-7469. Invalidate LocalCatalog cache based on topic updates

This implements cache invalidation inside CatalogdMetaProvider. The
design is as follows:

- when the catalogd collects updates into the statestore topic, it now
  adds an additional entry for each table and database. These additional
  entries are minimal - they only include the object's name, but no
  metadata. This new behavior is conditional on a new flag
  --catalog_topic_mode. The default mode is to keep the old style, but
  it can be configured to mixed (support both v1 and v2) or v2-only.

- the old-style topic entries are prefixed with a '1:' whereas the new
  minimal entries are prefixed with a '2:'. The impalad will subscribe
  to one or the other prefix depending on whether it is running with
  --use_local_catalog. Thus, old impalads will not be confused by the
  new entries and vice versa.

- when the impalad gets these topic updates, it forwards them through to
  the catalog implementation. The LocalCatalog implementation forwards
  them to the CatalogdMetaProvider, which uses them to invalidate
  cached metadata as appropriate.
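
As a rough illustration of the design above (not the code in this patch),
the prefixing and invalidation flow could be sketched as follows. The names
TopicMode, topic_entries and MetaProviderSketch are hypothetical, and the
enum members are labels for the three modes rather than the actual values
accepted by --catalog_topic_mode.

```python
from enum import Enum


class TopicMode(Enum):
    # Hypothetical labels for the three modes described above; these are
    # not the real values of --catalog_topic_mode.
    V1_ONLY = 1   # old-style entries only (the default)
    MIXED = 2     # both old-style and minimal entries
    V2_ONLY = 3   # minimal entries only


def topic_entries(mode, object_name, full_metadata):
    """Return the topic entries published for one table or database."""
    entries = {}
    if mode in (TopicMode.V1_ONLY, TopicMode.MIXED):
        # Old-style entry: carries the full metadata, keyed with '1:'.
        entries["1:" + object_name] = full_metadata
    if mode in (TopicMode.MIXED, TopicMode.V2_ONLY):
        # Minimal entry: only the name matters, keyed with '2:'.
        entries["2:" + object_name] = None
    return entries


class MetaProviderSketch:
    """Stand-in for a provider that caches metadata by object name."""

    def __init__(self):
        self.cache = {}

    def on_topic_update(self, entries, use_local_catalog=True):
        # Only look at the prefix this kind of impalad cares about, so
        # entries meant for the other kind are ignored.
        prefix = "2:" if use_local_catalog else "1:"
        invalidated = []
        for key in entries:
            if not key.startswith(prefix):
                continue
            name = key[len(prefix):]
            # Drop any cached metadata for the object named in the entry.
            if self.cache.pop(name, None) is not None:
                invalidated.append(name)
        return invalidated
```

In this sketch, a provider that has cached "db.t" drops it when a mixed-mode
update for "db.t" arrives, and ignores the accompanying '1:' entry.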

This patch includes some basic unit tests. I also did some manual
testing by connecting to different impalads and verifying that a session
connected to impalad #1 saw the effects of DDLs made by impalad #2
within a short period of time (the statestore topic update frequency).

Existing end-to-end tests cover these code paths pretty thoroughly:

- if we didn't automatically invalidate the cache on a coordinator
  in response to DDL operations, then any test which expects to
  "read its own writes" (e.g., access a table after creating one)
  would fail
- if we didn't propagate invalidations via the statestore, then
  all of the tests that use sync_ddl would fail.

I verified the test coverage above using some of the tests in
test_ddl.py -- I selectively commented out a few of the invalidation
code paths in the new code and verified that tests failed until I
re-introduced them. Along the way I also improved test_ddl so that, when
this code is broken, it properly fails with a timeout. It also has a bit
of expanded coverage for both the SYNC_DDL and non-SYNC cases.
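
For readers unfamiliar with the "fail with a timeout" pattern, one generic
way to get that behavior is to poll for the expected state against a
deadline. This is a minimal sketch under that assumption; wait_until is a
hypothetical helper and is not necessarily how test_ddl.py does it.

```python
import time


def wait_until(predicate, timeout_s=10.0, interval_s=0.1):
    """Poll predicate() until it returns True or timeout_s elapses.

    Raises TimeoutError instead of hanging, so a broken invalidation
    path shows up as a clear test failure.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "condition not met within %.1f seconds" % timeout_s)
        time.sleep(interval_s)
```

A test could wrap its "is the new table visible on the other impalad yet?"
check in such a helper, with the timeout set comfortably above the
statestore topic update frequency.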

I also wrote a new custom-cluster test for LocalCatalog that verifies
a few of the specific edge cases like detecting catalogd restart.

One notable exception here is the implementation of INVALIDATE METADATA.
This turned out to be complex to implement, so I left a lengthy TODO
describing the issue and filed a JIRA.

Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
---
M be/src/catalog/catalog-server.cc
M be/src/service/impala-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/common/Pair.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M tests/common/custom_cluster_test_suite.py
A tests/custom_cluster/test_local_catalog.py
M tests/metadata/test_ddl.py
21 files changed, 773 insertions(+), 104 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/11280/7
--
To view, visit http://gerrit.cloudera.org:8080/11280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I615f9e6bd167b36cd8d93da59426dd6813ae4984
Gerrit-Change-Number: 11280
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
