[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 27: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/1913/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 27 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 30 Jan 2019 17:50:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Reviewed-on: http://gerrit.cloudera.org:8080/12118 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationFetchException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 14 files changed, 2,553 insertions(+), 3 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 30 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 29: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 29 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 30 Jan 2019 03:32:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 29: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3681/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 29 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 29 Jan 2019 21:27:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 29: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 29 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 29 Jan 2019 21:27:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 28: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 28 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 29 Jan 2019 21:26:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#28). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationFetchException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 14 files changed, 2,553 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/28 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 28 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 27: Code-Review+2 Thanks for patiently addressing all the review comments. Nice first contribution to the project :-). I can kick-off a GVO once you fix the line overflows. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 27 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 29 Jan 2019 18:19:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 27: (3 comments) http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@448 PS27, Line 448: + "invalidated since it does not exist anymore", getFullyQualifiedTblName())); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@464 PS27, Line 464: Pair result = catalog_.renameOrAddTableIfNotExists(oldTTableName, line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@758 PS27, Line 758: private void createDatabaseFromImpala(String dbName, String desc) throws ImpalaException { line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 27 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 29 Jan 2019 01:34:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#27). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationFetchException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 14 files changed, 2,551 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/27 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 27 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 25: (10 comments) http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@308 PS25, Line 308: LOG.info(String.format("Metastore event processing is disabled. Event polling " > nit: fix formatting. Done http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@145 PS25, Line 145: no > not Done http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@166 PS25, Line 166: drop event later > Do we also need to remove the drop event or are we just letting it fail sil drop events are implemented as dropIfExists. So since the create event is skipped, the drop event processing becomes a no-op. I wanted to remove the drop event as well, but it was complicating this code here without any added benefit. I will add a comment here saying why its ok to not remove the drop event so that its clear in the future. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@174 PS25, Line 174: fromIndex++; > Log how many events got filtered out? Helps diagnosing incase there are any Good point. Also, added logging in the isRemovedAfter impl for createTable and createDatabase events. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@356 PS25, Line 356: if (this.dbName_.equalsIgnoreCase(dropTableEvent.dbName_) && this.tblName_ : .equalsIgnoreCase(dropTableEvent.tblName_)) { : return true; : } > nit: simplify to return dbName_.equals() && tblName_.equals(); Actually, added a log statement in the if block so can't do this anymore. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@360 PS25, Line 360: else if (event.eventType_.equals(MetastoreEventType.ALTER_TABLE)) { : // if this create table was renamed to some other name in the future return true : AlterTableEvent alterTableEvent = (AlterTableEvent) event; : if (alterTableEvent.isRename_ && this.dbName_ : .equalsIgnoreCase(alterTableEvent.tableBefore_.getDbName()) && this.tblName_ : .equalsIgnoreCase(alterTableEvent.tableBefore_.getTableName())) { : return true; : } > clarify why rename qualifies as an inverse op for create? Done http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@295 PS25, Line 295: LOG.info(String : .format("Received %d events. Start event id : %d", response.getEvents().size(), : lastSyncedEventId_)); > nit: format to fewer lines. this statement cannot be converted to a 2 line statement without increasing the line width beyond 100 http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@319 PS25, Line 319: LOG.warn(String.format( : "Not processing notification events since event processing status " : + "is %s. Last synced event id is %d", currentStatus, : lastSyncedEventId_)); > nit: format to fewer lines. Reduced the text verbosity to make it to 2 lines. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@339 PS25, Line 339: : // remove some partitions : // change some partitions > ? had kept a reminder for myself to add
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 22: (5 comments) I forgot to publish these comments earlier. Publishing it here again just for the record. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@383 PS20, Line 383: > Nit: Done http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@360 PS22, Line 360: throw new MetastoreNotificationException(debugString( > To follow up on previous comments about errors. We may have the user busily Thats a fair point. Let me add more details on what can go wrong during this event processing. I would analyze what could be state of catalog at the event processing time based on the diagram I have in the event processor class. Case I: Catalog state is stale with respect to this event. a. It knows of a table with oldTblName but doesn't know about newTblName This maps to the regular renameTable by remove oldTbl and add newTbl. Case II: Catalog state is exactly at the same state as of event. a. This means that catalog doesn't know of oldTbl. It knows about newTbl. In this case also renameTable fails since it cannot find oldTbl. However, we can continue event processing. Basically, no action would be taken by this event and it needs to be ignored. Case III: Catalog state is ahead of event. This could mean a number of things. a. simple case is nothing exotic happens. OldTbl doesn't exist, newTbl exists --> case II b. user did a reverse rename from newtbl -> oldTbl post event. But we don't have any mechanism currently to distinguish this state from Case I without the version number support. c. User dropped the newTbl (or renamed it to something else). In this case, event processor will neither see a oldTbl or newTbl in the cache. It looks like there would be slight window of time where newTbl will appear until the actual event of the newTbl is processed and it is dropped (or renamed). Given these various possibilities, I think the right thing to do is Atomic rename by drop-and-add 1. remove oldTbl if it exists. 2. add newTbl if it doesn't exist. To compare a existing tbl we should use the creationTime and version. http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@515 PS22, Line 515: LOG.info(debugString("Successfully removed database %s", dbName_)); > Briefly looked at these other error handling blocks. The exception/logging Thanks for taking a relook http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@334 PS21, Line 334: lastSyncedEventId_ = event.getEventId(); : } : } : } catch (MetastoreNotificationException | CatalogException ex) { : updateStatus(EventProcessorStatus.E > This blocks invalidate metadata since both of them try to get a writelock a changed the locking logic to use simple synchronized blocks. The code becomes a little more ugly but reduces the total blocking time for reset() as low as possible. http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@258 PS22, Line 258: public synchronized void start(long fromEventId) { > This is synchronized, which suggests that we might get multiple concurrent Both the start() and stop() methods do a Preconditions.checkState(). In this particular start method the check is on line 261. I will move it to the first couple of lines to improve readability. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id:
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 25: (12 comments) lgtm. Final round of comments from my side. Given the state the patch is in, I think we should get the patch into the master branch which makes it easier for concurrency testing. Also since all of this code is behind a feature flag, it should not affect any existing deployments. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@308 PS25, Line 308: LOG.info(String.format("Metastore event processing is disabled. Event polling " nit: fix formatting. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1833 PS25, Line 1833: if (db == null) return null; : if (!db.containsTable(tblName)) return null nit: merge into a single stmt. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@145 PS25, Line 145: no not http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@149 PS25, Line 149: * removed a table, the create event received will try to add the object again. This probably needs some thought during concurrency testing. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@166 PS25, Line 166: drop event later Do we also need to remove the drop event or are we just letting it fail silently? http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@174 PS25, Line 174: fromIndex++; Log how many events got filtered out? Helps diagnosing incase there are any bugs in this area. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@356 PS25, Line 356: if (this.dbName_.equalsIgnoreCase(dropTableEvent.dbName_) && this.tblName_ : .equalsIgnoreCase(dropTableEvent.tblName_)) { : return true; : } nit: simplify to return dbName_.equals() && tblName_.equals(); http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@360 PS25, Line 360: else if (event.eventType_.equals(MetastoreEventType.ALTER_TABLE)) { : // if this create table was renamed to some other name in the future return true : AlterTableEvent alterTableEvent = (AlterTableEvent) event; : if (alterTableEvent.isRename_ && this.dbName_ : .equalsIgnoreCase(alterTableEvent.tableBefore_.getDbName()) && this.tblName_ : .equalsIgnoreCase(alterTableEvent.tableBefore_.getTableName())) { : return true; : } clarify why rename qualifies as an inverse op for create? http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@295 PS25, Line 295: LOG.info(String : .format("Received %d events. Start event id : %d", response.getEvents().size(), : lastSyncedEventId_)); nit: format to fewer lines. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@319 PS25, Line 319: LOG.warn(String.format( : "Not processing notification events since event processing status " : + "is %s. Last synced event id is %d", currentStatus, : lastSyncedEventId_)); nit: format to fewer lines. http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 25: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1897/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 25 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 26 Jan 2019 02:17:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 25: (4 comments) http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@426 PS25, Line 426: + "invalidated since it does not exist anymore", getFullyQualifiedTblName())); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@481 PS25, Line 481: if (eventsProcessor_.getStatus() != EventProcessorStatus.ACTIVE) eventsProcessor_.start(); line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@725 PS25, Line 725: private void createDatabaseFromImpala(String dbName, String desc) throws ImpalaException { line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@751 PS25, Line 751: private void createTableFromImpala(String tblName, boolean isPartitioned) throws ImpalaException { line too long (100 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 25 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 26 Jan 2019 01:41:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#25). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationFetchException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 14 files changed, 2,458 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/25 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 25 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 22: (3 comments) Thanks much for your great work in getting this in and helping us anticipate all the ways that things might go wrong in a production system. Thanks much also for discussing the code review comments in person. This review has just a few minor follow-on comments. After that, I suspect we'll be in good shape to merge the code: we are reaching the point of diminishing returns in reviews; the next step is to try the feature out in a live cluster, and for that you'll need the code merged. A feature flag lets us turn off the feature if we encounter anything missed during reviews. http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@360 PS22, Line 360: throw new MetastoreNotificationException(debugString( To follow up on previous comments about errors. We may have the user busily hammering the system with INVALIDATEs and queries, each of which may mutate the cache ahead of the event processing here. If we encounter such a case, we should soldier on rather than stopping event processing. So, the case that the old table does not exist, is that an expected error if the user issues an INVALIDATE? Or, does it mean that we have a likely bug and we should stop event processing? The case of the missing new table is different: it would seem that there would be no good reason not to add the new table. But, I wonder, does the renameTable() method handle the case in which the rename was already done? Something like: * Rename table in HMS. * Issue a query against the new name, causing the catalog to load metadata for the new table. * Get the rename event. * Find the old table and remove it, but be unable to add a new table since a table of that name was already loaded. http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@515 PS22, Line 515: LOG.info(debugString("Successfully removed database %s", dbName_)); Briefly looked at these other error handling blocks. The exception/logging decisions look right on each of them. http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@258 PS22, Line 258: public synchronized void start(long fromEventId) { This is synchronized, which suggests that we might get multiple concurrent start/stop calls from different threads. Should the start() (and stop()) methods check if they are in the proper state before proceeding? That is, should this method check if we are already in the ACTIVE state and simply return if so? -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 22 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 23 Jan 2019 21:18:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1847/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 22 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 22 Jan 2019 22:41:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 22: (2 comments) http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@337 PS22, Line 337: + "invalidated since it does not exist anymore", getFullyQualifiedTblName())); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@443 PS22, Line 443: if (eventsProcessor_.getStatus() != EventProcessorStatus.ACTIVE) eventsProcessor_.start(); line too long (96 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 22 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 22 Jan 2019 21:47:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#22). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 13 files changed, 1,974 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/22 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 22 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 20: (36 comments) http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307 PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds() > eventPollingInterval Done http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319 PS21, Line 319: LOG.error("Unable to fetch the current notification event id from metastore." : + "Metastore event processing will be disabled.", : e); > nit: Multiple places in the code, try to format in fewer lines. Something l I used the command given in https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536 as advised by some of the other members of the team. Unfortunately, it doesn't like some of these lines. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@324 PS21, Line 324: e); > Will the code that handles this exception write to the log both the message The LOG.error above logs the trace as well since it takes in the underlying cause exception as the second argument. The exception is uncaught in this particular case on the caller side, since we want CatalogD to not come up if it is configured to using event processing but for some reasons isn't able to instantiate one. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206 PS21, Line 1206: // Even though we get the current notification event id before stopping the event > clarification: reset() is equivalent to invalidating everything and fetchin This is similar to Tim's comment and our discussion. The line 1216 gets the currentEventID (latest) from HMS. And the event processing begins from that id. So the event processing jumps to the latest event id once the reset() is complete. There is still a slight chance that we will reprocess some of the events which are generated during reset() execution. It is better than missing those events which would lead to inconsistent state between catalog and HMS as far as event processing is concerned. The comment gives details of that case. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1216 PS21, Line 1216: long currentEventId = metastoreEventProcessor_.getCurrentEventId(); > Should this be here? Or in the event processor itself? If here, why is it n It needs to be here to avoid the race condition which can lead to possible missed events. We want to get the latestEventId before triggering the reset() so that we can then restart the event processing after reset from this eventId. If we move the code in the start() method, there is a chance that we will miss processing some of the events which happen during after reset is complete, but before we started the event processing. See Tim's comment in the earlier review comments which gives a nice example. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1360 PS21, Line 1360: Reference dbWasFound, Reference tblWasFound) { > I still think that using the reference is non-standard Java. A simple solut used the second option of return boolean and throw exception when db is not found. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1363 PS21, Line 1363: Db db = getDb(dbName); > Let's think about this. We return the flags because we want to know if the if getting db before acquiring the lock has a race then it looks like addTable and renameTable also has a race. I will create separate JIRAs for fixing them http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372 PS21, Line 1372: tblWasFound.setRef(true); > I wonder, do we really need to know if the table existed or not? Is it even consider a case where user do create, drop, create with the same table name from Impala. In case of if the create event from the first create statement, the table presented in the event is stale and should not be used to add to the catalog. I think it is useful to log this information to make sure that the create event was ignored in such a case.
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 21: (14 comments) Patch code flow and logic lgtm generally. Just had some nits/clarifications. Publishing these while I'm looking at the tests. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307 PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds() eventPollingInterval http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319 PS21, Line 319: LOG.error("Unable to fetch the current notification event id from metastore." : + "Metastore event processing will be disabled.", : e); nit: Multiple places in the code, try to format in fewer lines. Something like LOG.error("..." + ".", e) http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206 PS21, Line 1206: // Even though we get the current notification event id before stopping the event clarification: reset() is equivalent to invalidating everything and fetching from scratch from HMS. In that case, do we still restart the event processing from where we left off (before reset()) or can we start with the latest event ID from HMS? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1804 PS21, Line 1804: Table incompleteTable = IncompleteTable.createUninitializedTable(db, tblName); move it after L1807? http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@195 PS13, Line 195: > curious to understand why do you think this method has a race? nvm missed the synchronized part. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@155 PS21, Line 155: event %d of type %s on table %s I think it is cleaner to define a debugString(NotificationEvent event) {} that formats it cleanly and consistently across all invocations. We can use it to dump event state in multiple places. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@160 PS21, Line 160: if (!event.getEventType().equalsIgnoreCase("CREATE_DATABASE") : && !event.getEventType().equalsIgnoreCase("DROP_DATABASE")) { : tblName_ = Preconditions.checkNotNull(event.getTableName()); : } else { : tblName_ = null; : } nit: Reformat to the following? (fewer branches and use enum) tblName_ = null; if (eventType_ == MetaStoreEventType.CREATE_TABLE) { tblName_ = ... } http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@181 PS21, Line 181: tblName_ checkNotNull()? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@198 PS21, Line 198:private final String dbName_; : private final String tblName_; aren't these set in the parent c'tor? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@347 PS21, Line 347: droppedTable_ isn't this more like tblTobeDropped_ ? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@434 PS21, Line 434: // nit: remove, you are already in a comment block. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@310 PS21, Line 310: if (currentEventId <= lastSyncedEventId_) { : return Collections.emptyList(); : } nit: single line
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 21: (32 comments) Very nice improvements: the event classes greatly simplify the logic. This round focused a bit on error reporting and logging. Many specific comments. As always, these are just questions and suggestions, you should decide which make sense. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@324 PS21, Line 324: e); Will the code that handles this exception write to the log both the message from this exception AND the message from the "cause" exception? The goal is that, if a failure occurs, we get the detailed error message as well as the "summary" message. If we log the stack trace, then we're good. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1216 PS21, Line 1216: long currentEventId = metastoreEventProcessor_.getCurrentEventId(); Should this be here? Or in the event processor itself? If here, why is it needed other than to pass it along to the event processor? Best not to leak implementation details if possible. http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1360 PS21, Line 1360: Reference dbWasFound, Reference tblWasFound) { I still think that using the reference is non-standard Java. A simple solution is: public Pair addTableIfNotExists(String dbName, String tblName) ... Better would be: enum TableStatus (DB_NOT_FOUND, TBL_NOT_FOUND, OK) public TableStatus addTableIfNotExists(String dbName, String tblName) ... Or, suppose we do this, since this method currently returns void: * Throw an exception if the DB does not exist. * Return true/false to indicate if the table was created (true) or already existed (false). http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1363 PS21, Line 1363: Db db = getDb(dbName); Let's think about this. We return the flags because we want to know if the DB was found or not. We could just as easily do: DB db = catalog.get(dbName_); if (db == null) { handle error; } The reason we don't do the above is we want to prevent race conditions. But... We do the check here outside the lock. So, we have a race condition anyway. Should either 1) the DB lookup be done by the caller to simplify the code, or 2) be done in the lock to prevent race conditions? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372 PS21, Line 1372: tblWasFound.setRef(true); I wonder, do we really need to know if the table existed or not? Is it even useful? Consider. * Impala creates a table. It updates its own cache and informs HMS. * HMS sends a notification. In processing the notification, we determine that the table already exists. Or * A table is added in HMS. * The user immediately kicks of a query in Impala, causing a load of metadata. * Impala polls for notifications, learns that the table is new, tries to add it, and finds that the table already exists. In this cases, do we get anything by logging that the table exists? If not, do we need to return the exists-or-not result? http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1475 PS21, Line 1475: Reference tblWasfound, Reference tblMatched) { Same comments as above. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@130 PS20, Line 130: public static abstract class MetastoreEvent { The new event class hierarchy looks very good. One nit: some events are DB-level events, others are table-level events. This base class includes the table-level stuff and the reader (and code) must know what table stuff is available and when it is not. So, a further small improvement would be to create another abstract class, MetastoreTableEvent, to hold the common table stuff. The table class extends this one. DB-level events also extend this one. http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@138 PS20, Line 138: protected final Logger LOG = Logger.getLogger(this.getClass()); protected
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 21: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1812/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 21 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 18 Jan 2019 01:47:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 21: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@364 PS21, Line 364: + "table event. Check if %s is set to true in metastore configuration", line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 21 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 18 Jan 2019 01:11:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 20: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@33 PS20, Line 33:* Get the current event id. Useful for restarting the event processing from a event id > Maybe " on the HMS". Just to be clear that it's not the last event ID proce Done -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 20 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 18 Jan 2019 01:11:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#21). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 13 files changed, 1,959 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/21 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 21 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 20: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1809/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 20 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 18 Jan 2019 00:47:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 20: Code-Review+1 (1 comment) Thanks for addressing my comment. The aspects I looked at are good (I haven't reviewed the full thing) http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@33 PS20, Line 33:* Get the current event id. Useful for restarting the event processing from a event id Maybe " on the HMS". Just to be clear that it's not the last event ID processed. Ok to ignore. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 20 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 18 Jan 2019 00:33:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 20: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@269 PS19, Line 269: } > I think fetching the event ID here can lead to a race where we miss a notif Thanks for catching this. Fixed it the latest patch set. Although as discussed, there is still going to be small window of time where some of the events will be reprocessed. I have added a comment in CatalogServiceCatalog giving details about that case. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 20 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 18 Jan 2019 00:17:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 20: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@364 PS20, Line 364: + "table event. Check if %s is set to true in metastore configuration", line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 20 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 18 Jan 2019 00:13:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#20). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 13 files changed, 1,958 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/20 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 20 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 19: (1 comment) Thanks for addressing my comments. I think there might be one (probably rare) race that is possible. http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@269 PS19, Line 269: lastSyncedEventId_ = I think fetching the event ID here can lead to a race where we miss a notification, e.g.: 1. CatalogServiceCatalog.reset() is called 2. CatalogServiceCatalog.reset() loads DB and table metadata, which does not include table X. 3. Table X is added to the HMS, e.g. via Hive 4. This function fetches the last event ID 5. The processor starts processing events, but doesn't see the table X create event. I think the solution is to fetch the last event ID earlier in CatalogServiceCatalog.reset() and pass it into this function. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 19 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 17 Jan 2019 22:22:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1806/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 19 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 17 Jan 2019 21:42:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1805/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 18 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 17 Jan 2019 21:28:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/18/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/18/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@364 PS18, Line 364: + "table event. Check if %s is set to true in metastore configuration", line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 18 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 17 Jan 2019 21:06:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 19: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@364 PS19, Line 364: + "table event. Check if %s is set to true in metastore configuration", line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 19 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 17 Jan 2019 21:10:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 13 files changed, 1,926 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/19 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 19 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 13 files changed, 1,926 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/18 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 18 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1792/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 17 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Jan 2019 01:34:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1791/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 16 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Jan 2019 01:05:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/17/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/17/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@345 PS17, Line 345: + "table event. Check if %s is set to true in metastore configuration", line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 17 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Jan 2019 00:41:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 13 files changed, 1,902 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/16 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 16 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 13 files changed, 1,900 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/17 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 17 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 16: (2 comments) http://gerrit.cloudera.org:8080/#/c/12118/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@20 PS16, Line 20: import static org.apache.impala.catalog.events.MetastoreEventsProcessor.METASTORE_NOTIFICATIONS_ADD_THRIFT_OBJECTS_CONFIG_KEY; line too long (126 > 90) http://gerrit.cloudera.org:8080/#/c/12118/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@347 PS16, Line 347: + "table event. Check if %s is set to true in metastore configuration", line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 16 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Jan 2019 00:34:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 15: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/1790/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 15 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 16 Jan 2019 00:14:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/15/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/15/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@345 PS15, Line 345: + "table event. Check if %s is set to true in metastore configuration", line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 15 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 15 Jan 2019 23:41:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#15). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_interval_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_interval_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. By default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java A fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java M fe/src/test/resources/postgresql-hive-site.xml.template 13 files changed, 1,886 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/15 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 15 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 14: (21 comments) Nice improvements. Comments mostly relate to ways to simplify the processing logic. Feel free to ping me directly to discuss. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@293 PS14, Line 293: if (metastoreEventProcessor_ != null) { The var != null pattern is error prone. Would be great to have the MetastoreEventsProcessor be an interface with two implementations: a null implementation (disabled) and a live implementation. Then, you just call the various methods without a check for null or not. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@325 PS14, Line 325: } Can this move to a static factory method on the MetastoreEventsProcessor implementation? Doing so puts all the logic for that abstraction in one place. All this method would do here is choose between the dummy and live versions. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1292 PS14, Line 1292: return addDb(dbName, msDb) != null; In master, addDb() returns a DB. This method returns a boolean. Is something amiss here? Is this really an "addOrGetDb"? Do you want the DB object -- either an existing one or a new one? Else, if you call this, get true, then look up the DB, is there a race condition? http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1344 PS14, Line 1344: public Table addTableIfNotExists(String dbName, String tblName, Checked. We do have existing methods called getOrLoadTable(). So, calling this one getOrAddTable might be more consistent. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1345 PS14, Line 1345: Reference dbWasFound, Reference tblWasFound) { This is rather an odd C++ way to do things. A more typical Java way is to declare a class that will return the required values.. Or, you can declare a Pair with TableStatus an enum of {CREATED, ALREADY_EXISTS, DB_NOT_FOUND}. The same enum could be returned for other add-or-get operations. Pair addOrGetTable(String dbName, String tblName) ... http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1359 PS14, Line 1359: } else { The else is not necessary since prior block returned. Not a bug, just not necessary. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1365 PS14, Line 1365: versionLock_.writeLock().unlock(); This is old school. We should change this so we can use try-with-resources: try(LockInstance lock =versionLock.writeLock()) { ... } LockInstance would implement AutoCloseable to make this happen. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1462 PS14, Line 1462: Reference tblWasfound, Reference tblMatched) { Same issue with the references. We already have a removeTable() method which is implemented to do a "remove table if exists". Is there commonality that should be exploited here? http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1792 PS14, Line 1792: versionLock_.writeLock().lock(); Race condition? We got the DB above before the lock. The DB could be invalid now that we have the lock. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1804 PS14, Line 1804: return db.getTable(tblName); The table was created above. We released the lock, then looked up the table again. Race conditions? http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java File fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@52 PS14, Line 52: public static final String ALTER_PARTITION_EVENT_TYPE = "ALTER_PARTITION"; Given that we use the event names not just to translate from Thrift event to handler, perhaps this should be an enum. public EventType { CREATE_TABLE("CREATE_TABLE") ... } Then we can do: if (event == EventType.CREATE_TABLE) ... The enum form is both faster and more
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 14: (5 comments) I'm not planning on doing a full review since Bharath and Paul are on it. The concurrency aspects I looked at are much better in this version. I had some follow on questions about the interaction between the initial load of the catalog and the invalidation - I think you anticipated that to some extent. If it's too much to address in this patch we could consider merging this without addressing that but keeping it behind an experimental feature flag until some more of the concerns have been addressed. http://gerrit.cloudera.org:8080/#/c/12118/14/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/12118/14/be/src/common/global-flags.cc@245 PS14, Line 245: DEFINE_int32(hms_event_polling_frequency_s, 0, Maybe make it hidden or mention that it's experimental in the text here if it's not ready for prime time yet? http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@299 PS14, Line 299:* Initializes Metastore event processor object if This occurs before the initial metadata is loaded in reset(), right? So we might process some events that occurred before we got an initial snapshot of the data. I think this is probably OK if the events are idempotent, but seems worth mentioning in the comment. Actually one question is how this interacts with reset(), either the initial reset() on startup or a later one from "INVALIDATE METADATA". What happens if events start to be processed when the catalog is empty before reset() loads all the data? And should we reset the current event ID when doing an invalidate? I'm wondering if those questions lead us to resetting currentNotificationId in reset() and stopping/pausing the even processing until reset() completes. That would at least reduce the number of possible races. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319 PS14, Line 319: LOG.fatal("Unable to fetch the current notification event id from metastore." This does actually terminate the process - it call to the glog LOG(FATAL) which calls abort(). I think we don't want this since it generates a core dump - Maybe LOG.error() + throwing an exception is the right behaviour. We've had issues before with misconfigured catalogds generating a ton of core dumps. http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1342 PS14, Line 1342: null ? http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@169 PS14, Line 169: // TODO figure out if CatalogD sync with HMS is in-process or completed so that we Ah, I think this is referring to the same kind of issue that I mentioned earlier with reset() running in parallel with this thread. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 14 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 11 Jan 2019 00:03:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1731/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 14 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 08 Jan 2019 02:12:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_frequency_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_frequency_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Note that this feature is still a work in progress and additional improvements will be done in subsequent patches. Be default the feature is turned off. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/MetastoreNotificationException.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java M fe/src/test/resources/postgresql-hive-site.xml.template 10 files changed, 1,585 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/14 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 14 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 13: (24 comments) Firstly thanks for the clean-up. The new class structure makes it much clearer IMO. Publishing a bunch of minor comments. Please bear with my nits. Also, I had a brief chat with Vihang offline. It looks like the event handling races is still a WIP. My general concern was around the following 3 areas. - Races with conflicting local events. - What if Catalog already has the latest HMS state (due to a previous invalidation) - Failure scenarios if some event processing errors out. (Should we stop and invalidate everything, should be abort the thread etc.) Vihang mentioned that he is looking into HMS side versioning for objects to detect if the Catalog server already has the latest state and we can skip applying certain events. That probably requires some Hive side changes. Also we discussed about incorporating some of his thought process into class comments. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@304 PS8, Line 304: : private void initializeMetastoreEventProcessor() { : long eventPollingInterval = BackendConfig.INSTANCE.getHMSPolli > done. Did you mean changing the log level to fatal as well? fatal() IIRC aborts the process. If metastore polling is configured and we are not able to init here, it looks to me like we should abort (instead of silently swallowing) since that can result in correctness issues later and is unexpected from a user POV. Thoughts? http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@292 PS13, Line 292: // start polling for metastore events : if (metastoreEventProcessor_ != null) { : metastoreEventProcessor_.start(); : } Any reason for not moving this to initializeMetastoreEventProcessor()? http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@308 PS13, Line 308: debug nit: Make this info? http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1460 PS13, Line 1460: if (tblToBeRemoved != null) { nit: invert the logic? if (tblToBeRemoved == null) return null; http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java File fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java: http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@38 PS13, Line 38: static abstract class MetastoreEventHandler { Add doc about the thread-safety expectations here. http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@39 PS13, Line 39: CatalogServiceCatalog catalog, Any reason for passing this around? Can't we instantiate this once and apply changes to it? http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@42 PS13, Line 42: protected String dbName; Document what it means to have these set. It looks like the dbName is always set but tblName is optional http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@43 PS13, Line 43: protected String tblName; nit: _ for class members http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@47 PS13, Line 47: protected void validateAndInit(String expectedType, NotificationEvent event) { method doc. http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@49 PS13, Line 49: event.getTableName() wouldn't this be null in CREATED_DB/DROP_DB case? Also, is it worth adding a TRACE log that dumps the event debug data? (Impala allows dynamically switching log levels for debugging). http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@74 PS13, Line 74: static class CreateTableEventHandler extends MetastoreEventHandler { method docs for each event handlers?
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1706/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 13 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 04 Jan 2019 00:43:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_frequency_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_frequency_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/MetastoreNotificationException.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java M fe/src/test/resources/postgresql-hive-site.xml.template 10 files changed, 1,375 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/13 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 13 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1696/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 12 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 02 Jan 2019 18:29:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_frequency_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_frequency_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/MetastoreNotificationException.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java M fe/src/test/resources/postgresql-hive-site.xml.template 10 files changed, 1,367 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/12 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 12 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 11: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/1692/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 29 Dec 2018 02:21:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1691/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Sat, 29 Dec 2018 02:15:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_frequency_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_frequency_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. In order to process each event atomically, this feature relies on version lock in CatalogServiceCatalog. It adds new methods in CatalogServiceCatalog which takes a write lock on version so that readers are blocked until the catalog state is updated based on the events. In case of processing events, the metastore operation is already completed and only catalog state needs to be updated. Hence we do not need to make new metastore calls while processing the events and only version lock is sufficient to serialize updates to the catalog objects based on events. This locking protocol is similar to what is done in case of DDL processing in CatalogOpExecutor except it does not need to take metastoreDdlLock since no metastore operations are needed during event processing. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java A fe/src/main/java/org/apache/impala/catalog/MetastoreNotificationException.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java M fe/src/test/resources/postgresql-hive-site.xml.template 10 files changed, 1,350 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/11 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Vihang Karajgaonkar has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_frequency_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_frequency_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java M fe/src/test/resources/postgresql-hive-site.xml.template 8 files changed, 1,280 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/10 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 9: (20 comments) Remaining comments, mostly about the tests. High level, seems the open questions are: * That bit about replacing the guts of the DB object. * Error handling and reporting * Whether the event handler should attempt to partially load DBs and tables, or just do an invalidate. * Invalidation of specific partitions rather than all of them. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@64 PS9, Line 64: private static MetastoreEventsProcessor eventsProcessor; Not thrilled with the convention, but the team likes an underscore after each field name. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@109 PS9, Line 109: assertNotNull(catalog.getDb(TEST_DB_NAME)); Per prior review comments, should not add the new DB. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@120 PS9, Line 120: createDatabase("database_to_be_dropped"); If we don't add the DB automatically, should reference it and assert it is in the cache. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@122 PS9, Line 122: assertTrue(catalog.getDb("database_to_be_dropped") != null); assertNotNull(...) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@125 PS9, Line 125: assertTrue("Database should not be found after processing drop_database event", assertNull(...) Oddly, later tests do use assertNull(). So, modify the code here, and the other cases where assertTrue is used to check for null/not null. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@136 PS9, Line 136: @Test(expected = DatabaseNotFoundException.class) This is confusing. If we expect an exception, then the exception terminates the execution. Logically, it should come from the last statement in the body. But, the last statement is a Unit assertion. So, something is strange. Often easier to do this: try { doSomethingThatShoudFail(); fail(); } catch (ExpectedException e) { // Expected } http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@139 PS9, Line 139: final String TBL_TO_BE_DROPPED = "tbl_to_be_dropped"; Looks like a constant. Move outside the method and make static. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@148 PS9, Line 148: loadTable(TBL_TO_BE_DROPPED); Does this assert that the table exists in HMS, was loaded in the cache, and has the expected contents? Else, when we drop it, we can't tell if the drop code is broken if we don't know if the table was actually in the cache. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@155 PS9, Line 155: // throws DatabaseNotFoundException If it throws, we won't get to the assertTrue. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@161 PS9, Line 161: @Ignore("Disabled until we fix Hive bug to deserialize alter_database event messages") Reference HIVE JIRA ticket number. Also, should reference this bug in the event processor code. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@217 PS9, Line 217: createTable(TEST_TABLE_NAME_NONPARTITIONED, false); Start by asserting that the table is not yet in the cache? http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@255 PS9, Line 255: List> partVals = new ArrayList<>(1); Allocating just 1 member up front, then adding four items. Maybe just omit the initial size arg; not helping us much here. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@277 PS9, Line 277: // change some partitions There is a recent fix that does specialized calls to HDFS to check for updated vs. new partitions. It will be tricky to combine that test with this one, but we should ensure that the automated way of updating a partition uses the call
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/Db.java File fe/src/main/java/org/apache/impala/catalog/Db.java: http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/Db.java@112 PS8, Line 112: synchronized (thriftDb_) { thriftDb_.setMetastore_db(msDb); } > I see your point. Initially, I wanted to model the changes to metastore db The locking scheme does require the cooperation of multiple classes so method access modifiers aren't likely to be sufficient to prevent other classes from doing bad things (it's good if we make fewer things public for sure though). I think the precedent to look at here are the various alterDatabase operations in CatalogOpExecutor. Those follow the protocol described in the CatalogOpExecutor class comment. In those cases metastoreDdlLock_ is the lock that's preventing concurrent updates to thriftDb_. I was trying to understand if anything prevents concurrent reads + writes to thriftDb_ though and I don't think there currently is anything, so it seems like some anomalies could result from that. There's also some logic in CatalogOpExecutor that mutates thriftDb_ in-place. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 27 Dec 2018 20:11:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 9: (39 comments) This will be a great feature. Detailed review from the perspective of 1) synchronization and 2) diagnosing issues when they occur in the field. This is a partial review, will review remaining files separately. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@302 PS9, Line 302: if (BackendConfig.INSTANCE.getHMSPollingFrequencyInSeconds() <= 0) { Good check. Better to define this as behavior: to disable the polling, set the interval to 0 (or, trivially <0). This would be a good fall-back safety-valve for support cases, tests, etc. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@313 PS9, Line 313: } catch (TException e) { Nice use of try-with-catch. No need for nested tries. Remove inner catch, move catch to outer try. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2317 PS9, Line 2317: MetastoreEventsProcessor getMetastoreEventProcessor() { Should this be public? If only protected, please explicitly label it protected. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java File fe/src/main/java/org/apache/impala/catalog/Db.java: http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@112 PS9, Line 112: synchronized (thriftDb_) { thriftDb_.setMetastore_db(msDb); } What is the synchronization model? As written, we lock on the prior DB to set the new one. This means access is not synchronized. Better to add synchronized to the method itself, here and accessor. But, more broadly, if we synchronize only at the DB level, we'll have all manner of race conditions. One client will grab the DB with one version the thrift object, then another will be slotted in underneath. This probably needs more thought. Is the model documented in the write-up? http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@159 PS9, Line 159: synchronized (thriftDb_) { return thriftDb_.deepCopy(); } How often do we do this? Do we want this to block on an HMS call about parameters? Maybe a better model is for the RPC calls to do this: - Obtain a lock - Get a pointer to the current thrift object - Release the lock - Make the RPC Not that there is no need to copy the thrift object above: the logic here implies that the Thrift object is immutable once set. All we are doing is making sure we use the same object throughout the RPC. Still, not convinced we have the right locking semantics. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@164 PS9, Line 164: synchronized (thriftDb_) { return thriftDb_.getDb_name(); } Now getting the name is synchronized on an HMS call. That can't be good for performance. The planner/analyzer gets the name all the time, so we really don't want to do this. The name should be invariant or all heck breaks loose in the planner. So, if we swap out Thrift objects, pull the name outside the thrift object and cache it in a final field. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@50 PS9, Line 50: * in Hive metastore using the the public API get_next_notification These events could Impala generally does not use Javadoc markup, sadly. But, if you want to use it (we really should), then a reference to a Java method should be: {@link Class#method}. http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@63 PS9, Line 63: * CREATE_TABLE, CREATE_DATABASE Wrong form for HTML lists. I think you want a definition list: CREATE_TABLE, CREATE_DATABASE A new table/database is ... ... That sad, if Vuk was here, he'd have you rip all this out. So, a broader question is: do we want our Javadoc to follow C++ doc standard so the C++ folks can follow it, or Javadoc standard so the documentation looks correct when formatted? I prefer the Javadoc approach, so would encourage you to keep the (corrected) formatting.
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1677/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 27 Dec 2018 01:26:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
vih...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 9: (34 comments) Latest patch works through some of the comments. Still working through the rest of the comments http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG@7 PS8, Line 7: IMPALA-7970 : Add support for metastore event based automatic invalidate > nit: wrap lines to less than 72 chars. done http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG@21 PS8, Line 21: The change also adds a new test class to test the basic functionality > Is the self-event detection coming in a separate patch? Yes, self-event detection is still a work in progress and would be done as part of IMPALA-7972 http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@246 PS8, Line 246: invalidate cached tab > This only invalidates right? (refresh has a different meaning in Impala's c Right now, its only invalidate but I plan to make use of refresh in IMPALA-7973. I will change it to invalidate since that is the current state of the feature. http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@247 PS8, Line 247: These metastore events could > nit: rephrase to ...cached table metadata based on... done http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@251 PS8, Line 251: \ > ..has.. done http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@253 PS8, Line 253: "creation or removal of objects from metastore, catalogd adds or removes such " > Since this is a user-facing message, we should probably clarify that this i The first line of the message mentions metastore events. Do you think that is not clear to a user? Added one line which describes how the metastore events could be generated from external systems like Apache Hive or another instance of Impala cluster talking with the same metastore. http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@260 PS8, Line 260: > nit: format to fewer lines like others. I had formatted it like the others, but the clang-format-diff script which I use to format the patch doesn't seem to like it. Any idea how can fix it automatically? http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@258 PS8, Line 258: r the old generation to be almost " : "full and trigger an invalidation on recently unused tables"); : > Looks like this is already mentioned above. Also, is it needed on the coord removed the redundant line. Also, yes, not need on coordinators as of now. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@229 PS8, Line 229: > Add a oneliner comment. Also, we use an "_" prefix for class variables. foo done http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@292 PS8, Line 292: > nit: sounds vague, rephrase? added some more meat to this. Please suggest improvements if you think something more is needed in the text. Thanks! http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@292 PS8, Line 292: : /** :* Initializes Metastore event > I think we can get rid of the implementation detail here, remove? Good point. Added check and return early logic. http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@304 PS8, Line 304:+ "interval is %d", eventPollingInterval)); : return; : } > Isn't this a fatal error? (Include the stack trace in the error log) done. Did you mean changing the log level to fatal as well? http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/Db.java File fe/src/main/java/org/apache/impala/catalog/Db.java: http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/Db.java@112 PS8, Line 112: synchronized (thriftDb_) { thriftDb_.setMetastore_db(msDb); } > This additional ad-hoc locking without explanation of how it integrates wit I see your point. Initially, I wanted to model the changes to metastore db object similar to how functions are added to its parameters. I noticed that Db class has public method to add a function which can modify the state of the metastoredb object
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
vih...@cloudera.com has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. IMPALA-7970 : Add support for metastore event based automatic invalidate This change adds support to CatalogD to poll metastore events to issue invalidate on tables automatically. It adds basic infrastructure to poll HMS notifications events at a configurable frequency using a backend config called hms_event_polling_frequency_s flag. Currently, it issues invalidate at tables when it received alter events on table and partitions. It also adds tables/databases and removes tables from catalogD when it receives create_table/create_database and drop_table/drop_database events. The default value of hms_event_polling_frequency_s is 0 which disables the feature. A non-zero value in seconds of this configuration can be used to enable the feature and set the polling frequency. The change also adds a new test class to test the basic functionality for each of the event type which is supported. Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java M fe/src/test/resources/postgresql-hive-site.xml.template 9 files changed, 1,050 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/9 -- To view, visit http://gerrit.cloudera.org:8080/12118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7 Gerrit-Change-Number: 12118 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12118 ) Change subject: IMPALA-7970 : Add support for metastore event based automatic invalidate .. Patch Set 9: (22 comments) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@294 PS9, Line 294:* Initializes Metastore event processor object if line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@46 PS9, Line 46: * org.apache.hadoop.hive.metastore.api.NotificationEvent. Metastore can be configured, line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@47 PS9, Line 47: * to work with Listeners which are called on various DDL operations like create/alter/drop line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@48 PS9, Line 48: * operations on database, table, partition etc. Each event has a unique incremental id and the line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@49 PS9, Line 49: * generated events are be fetched from Metastore to get incremental updates to the metadata stored line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@50 PS9, Line 50: * in Hive metastore using the the public API get_next_notification These events could line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@51 PS9, Line 51: * be generated by external Metastore clients like Apache Hive or Apache Spark as well as other line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@54 PS9, Line 54: * This class is used to poll metastore for new events at a given frequency. By applying such events line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@55 PS9, Line 55: * on the catalogD we can sync to external metadata operations by taking appropriate actions for line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@56 PS9, Line 56: * each event type. We keep track of the last synced event id in each polling iteration so the next line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@64 PS9, Line 64: * A new table/database is created in Catalog respectively. The newly created table/database are line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@69 PS9, Line 69: * In case of alter table event, currently the code issues a invalidate table command. There is a line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@70 PS9, Line 70: * special case of this event in case of renames, where the old table is removed and a new line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@73 PS9, Line 73: * In case of alter database events, currently only the case of changing default location, owner line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@76 PS9, Line 76: * Currently, in case of these events, we issue invalidate on the table. This can be optimized by line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@162 PS9, Line 162: // fetch the current notification event id. We assume that the polling interval is small line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@163 PS9, Line 163: // enough that most of these polling operations result in zero new events. In such a case,