[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-30 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 27:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1913/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 27
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 30 Jan 2019 17:50:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Reviewed-on: http://gerrit.cloudera.org:8080/12118
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationFetchException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
14 files changed, 2,553 insertions(+), 3 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 30
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 29: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 29
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 30 Jan 2019 03:32:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 29:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3681/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 29
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 29 Jan 2019 21:27:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 29: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 29
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 29 Jan 2019 21:27:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 28: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 28
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 29 Jan 2019 21:26:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-29 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#28). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationFetchException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
14 files changed, 2,553 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/28
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 28
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 27: Code-Review+2

Thanks for patiently addressing all the review comments. Nice first 
contribution to the project :-). I can kick-off a GVO once you fix the line 
overflows.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 27
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 29 Jan 2019 18:19:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 27:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@448
PS27, Line 448: + "invalidated since it does not exist 
anymore", getFullyQualifiedTblName()));
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@464
PS27, Line 464: Pair result = 
catalog_.renameOrAddTableIfNotExists(oldTTableName,
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12118/27/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@758
PS27, Line 758:   private void createDatabaseFromImpala(String dbName, String 
desc) throws ImpalaException {
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 27
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 29 Jan 2019 01:34:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-28 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#27). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationFetchException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
14 files changed, 2,551 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/27
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 27
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-28 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 25:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@308
PS25, Line 308:   LOG.info(String.format("Metastore event processing is 
disabled. Event polling "
> nit: fix formatting.
Done


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@145
PS25, Line 145: no
> not
Done


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@166
PS25, Line 166: drop event later
> Do we also need to remove the drop event or are we just letting it fail sil
drop events are implemented as dropIfExists. So since the create event is 
skipped, the drop event processing becomes a no-op. I wanted to remove the drop 
event as well, but it was complicating this code here without any added 
benefit. I will add a comment here saying why its ok to not remove the drop 
event so that its clear in the future.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@174
PS25, Line 174: fromIndex++;
> Log how many events got filtered out? Helps diagnosing incase there are any
Good point. Also, added logging in the isRemovedAfter impl for createTable and 
createDatabase events.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@356
PS25, Line 356: if (this.dbName_.equalsIgnoreCase(dropTableEvent.dbName_) && 
this.tblName_
  :   .equalsIgnoreCase(dropTableEvent.tblName_)) {
  : return true;
  :   }
> nit: simplify to return dbName_.equals() && tblName_.equals();
Actually, added a log statement in the if block so can't do this anymore.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@360
PS25, Line 360: else if 
(event.eventType_.equals(MetastoreEventType.ALTER_TABLE)) {
  :   // if this create table was renamed to some other 
name in the future return true
  :   AlterTableEvent alterTableEvent = (AlterTableEvent) 
event;
  :   if (alterTableEvent.isRename_ && this.dbName_
  :   
.equalsIgnoreCase(alterTableEvent.tableBefore_.getDbName()) && this.tblName_
  :   
.equalsIgnoreCase(alterTableEvent.tableBefore_.getTableName())) {
  : return true;
  :   }
> clarify why rename qualifies as an inverse op for create?
Done


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@295
PS25, Line 295: LOG.info(String
  :   .format("Received %d events. Start event id : %d", 
response.getEvents().size(),
  :   lastSyncedEventId_));
> nit: format to fewer lines.
this statement cannot be converted to a 2 line statement without increasing the 
line width beyond 100


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@319
PS25, Line 319:   LOG.warn(String.format(
  : "Not processing notification events since event 
processing status "
  : + "is %s. Last synced event id is %d", 
currentStatus,
  : lastSyncedEventId_));
> nit: format to fewer lines.
Reduced the text verbosity to make it to 2 lines.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@339
PS25, Line 339:
  : // remove some partitions
  : // change some partitions
> ?
had kept a reminder for myself to add 

[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-28 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 22:

(5 comments)

I forgot to publish these comments earlier. Publishing it here again just for 
the record.

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@383
PS20, Line 383:
> Nit:
Done


http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@360
PS22, Line 360:   throw new MetastoreNotificationException(debugString(
> To follow up on previous comments about errors. We may have the user busily
Thats a fair point. Let me add more details on what can go wrong during this 
event processing.
I would analyze what could be state of catalog at the event processing time 
based on the diagram I have in the event processor class.

Case I:
Catalog state is stale with respect to this event.
  a. It knows of a table with oldTblName but doesn't know about newTblName

This maps to the regular renameTable by remove oldTbl and add newTbl.

Case II:
Catalog state is exactly at the same state as of event.
 a. This means that catalog doesn't know of oldTbl. It knows about newTbl.

In this case also renameTable fails since it cannot find oldTbl. However, we 
can continue event processing. Basically, no action would be taken by this 
event and it needs to be ignored.

Case III:
Catalog state is ahead of event. This could mean a number of things.
 a. simple case is nothing exotic happens. OldTbl doesn't exist, newTbl exists 
--> case II

 b. user did a reverse rename from newtbl -> oldTbl post event. But we don't 
have any mechanism currently to distinguish this state from Case I without the 
version number support.

 c. User dropped the newTbl (or renamed it to something else). In this case, 
event processor will neither see a oldTbl or newTbl in the cache. It looks like 
there would be slight window of time where newTbl will appear until the actual 
event of the newTbl is processed and it is dropped (or renamed).

Given these various possibilities, I think the right thing to do is
Atomic rename by drop-and-add
  1. remove oldTbl if it exists.
  2. add newTbl if it doesn't exist. To compare a existing tbl we should use 
the creationTime and version.


http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@515
PS22, Line 515: LOG.info(debugString("Successfully removed database 
%s", dbName_));
> Briefly looked at these other error handling blocks. The exception/logging
Thanks for taking a relook


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@334
PS21, Line 334:   lastSyncedEventId_ = event.getEventId();
  : }
  :   }
  : } catch (MetastoreNotificationException | CatalogException 
ex) {
  :   updateStatus(EventProcessorStatus.E
> This blocks invalidate metadata since both of them try to get a writelock a
changed the locking logic to use simple synchronized blocks. The code becomes a 
little more ugly but reduces the total blocking time for reset() as low as 
possible.


http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@258
PS22, Line 258:   public synchronized void start(long fromEventId) {
> This is synchronized, which suggests that we might get multiple concurrent
Both the start() and stop() methods do a Preconditions.checkState(). In this 
particular start method the check is on line 261. I will move it to the first 
couple of lines to improve readability.



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: 

[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-28 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 25:

(12 comments)

lgtm. Final round of comments from my side. Given the state the patch is in, I 
think we should get the patch into the master branch which makes it easier for 
concurrency testing. Also since all of this code is behind a feature flag, it 
should not affect any existing deployments.

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@308
PS25, Line 308:   LOG.info(String.format("Metastore event processing is 
disabled. Event polling "
nit: fix formatting.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1833
PS25, Line 1833: if (db == null) return null;
   :   if (!db.containsTable(tblName)) return null
nit: merge into a single stmt.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@145
PS25, Line 145: no
not


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@149
PS25, Line 149:  * removed a table, the create event received will try to 
add the object again.
This probably needs some thought during concurrency testing.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@166
PS25, Line 166: drop event later
Do we also need to remove the drop event or are we just letting it fail 
silently?


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@174
PS25, Line 174: fromIndex++;
Log how many events got filtered out? Helps diagnosing incase there are any 
bugs in this area.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@356
PS25, Line 356: if (this.dbName_.equalsIgnoreCase(dropTableEvent.dbName_) && 
this.tblName_
  :   .equalsIgnoreCase(dropTableEvent.tblName_)) {
  : return true;
  :   }
nit: simplify to return dbName_.equals() && tblName_.equals();


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@360
PS25, Line 360: else if 
(event.eventType_.equals(MetastoreEventType.ALTER_TABLE)) {
  :   // if this create table was renamed to some other 
name in the future return true
  :   AlterTableEvent alterTableEvent = (AlterTableEvent) 
event;
  :   if (alterTableEvent.isRename_ && this.dbName_
  :   
.equalsIgnoreCase(alterTableEvent.tableBefore_.getDbName()) && this.tblName_
  :   
.equalsIgnoreCase(alterTableEvent.tableBefore_.getTableName())) {
  : return true;
  :   }
clarify why rename qualifies as an inverse op for create?


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@295
PS25, Line 295: LOG.info(String
  :   .format("Received %d events. Start event id : %d", 
response.getEvents().size(),
  :   lastSyncedEventId_));
nit: format to fewer lines.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@319
PS25, Line 319:   LOG.warn(String.format(
  : "Not processing notification events since event 
processing status "
  : + "is %s. Last synced event id is %d", 
currentStatus,
  : lastSyncedEventId_));
nit: format to fewer lines.


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 25:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1897/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 25
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 26 Jan 2019 02:17:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 25:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@426
PS25, Line 426: + "invalidated since it does not exist 
anymore", getFullyQualifiedTblName()));
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@481
PS25, Line 481:   if (eventsProcessor_.getStatus() != 
EventProcessorStatus.ACTIVE) eventsProcessor_.start();
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@725
PS25, Line 725:   private void createDatabaseFromImpala(String dbName, String 
desc) throws ImpalaException {
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@751
PS25, Line 751:   private void createTableFromImpala(String tblName, boolean 
isPartitioned) throws ImpalaException {
line too long (100 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 25
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 26 Jan 2019 01:41:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-25 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#25). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationFetchException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
14 files changed, 2,458 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/25
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 25
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-23 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 22:

(3 comments)

Thanks much for your great work in getting this in and helping us anticipate 
all the ways that things might go wrong in a production system. Thanks much 
also for discussing the code review comments in person.

This review has just a few minor follow-on comments. After that, I suspect 
we'll be in good shape to merge the code: we are reaching the point of 
diminishing returns in reviews; the next step is to try the feature out in a 
live cluster, and for that you'll need the code merged. A feature flag lets us 
turn off the feature if we encounter anything missed during reviews.

http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@360
PS22, Line 360:   throw new MetastoreNotificationException(debugString(
To follow up on previous comments about errors. We may have the user busily 
hammering the system with INVALIDATEs and queries, each of which may mutate the 
cache ahead of the event processing here. If we encounter such a case, we 
should soldier on rather than stopping event processing.

So, the case that the old table does not exist, is that an expected error if 
the user issues an INVALIDATE? Or, does it mean that we have a likely bug and 
we should stop event processing?

The case of the missing new table is different: it would seem that there would 
be no good reason not to add the new table.

But, I wonder, does the renameTable() method handle the case in which the 
rename was already done? Something like:

* Rename table in HMS.
* Issue a query against the new name, causing the catalog to load metadata for 
the new table.
* Get the rename event.
* Find the old table and remove it, but be unable to add a new table since a 
table of that name was already loaded.


http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@515
PS22, Line 515: LOG.info(debugString("Successfully removed database 
%s", dbName_));
Briefly looked at these other error handling blocks. The exception/logging 
decisions look right on each of them.


http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@258
PS22, Line 258:   public synchronized void start(long fromEventId) {
This is synchronized, which suggests that we might get multiple concurrent 
start/stop calls from different threads.

Should the start() (and stop()) methods check if they are in the proper state 
before proceeding? That is, should this method check if we are already in the 
ACTIVE state and simply return if so?



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 22
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 23 Jan 2019 21:18:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1847/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 22
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 22 Jan 2019 22:41:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 22:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@337
PS22, Line 337: + "invalidated since it does not exist 
anymore", getFullyQualifiedTblName()));
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12118/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@443
PS22, Line 443:   if (eventsProcessor_.getStatus() != 
EventProcessorStatus.ACTIVE) eventsProcessor_.start();
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 22
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 22 Jan 2019 21:47:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-22 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#22). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
13 files changed, 1,974 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/22
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 22
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-18 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 20:

(36 comments)

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307
PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds()
> eventPollingInterval
Done


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319
PS21, Line 319:  LOG.error("Unable to fetch the current notification event id 
from metastore."
  :   + "Metastore event processing will be disabled.",
  :   e);
> nit: Multiple places in the code, try to format in fewer lines. Something l
I used the command given in 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536 as 
advised by some of the other members of the team. Unfortunately, it doesn't 
like some of these lines.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@324
PS21, Line 324:   e);
> Will the code that handles this exception write to the log both the message
The LOG.error above logs the trace as well since it takes in the underlying 
cause exception as the second argument. The exception is uncaught in this 
particular case on the caller side, since we want CatalogD to not come up if it 
is configured to using event processing but for some reasons isn't able to 
instantiate one.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206
PS21, Line 1206: // Even though we get the current notification event id 
before stopping the event
> clarification: reset() is equivalent to invalidating everything and fetchin
This is similar to Tim's comment and our discussion. The line 1216 gets the 
currentEventID (latest) from HMS. And the event processing begins from that id. 
So the event processing jumps to the latest event id once the reset() is 
complete. There is still a slight chance that we will reprocess some of the 
events which are generated during reset() execution. It is better than missing 
those events which would lead to inconsistent state between catalog and HMS as 
far as event processing is concerned. The comment gives details of that case.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1216
PS21, Line 1216: long currentEventId = 
metastoreEventProcessor_.getCurrentEventId();
> Should this be here? Or in the event processor itself? If here, why is it n
It needs to be here to avoid the race condition which can lead to possible 
missed events. We want to get the latestEventId before triggering the reset() 
so that we can then restart the event processing after reset from this eventId. 
If we move the code in the start() method, there is a chance that we will miss 
processing some of the events which happen during after reset is complete, but 
before we started the event processing. See Tim's comment in the earlier review 
comments which gives a nice example.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1360
PS21, Line 1360:   Reference dbWasFound, Reference 
tblWasFound) {
> I still think that using the reference is non-standard Java. A simple solut
used the second option of return boolean and throw exception when db is not 
found.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1363
PS21, Line 1363: Db db = getDb(dbName);
> Let's think about this. We return the flags because we want to know if the
if getting db before acquiring the lock has a race then it looks like addTable 
and renameTable also has a race. I will create separate JIRAs for fixing them


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372
PS21, Line 1372: tblWasFound.setRef(true);
> I wonder, do we really need to know if the table existed or not? Is it even
consider a case where user do create, drop, create with the same table name 
from Impala. In case of if the create event from the first create statement, 
the table presented in the event is stale and should not be used to add to the 
catalog. I think it is useful to log this information to make sure that the 
create event was ignored in such a case.



[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-18 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 21:

(14 comments)

Patch code flow and logic lgtm generally. Just had some nits/clarifications. 
Publishing these while I'm looking at the tests.

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@307
PS21, Line 307: BackendConfig.INSTANCE.getHMSPollingIntervalInSeconds()
eventPollingInterval


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319
PS21, Line 319:  LOG.error("Unable to fetch the current notification event id 
from metastore."
  :   + "Metastore event processing will be disabled.",
  :   e);
nit: Multiple places in the code, try to format in fewer lines. Something like

LOG.error("..."
+ ".", e)


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1206
PS21, Line 1206: // Even though we get the current notification event id 
before stopping the event
clarification: reset() is equivalent to invalidating everything and fetching 
from scratch from HMS. In that case, do we still restart the event processing 
from where we left off (before reset()) or can we start with the latest event 
ID from HMS?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1804
PS21, Line 1804: Table incompleteTable = 
IncompleteTable.createUninitializedTable(db, tblName);
move it after L1807?


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@195
PS13, Line 195:
> curious to understand why do you think this method has a race?
nvm missed the synchronized part.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@155
PS21, Line 155: event %d of type %s on table %s
I think it is cleaner to define a debugString(NotificationEvent event) {} that 
formats it cleanly and consistently across all invocations. We can use it to 
dump event state in multiple places.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@160
PS21, Line 160: if (!event.getEventType().equalsIgnoreCase("CREATE_DATABASE")
  :   && 
!event.getEventType().equalsIgnoreCase("DROP_DATABASE")) {
  : tblName_ = 
Preconditions.checkNotNull(event.getTableName());
  :   } else {
  : tblName_ = null;
  :   }
nit: Reformat to the following? (fewer branches and use enum)

tblName_ = null;
if (eventType_ == MetaStoreEventType.CREATE_TABLE) {
  tblName_ = ...
}


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@181
PS21, Line 181: tblName_
checkNotNull()?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@198
PS21, Line 198:private final String dbName_;
  : private final String tblName_;
aren't these set in the parent c'tor?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@347
PS21, Line 347: droppedTable_
isn't this more like tblTobeDropped_ ?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@434
PS21, Line 434: //
nit: remove, you are already in a comment block.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@310
PS21, Line 310: if (currentEventId <= lastSyncedEventId_) {
  : return Collections.emptyList();
  :   }
nit: single line



[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 21:

(32 comments)

Very nice improvements: the event classes greatly simplify the logic.

This round focused a bit on error reporting and logging. Many specific 
comments. As always, these are just questions and suggestions, you should 
decide which make sense.

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@324
PS21, Line 324:   e);
Will the code that handles this exception write to the log both the message 
from this exception AND the message from the "cause" exception?

The goal is that, if a failure occurs, we get the detailed error message as 
well as the "summary" message.

If we log the stack trace, then we're good.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1216
PS21, Line 1216: long currentEventId = 
metastoreEventProcessor_.getCurrentEventId();
Should this be here? Or in the event processor itself? If here, why is it 
needed other than to pass it along to the event processor?

Best not to leak implementation details if possible.


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1360
PS21, Line 1360:   Reference dbWasFound, Reference 
tblWasFound) {
I still think that using the reference is non-standard Java. A simple solution 
is:

public Pair addTableIfNotExists(String dbName, String tblName) 
...

Better would be:

enum TableStatus (DB_NOT_FOUND, TBL_NOT_FOUND, OK)

public TableStatus addTableIfNotExists(String dbName, String tblName) ...

Or, suppose we do this, since this method currently returns void:

* Throw an exception if the DB does not exist.
* Return true/false to indicate if the table was created (true) or already 
existed (false).


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1363
PS21, Line 1363: Db db = getDb(dbName);
Let's think about this. We return the flags because we want to know if the DB 
was found or not. We could just as easily do:

DB db = catalog.get(dbName_);
if (db == null) { handle error; }

The reason we don't do the above is we want to prevent race conditions. But... 
We do the check here outside the lock. So, we have a race condition anyway.

Should either 1) the DB lookup be done by the caller to simplify the code, or 
2) be done in the lock to prevent race conditions?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372
PS21, Line 1372: tblWasFound.setRef(true);
I wonder, do we really need to know if the table existed or not? Is it even 
useful? Consider.

* Impala creates a table. It updates its own cache and informs HMS.
* HMS sends a notification. In processing the notification, we determine that 
the table already exists.

Or

* A table is added in HMS.
* The user immediately kicks of a query in Impala, causing a load of metadata.
* Impala polls for notifications, learns that the table is new, tries to add 
it, and finds that the table already exists.

In this cases, do we get anything by logging that the table exists? If not, do 
we need to return the exists-or-not result?


http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1475
PS21, Line 1475:   Reference tblWasfound, Reference 
tblMatched) {
Same comments as above.


http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@130
PS20, Line 130:   public static abstract class MetastoreEvent {
The new event class hierarchy looks very good.

One nit: some events are DB-level events, others are table-level events. This 
base class includes the table-level stuff and the reader (and code) must know 
what table stuff is available and when it is not.

So, a further small improvement would be to create another abstract class, 
MetastoreTableEvent, to hold the common table stuff. The table class extends 
this one. DB-level events also extend this one.


http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@138
PS20, Line 138: protected final Logger LOG = 
Logger.getLogger(this.getClass());
protected 

[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 21:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1812/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 21
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 18 Jan 2019 01:47:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 21:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/21/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@364
PS21, Line 364: + "table event. Check if %s is set to true 
in metastore configuration",
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 21
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 18 Jan 2019 01:11:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@33
PS20, Line 33:* Get the current event id. Useful for restarting the event 
processing from a event id
> Maybe " on the HMS". Just to be clear that it's not the last event ID proce
Done



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 20
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 18 Jan 2019 01:11:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#21). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
13 files changed, 1,959 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/21
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 21
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 20:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1809/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 20
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 18 Jan 2019 00:47:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 20: Code-Review+1

(1 comment)

Thanks for addressing my comment. The aspects I looked at are good (I haven't 
reviewed the full thing)

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@33
PS20, Line 33:* Get the current event id. Useful for restarting the event 
processing from a event id
Maybe " on the HMS". Just to be clear that it's not the last event ID 
processed. Ok to ignore.



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 20
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 18 Jan 2019 00:33:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@269
PS19, Line 269: }
> I think fetching the event ID here can lead to a race where we miss a notif
Thanks for catching this. Fixed it the latest patch set. Although as discussed, 
there is still going to be small window of time where some of the events will 
be reprocessed. I have added a comment in CatalogServiceCatalog giving details 
about that case.



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 20
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 18 Jan 2019 00:17:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/20/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@364
PS20, Line 364: + "table event. Check if %s is set to true 
in metastore configuration",
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 20
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 18 Jan 2019 00:13:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#20). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
13 files changed, 1,958 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/20
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 20
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 19:

(1 comment)

Thanks for addressing my comments. I think there might be one (probably rare) 
race that is possible.

http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@269
PS19, Line 269:   lastSyncedEventId_ =
I think fetching the event ID here can lead to a race where we miss a 
notification, e.g.:

1. CatalogServiceCatalog.reset() is called
2. CatalogServiceCatalog.reset() loads DB and table metadata, which does not 
include table X.
3. Table X is added to the HMS, e.g. via Hive
4. This function fetches the last event ID
5. The processor starts processing events, but doesn't see the table X create 
event.

I think the solution is to fetch the last event ID earlier in 
CatalogServiceCatalog.reset() and pass it into this function.



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 19
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 17 Jan 2019 22:22:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 19:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1806/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 19
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 17 Jan 2019 21:42:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 18:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1805/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 18
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 17 Jan 2019 21:28:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/18/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/18/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@364
PS18, Line 364: + "table event. Check if %s is set to true 
in metastore configuration",
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 18
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 17 Jan 2019 21:06:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@364
PS19, Line 364: + "table event. Check if %s is set to true 
in metastore configuration",
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 19
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 17 Jan 2019 21:10:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#19). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
13 files changed, 1,926 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/19
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 19
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#18). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
13 files changed, 1,926 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/18
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 18
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1792/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 17
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 16 Jan 2019 01:34:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1791/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 16
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 16 Jan 2019 01:05:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/17/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/17/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@345
PS17, Line 345: + "table event. Check if %s is set to true 
in metastore configuration",
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 17
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 16 Jan 2019 00:41:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#16). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
13 files changed, 1,902 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/16
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 16
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#17). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
13 files changed, 1,900 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/17
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 17
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 16:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12118/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@20
PS16, Line 20: import static 
org.apache.impala.catalog.events.MetastoreEventsProcessor.METASTORE_NOTIFICATIONS_ADD_THRIFT_OBJECTS_CONFIG_KEY;
line too long (126 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/16/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@347
PS16, Line 347: + "table event. Check if %s is set to true 
in metastore configuration",
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 16
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 16 Jan 2019 00:34:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 15:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1790/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 15
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 16 Jan 2019 00:14:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/15/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/15/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java@345
PS15, Line 345: + "table event. Check if %s is set to true 
in metastore configuration",
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 15
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 15 Jan 2019 23:41:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-15 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#15). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_interval_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_interval_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. By default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventUtils.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreNotificationException.java
A fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/postgresql-hive-site.xml.template
13 files changed, 1,886 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/15
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 15
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-10 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 14:

(21 comments)

Nice improvements. Comments mostly relate to ways to simplify the processing 
logic. Feel free to ping me directly to discuss.

http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@293
PS14, Line 293: if (metastoreEventProcessor_ != null) {
The var != null pattern is error prone. Would be great to have the 
MetastoreEventsProcessor be an interface with two implementations: a null 
implementation (disabled) and a live implementation. Then, you just call the 
various methods without a check for null or not.


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@325
PS14, Line 325: }
Can this move to a static factory method on the MetastoreEventsProcessor 
implementation? Doing so puts all the logic for that abstraction in one place. 
All this method would do here is choose between the dummy and live versions.


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1292
PS14, Line 1292: return addDb(dbName, msDb) != null;
In master, addDb() returns a DB. This method returns a boolean. Is something 
amiss here?

Is this really an "addOrGetDb"? Do you want the DB object -- either an existing 
one or a new one? Else, if you call this, get true, then look up the DB, is 
there a race condition?


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1344
PS14, Line 1344:   public Table addTableIfNotExists(String dbName, String 
tblName,
Checked. We do have existing methods called getOrLoadTable(). So, calling this 
one getOrAddTable might be more consistent.


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1345
PS14, Line 1345:   Reference dbWasFound, Reference 
tblWasFound) {
This is rather an odd C++ way to do things. A more typical Java way is to 
declare a class that will return the required values.. Or, you can declare a 
Pair with TableStatus an enum of {CREATED, ALREADY_EXISTS, 
DB_NOT_FOUND}. The same enum could be returned for other add-or-get operations.

Pair addOrGetTable(String dbName, String tblName) ...


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1359
PS14, Line 1359:   } else {
The else is not necessary since prior block returned. Not a bug, just not 
necessary.


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1365
PS14, Line 1365:   versionLock_.writeLock().unlock();
This is old school. We should change this so we can use try-with-resources:

try(LockInstance lock =versionLock.writeLock()) {
   ...
}

LockInstance would implement AutoCloseable to make this happen.


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1462
PS14, Line 1462:   Reference tblWasfound, Reference 
tblMatched) {
Same issue with the references.

We already have a removeTable() method which is implemented to do a "remove 
table if exists". Is there commonality that should be exploited here?


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1792
PS14, Line 1792: versionLock_.writeLock().lock();
Race condition? We got the DB above before the lock. The DB could be invalid 
now that we have the lock.


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1804
PS14, Line 1804: return db.getTable(tblName);
The table was created above. We released the lock, then looked up the table 
again. Race conditions?


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java
File fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@52
PS14, Line 52:   public static final String ALTER_PARTITION_EVENT_TYPE = 
"ALTER_PARTITION";
Given that we use the event names not just to translate from Thrift event to 
handler, perhaps this should be an enum.

public EventType {
  CREATE_TABLE("CREATE_TABLE") ...
}

Then we can do:

  if (event == EventType.CREATE_TABLE) ...

The enum form is both faster and more 

[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 14:

(5 comments)

I'm not planning on doing a full review since Bharath and Paul are on it. The 
concurrency aspects I looked at are much better in this version. I had some 
follow on questions about the interaction between the initial load of the 
catalog and the invalidation - I think you anticipated that to some extent.

If it's too much to address in this patch we could consider merging this 
without addressing that but keeping it behind an experimental feature flag 
until some more of the concerns have been addressed.

http://gerrit.cloudera.org:8080/#/c/12118/14/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/12118/14/be/src/common/global-flags.cc@245
PS14, Line 245: DEFINE_int32(hms_event_polling_frequency_s, 0,
Maybe make it hidden or mention that it's experimental in the text here if it's 
not ready for prime time yet?


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@299
PS14, Line 299:* Initializes Metastore event processor object if
This occurs before the initial metadata is loaded in reset(), right? So we 
might process some events that occurred before we got an initial snapshot of 
the data. I think this is probably OK if the events are idempotent, but seems 
worth mentioning in the comment.

Actually one question is how this interacts with reset(), either the initial 
reset() on startup or a later one from "INVALIDATE METADATA". What happens if 
events start to be processed when the catalog is empty before reset() loads all 
the data? And should we reset the current event ID when doing an invalidate?

I'm wondering if those questions lead us to resetting currentNotificationId in 
reset() and stopping/pausing the even processing until reset() completes. That 
would at least reduce the number of possible races.


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@319
PS14, Line 319:   LOG.fatal("Unable to fetch the current notification event 
id from metastore."
This does actually terminate the process - it call to the glog LOG(FATAL) which 
calls abort(). I think we don't want this since it generates a core dump - 
Maybe LOG.error() + throwing an exception is the right behaviour.

We've had issues before with misconfigured catalogds generating a ton of core 
dumps.


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1342
PS14, Line 1342: null
?


http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/14/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@169
PS14, Line 169: // TODO figure out if CatalogD sync with HMS is in-process 
or completed so that we
Ah, I think this is referring to the same kind of issue that I mentioned 
earlier with reset() running in parallel with this thread.



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 14
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 11 Jan 2019 00:03:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1731/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 14
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 08 Jan 2019 02:12:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-07 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_frequency_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_frequency_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Note that this feature is still a work in progress and additional
improvements will be done in subsequent patches. Be default the feature
is turned off.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreNotificationException.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java
M fe/src/test/resources/postgresql-hive-site.xml.template
10 files changed, 1,585 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/14
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 14
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-04 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 13:

(24 comments)

Firstly thanks for the clean-up. The new class structure makes it much clearer 
IMO. Publishing a bunch of minor comments. Please bear with my nits.

Also, I had a brief chat with Vihang offline. It looks like the event handling 
races is still a WIP. My general concern was around the following 3 areas.

- Races with conflicting local events.
- What if Catalog already has the latest HMS state (due to a previous 
invalidation)
- Failure scenarios if some event processing errors out. (Should we stop and 
invalidate everything, should be abort the thread etc.)

Vihang mentioned that he is looking into HMS side versioning for objects to 
detect if the Catalog server already has the latest state and we can skip 
applying certain events. That probably requires some Hive side changes. Also we 
discussed about incorporating some of his thought process into class comments.

http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@304
PS8, Line 304:
 :   private void initializeMetastoreEventProcessor() {
 : long eventPollingInterval = 
BackendConfig.INSTANCE.getHMSPolli
> done. Did you mean changing the log level to fatal as well?
fatal() IIRC aborts the process. If metastore polling is configured  and we are 
not able to init here, it looks to me like we should abort (instead of silently 
swallowing) since that can result in correctness issues later and is unexpected 
from a user POV. Thoughts?


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@292
PS13, Line 292: // start polling for metastore events
  : if (metastoreEventProcessor_ != null) {
  :   metastoreEventProcessor_.start();
  : }
Any reason for not moving this to initializeMetastoreEventProcessor()?


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@308
PS13, Line 308: debug
nit: Make this info?


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1460
PS13, Line 1460:   if (tblToBeRemoved != null) {
nit: invert the logic?

if (tblToBeRemoved == null) return null;



http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java
File fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java:

http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@38
PS13, Line 38:   static abstract class MetastoreEventHandler {
Add doc about the thread-safety expectations here.


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@39
PS13, Line 39: CatalogServiceCatalog catalog,
Any reason for passing this around? Can't we instantiate this once and apply 
changes to it?


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@42
PS13, Line 42: protected String dbName;
Document what it means to have these set. It looks like the dbName is always 
set but tblName is optional


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@43
PS13, Line 43: protected String tblName;
nit: _ for class members


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@47
PS13, Line 47: protected void validateAndInit(String expectedType, 
NotificationEvent event) {
method doc.


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@49
PS13, Line 49: event.getTableName()
wouldn't this be null in CREATED_DB/DROP_DB case?

Also, is it worth adding a TRACE log that dumps the event debug data? (Impala 
allows dynamically switching log levels for debugging).


http://gerrit.cloudera.org:8080/#/c/12118/13/fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java@74
PS13, Line 74:   static class CreateTableEventHandler extends 
MetastoreEventHandler {
method docs for each event handlers?



[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1706/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 13
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 04 Jan 2019 00:43:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-03 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#13). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_frequency_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_frequency_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreNotificationException.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java
M fe/src/test/resources/postgresql-hive-site.xml.template
10 files changed, 1,375 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/13
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 13
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1696/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 12
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 02 Jan 2019 18:29:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2019-01-02 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_frequency_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_frequency_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreNotificationException.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java
M fe/src/test/resources/postgresql-hive-site.xml.template
10 files changed, 1,367 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/12
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 12
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 11:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1692/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 29 Dec 2018 02:21:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1691/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 10
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 29 Dec 2018 02:15:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-28 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_frequency_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_frequency_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

In order to process each event atomically, this feature relies on
version lock in CatalogServiceCatalog. It adds new methods in
CatalogServiceCatalog which takes a write lock on version so that
readers are blocked until the catalog state is updated based on the
events. In case of processing events, the metastore operation is already
completed and only catalog state needs to be updated. Hence we do not
need to make new metastore calls while processing the events and only
version lock is sufficient to serialize updates to the catalog objects
based on events. This locking protocol is similar to what is done in
case of DDL processing in CatalogOpExecutor except it does not need to
take metastoreDdlLock since no metastore operations are needed during
event processing.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventHandlerUtils.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreNotificationException.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java
M fe/src/test/resources/postgresql-hive-site.xml.template
10 files changed, 1,350 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/11
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-28 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_frequency_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_frequency_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java
M fe/src/test/resources/postgresql-hive-site.xml.template
8 files changed, 1,280 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/10
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 10
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-27 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 9:

(20 comments)

Remaining comments, mostly about the tests.

High level, seems the open questions are:

* That bit about replacing the guts of the DB object.
* Error handling and reporting
* Whether the event handler should attempt to partially load DBs and tables, or 
just do an invalidate.
* Invalidation of specific partitions rather than all of them.

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@64
PS9, Line 64:   private static MetastoreEventsProcessor eventsProcessor;
Not thrilled with the convention, but the team likes an underscore after each 
field name.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@109
PS9, Line 109: assertNotNull(catalog.getDb(TEST_DB_NAME));
Per prior review comments, should not add the new DB.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@120
PS9, Line 120: createDatabase("database_to_be_dropped");
If we don't add the DB automatically, should reference it and assert it is in 
the cache.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@122
PS9, Line 122: assertTrue(catalog.getDb("database_to_be_dropped") != null);
assertNotNull(...)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@125
PS9, Line 125: assertTrue("Database should not be found after processing 
drop_database event",
assertNull(...)

Oddly, later tests do use assertNull(). So, modify the code here, and the other 
cases where assertTrue is used to check for null/not null.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@136
PS9, Line 136:   @Test(expected = DatabaseNotFoundException.class)
This is confusing. If we expect an exception, then the exception terminates the 
execution. Logically, it should come from the last statement in the body. But, 
the last statement is a Unit assertion. So, something is strange.

Often easier to do this:

try {
  doSomethingThatShoudFail();
  fail();
} catch (ExpectedException e) {
  // Expected
}


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@139
PS9, Line 139: final String TBL_TO_BE_DROPPED = "tbl_to_be_dropped";
Looks like a constant. Move outside the method and make static.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@148
PS9, Line 148: loadTable(TBL_TO_BE_DROPPED);
Does this assert that the table exists in HMS, was loaded in the cache, and has 
the expected contents? Else, when we drop it, we can't tell if the drop code is 
broken if we don't know if the table was actually in the cache.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@155
PS9, Line 155: // throws DatabaseNotFoundException
If it throws, we won't get to the assertTrue.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@161
PS9, Line 161:   @Ignore("Disabled until we fix Hive bug to deserialize 
alter_database event messages")
Reference HIVE JIRA ticket number.

Also, should reference this bug in the event processor code.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@217
PS9, Line 217: createTable(TEST_TABLE_NAME_NONPARTITIONED, false);
Start by asserting that the table is not yet in the cache?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@255
PS9, Line 255: List> partVals = new ArrayList<>(1);
Allocating just 1 member up front, then adding four items. Maybe just omit the 
initial size arg; not helping us much here.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java@277
PS9, Line 277: // change some partitions
There is a recent fix that does specialized calls to HDFS to check for updated 
vs. new partitions. It will be tricky to combine that test with this one, but 
we should ensure that the automated way of updating a partition uses the call 

[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/Db.java@112
PS8, Line 112: synchronized (thriftDb_) { thriftDb_.setMetastore_db(msDb); }
> I see your point. Initially, I wanted to model the changes to metastore db
The locking scheme does require the cooperation of multiple classes so method 
access modifiers aren't likely to be sufficient to prevent other classes from 
doing bad things (it's good if we make fewer things public for sure though).

I think the precedent to look at here are the various alterDatabase operations 
in CatalogOpExecutor. Those follow the protocol described in the 
CatalogOpExecutor class comment. In those cases metastoreDdlLock_ is the lock 
that's preventing concurrent updates to thriftDb_.

I was trying to understand if anything prevents concurrent reads + writes to 
thriftDb_ though and I don't think there currently is anything, so it seems 
like some anomalies could result from that. There's also some logic in 
CatalogOpExecutor that mutates thriftDb_ in-place.



--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 8
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Dec 2018 20:11:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-27 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 9:

(39 comments)

This will be a great feature. Detailed review from the perspective of 1) 
synchronization and 2) diagnosing issues when they occur in the field.

This is a partial review, will review remaining files separately.

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@302
PS9, Line 302: if (BackendConfig.INSTANCE.getHMSPollingFrequencyInSeconds() 
<= 0) {
Good check. Better to define this as behavior: to disable the polling, set the 
interval to 0 (or, trivially <0). This would be a good fall-back safety-valve 
for support cases, tests, etc.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@313
PS9, Line 313:   } catch (TException e) {
Nice use of try-with-catch. No need for nested tries. Remove inner catch, move 
catch to outer try.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2317
PS9, Line 2317:   MetastoreEventsProcessor getMetastoreEventProcessor() {
Should this be public? If only protected, please explicitly label it protected.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@112
PS9, Line 112: synchronized (thriftDb_) { thriftDb_.setMetastore_db(msDb); }
What is the synchronization model?

As written, we lock on the prior DB to set the new one. This means access is 
not synchronized. Better to add synchronized to the method itself, here and 
accessor.

But, more broadly, if we synchronize only at the DB level, we'll have all 
manner of race conditions. One client will grab the DB with one version the 
thrift object, then another will be slotted in underneath.

This probably needs more thought. Is the model documented in the write-up?


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@159
PS9, Line 159: synchronized (thriftDb_) { return thriftDb_.deepCopy(); }
How often do we do this? Do we want this to block on an HMS call about 
parameters?

Maybe a better model is for the RPC calls to do this:

- Obtain a lock
- Get a pointer to the current thrift object
- Release the lock
- Make the RPC

Not that there is no need to copy the thrift object above: the logic here 
implies that the Thrift object is immutable once set. All we are doing is 
making sure we use the same object throughout the RPC.

Still, not convinced we have the right locking semantics.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/Db.java@164
PS9, Line 164: synchronized (thriftDb_) { return thriftDb_.getDb_name(); }
Now getting the name is synchronized on an HMS call. That can't be good for 
performance. The planner/analyzer gets the name all the time, so we really 
don't want to do this.

The name should be invariant or all heck breaks loose in the planner. So, if we 
swap out Thrift objects, pull the name outside the thrift object and cache it 
in a final field.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@50
PS9, Line 50:  * in Hive metastore using the the public API 
get_next_notification These events could
Impala generally does not use Javadoc markup, sadly. But, if you want to use it 
(we really should), then a reference to a Java method should be: {@link 
Class#method}.


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@63
PS9, Line 63:  *   CREATE_TABLE, CREATE_DATABASE
Wrong form for HTML lists. I think you want a definition list:


  CREATE_TABLE, CREATE_DATABASE
  A new table/database is ... 
...

That sad, if Vuk was here, he'd have you rip all this out. So, a broader 
question is: do we want our Javadoc to follow C++ doc standard so the C++ folks 
can follow it, or Javadoc standard so the documentation looks correct when 
formatted? I prefer the Javadoc approach, so would encourage you to keep the 
(corrected) formatting.



[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1677/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 9
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Dec 2018 01:26:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-26 Thread Anonymous Coward (Code Review)
vih...@cloudera.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 9:

(34 comments)

Latest patch works through some of the comments. Still working through the rest 
of the comments

http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG@7
PS8, Line 7: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
> nit: wrap lines to less than 72 chars.
done


http://gerrit.cloudera.org:8080/#/c/12118/8//COMMIT_MSG@21
PS8, Line 21: The change also adds a new test class to test the basic 
functionality
> Is the self-event detection coming in a separate patch?
Yes, self-event detection is still a work in progress and would be done as part 
of IMPALA-7972


http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@246
PS8, Line 246: invalidate cached tab
> This only invalidates right? (refresh has a different meaning in Impala's c
Right now, its only invalidate but I plan to make use of refresh in 
IMPALA-7973. I will change it to invalidate since that is the current state of 
the feature.


http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@247
PS8, Line 247: These metastore events could
> nit: rephrase to  ...cached table metadata based on...
done


http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@251
PS8, Line 251: \
> ..has..
done


http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@253
PS8, Line 253: "creation or removal of objects from metastore, catalogd 
adds or removes such "
> Since this is a user-facing message, we should probably clarify that this i
The first line of the message mentions metastore events. Do you think that is 
not clear to a user? Added one line which describes how the metastore events 
could be generated from external systems like Apache Hive or another instance 
of Impala cluster talking with the same metastore.


http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@260
PS8, Line 260:
> nit: format to fewer lines like others.
I had formatted it like the others, but the clang-format-diff script which I 
use to format the patch doesn't seem to like it. Any idea how can fix it 
automatically?


http://gerrit.cloudera.org:8080/#/c/12118/8/be/src/common/global-flags.cc@258
PS8, Line 258: r the old generation to be almost "
 : "full and trigger an invalidation on recently unused 
tables");
 :
> Looks like this is already mentioned above. Also, is it needed on the coord
removed the redundant line. Also, yes, not need on coordinators as of now.


http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@229
PS8, Line 229:
> Add a oneliner comment. Also, we use an "_" prefix for class variables. foo
done


http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@292
PS8, Line 292:
> nit: sounds vague, rephrase?
added some more meat to this. Please suggest improvements if you think 
something more is needed in the text. Thanks!


http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@292
PS8, Line 292:
 :   /**
 :* Initializes Metastore event
> I think we can get rid of the implementation detail here, remove?
Good point. Added check and return early logic.


http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@304
PS8, Line 304:+ "interval is %d", eventPollingInterval));
 :   return;
 : }
> Isn't this a fatal error? (Include the stack trace in the error log)
done. Did you mean changing the log level to fatal as well?


http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/12118/8/fe/src/main/java/org/apache/impala/catalog/Db.java@112
PS8, Line 112: synchronized (thriftDb_) { thriftDb_.setMetastore_db(msDb); }
> This additional ad-hoc locking without explanation of how it integrates wit
I see your point. Initially, I wanted to model the changes to metastore db 
object similar to how functions are added to its parameters. I noticed that Db 
class has public method to add a function which can modify the state of the 
metastoredb object 

[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-26 Thread Anonymous Coward (Code Review)
vih...@cloudera.com has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..

IMPALA-7970 : Add support for metastore event based automatic invalidate

This change adds support to CatalogD to poll metastore events to issue
invalidate on tables automatically. It adds basic infrastructure to poll
HMS notifications events at a configurable frequency using a backend
config called hms_event_polling_frequency_s flag. Currently, it issues
invalidate at tables when it received alter events on table and
partitions. It also adds tables/databases and removes tables from
catalogD when it receives create_table/create_database and
drop_table/drop_database events. The default value of
hms_event_polling_frequency_s is 0 which disables the feature. A
non-zero value in seconds of this configuration can be used to enable
the feature and set the polling frequency.

The change also adds a new test class to test the basic functionality
for each of the event type which is supported.

Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
A fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreEventsProcessorTest.java
M fe/src/test/resources/postgresql-hive-site.xml.template
9 files changed, 1,050 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/12118/9
--
To view, visit http://gerrit.cloudera.org:8080/12118
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic70b27780560b7ac9b33418d132b36cd0ca4abf7
Gerrit-Change-Number: 12118
Gerrit-PatchSet: 9
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7970 : Add support for metastore event based automatic invalidate

2018-12-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12118 )

Change subject: IMPALA-7970 : Add support for metastore event based automatic 
invalidate
..


Patch Set 9:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@294
PS9, Line 294:* Initializes Metastore event processor object if
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@46
PS9, Line 46:  * 
org.apache.hadoop.hive.metastore.api.NotificationEvent. Metastore 
can be configured,
line too long (100 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@47
PS9, Line 47:  * to work with Listeners which are called on various DDL 
operations like create/alter/drop
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@48
PS9, Line 48:  * operations on database, table, partition etc. Each event has a 
unique incremental id and the
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@49
PS9, Line 49:  * generated events are be fetched from Metastore to get 
incremental updates to the metadata stored
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@50
PS9, Line 50:  * in Hive metastore using the the public API 
get_next_notification These events could
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@51
PS9, Line 51:  * be generated by external Metastore clients like Apache Hive or 
Apache Spark as well as other
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@54
PS9, Line 54:  * This class is used to poll metastore for new events at a given 
frequency. By applying such events
line too long (100 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@55
PS9, Line 55:  * on the catalogD we can sync to external metadata operations by 
taking appropriate actions for
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@56
PS9, Line 56:  * each event type. We keep track of the last synced event id in 
each polling iteration so the next
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@64
PS9, Line 64:  *   A new table/database is created in Catalog respectively. The 
newly created table/database are
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@69
PS9, Line 69:  *   In case of alter table event, currently the code issues a 
invalidate table command. There is a
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@70
PS9, Line 70:  *   special case of this event in case of renames, where the old 
table is removed and a new
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@73
PS9, Line 73:  *   In case of alter database events, currently only the case of 
changing default location, owner
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@76
PS9, Line 76:  *   Currently, in case of these events, we issue invalidate on 
the table. This can be optimized by
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@162
PS9, Line 162:   // fetch the current notification event id. We assume that 
the polling interval is small
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/12118/9/fe/src/main/java/org/apache/impala/catalog/MetastoreEventsProcessor.java@163
PS9, Line 163:   // enough that most of these polling operations result in 
zero new events. In such a case,