[Impala-ASF-CR] IMPALA-12152: Add query option to wait for events sync up

Csaba Ringhofer (Code Review) Wed, 30 Aug 2023 03:01:08 -0700

Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20131 )


Change subject: IMPALA-12152: Add query option to wait for events sync up
......................................................................


Patch Set 12:

(9 comments)

Mainly high level comments, I haven't processed all the details yet.

http://gerrit.cloudera.org:8080/#/c/20131/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20131/12//COMMIT_MSG@23
PS12, Line 23: Note that the current implementation waits for the latest event 
id when
             : the WaitForHmsEvent RPC is received at catalogd side. We can 
improve it
             : once HIVE-27499 is resolved, so we can efficiently detect 
whether some
             : given tables have unprocessed events and just wait for the 
*largest* id
             : of them. Tables that have no unprocessed events don't need to 
block the
             : query planning.
Are there plans on how to implement this from Impala side?
What bugs me is the order of things when executing the query - currently 
WaitForHmsEvent() is called before creating the plan, but to get the list of 
tables involved, we will need some preliminary analyses before calling 
WaitForHmsEvent(). Even without HIVE-27499 we could profit a bit from this by 
skipping waiting for events for queries that do not reference any table.


http://gerrit.cloudera.org:8080/#/c/20131/12/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/20131/12/be/src/service/client-request-state.h@468
PS12, Line 468: SetErrorMsg
The name could indicate that this should be used early in the life cycle of the 
query. For example SetEarlyErrorMsg().

+ same for line 660


http://gerrit.cloudera.org:8080/#/c/20131/12/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/20131/12/be/src/service/impala-server.h@445
PS12, Line 445:   /// Wait until catalog server catches up the latest HMS event.
nit: catches up to?


http://gerrit.cloudera.org:8080/#/c/20131/12/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/20131/12/common/thrift/CatalogService.thrift@691
PS12, Line 691: CatalogServiceVersion
I see that it is a common pattern to use it like this, but why don't we add the 
header to the TCatalogServiceRequestHeader?


http://gerrit.cloudera.org:8080/#/c/20131/12/common/thrift/CatalogService.thrift@751
PS12, Line 751: WaitForHmsEvent
High level question:
wouldn't it be better to just get the current current event id here and wait 
for statestore updates to propagate it? I don't know whather it is propagated 
now, but it doesn't seem hard to send the current synced event id with the 
catalog updates.

My concern with this RPC is that it can be very long if the timeout is high. An 
example where this could complicate thing is query cancellation. Another one is 
catalogd change in the background - waiting for synced event id from catalog 
topic update could allow to handle this correctly even if the catalogd is 
restarted.


http://gerrit.cloudera.org:8080/#/c/20131/12/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/20131/12/common/thrift/Query.thrift@668
PS12, Line 668: sync_hms_events_strict_mode
I didn't find any test for this.


http://gerrit.cloudera.org:8080/#/c/20131/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/20131/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3810
PS12, Line 3810:     long waitForEventId = 
metastoreEventProcessor_.getCurrentEventId();
Maybe it would be better to move some of this logic inside 
MetastoreEventProcessor, e.g. a function like waitForSyncUpToCurrentEvent(long 
timeout)?

Some optimizations may be easier to implement inside EventProcessorStatus. One 
example is that getCurrentEventId() requests could be merged waitForHmsEvent() 
calls that come in a small window to avoid excessive number of RPCs to HMS. I 
don't mean to implement this now, but it may be useful in the future to 
efficiently handle large number of parallel queries.


http://gerrit.cloudera.org:8080/#/c/20131/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3817
PS12, Line 3817: while
The loop could also check MetastoreEventsProcessor state to stop waiting if the 
event processor had some error.


http://gerrit.cloudera.org:8080/#/c/20131/12/tests/metadata/test_event_processing.py
File tests/metadata/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/20131/12/tests/metadata/test_event_processing.py@377
PS12, Line 377: N
Shouldn't we also set strict mode in tests like this? My understanding is that 
this won't cause an error if the event processor goes to an error state and 
doesn't sync up for 10s



--
To view, visit http://gerrit.cloudera.org:8080/20131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36ac941bb2c2217b09fcfa2eb567b011b38efa2a
Gerrit-Change-Number: 20131
Gerrit-PatchSet: 12
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Aug 2023 10:00:50 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-12152: Add query option to wait for events sync up

Reply via email to