Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17848 )

Change subject: IMPALA-9857: Batching of consecutive partition events
......................................................................


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@955
PS5, Line 955:
> nit: too much indent
Done


http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1027
PS5, Line 1027:
> nit: +2 indent
Done


http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2713
PS5, Line 2713:       return hmsPartToHdfsPart.size();
> Shouldn't we return hmsPartToHdfsPart.size() since those are the actual par
yes, thanks for spotting that. Fixed.


http://gerrit.cloudera.org:8080/#/c/17848/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17848/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@279
PS2, Line 279:         if (!current.canBeBatched(next)) {
> Currently one of the conditions of batching two events together is - event
Yes, the events definitely can be intermingled from multiple tables. In case of 
ALTER_PARTITION event it is less likely to happen since these events are 
transactional in nature (publish all or none) and they hold a eventId lock in 
HMS when being generated. However, this is highly dependent on the source 
application which generates the events. For example if Hive makes 10 
alterPartition calls instead of 1 alterPartitions() for 10 partitions, the 
event stream may include events which are inter-mingled between different 
tables.

The batching logic currently optimizes for the common case of consecutive 
ALTER_PARTITION or INSERT events being generated by a single query. More 
sophisticated batching schemes like you mentioned can be explored but may 
increase the scope of the patch and I am inclined to commit this first and then 
incrementally improve it if needed. My primary concern about batching 
non-consecutive events is that we need to make sure that we are not introducing 
any subtle bugs because of the the fact that we are refreshing the partitions 
out of order then event stream.


http://gerrit.cloudera.org:8080/#/c/17848/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1641
PS2, Line 1641:               
MetastoreEventPropertyKey.CATALOG_VERSION.getKey(), "-1"));
> I understand that there may not be significant performance gain. I still fe
Okay. Let me look into this and see how much additional changes will need to 
made for this.


http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1811
PS5, Line 1811: Ref
> nit: indentation is off
My IDE seems to like doing this. Not sure how to fix it. Manually edited it.


http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
File fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java:

http://gerrit.cloudera.org:8080/#/c/17848/5/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java@100
PS5, Line 100: dx);
> This precondition is checked by the Java runtime.
makes sense. Removed it.



--
To view, visit http://gerrit.cloudera.org:8080/17848
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Gerrit-Change-Number: 17848
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Sourabh Goyal <soura...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Sep 2021 19:46:35 +0000
Gerrit-HasComments: Yes

Reply via email to