Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21031 )

Change subject: IMPALA-12709: Add support for hierarchical metastore event 
processing
......................................................................


Patch Set 34:

(34 comments)

http://gerrit.cloudera.org:8080/#/c/21031/33//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21031/33//COMMIT_MSG@20
PS33, Line 20: Existing metastore event processing is turned into multi-level
             : event processing with enable_hierarchical_event_processing flag.
> Added
Thank you.


http://gerrit.cloudera.org:8080/#/c/21031/34//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21031/34//COMMIT_MSG@42
PS34, Line 42:
nit: Unnecessary white spaces. Simply justify left.


http://gerrit.cloudera.org:8080/#/c/21031/34//COMMIT_MSG@54
PS34, Line 54:
nit: Unnecessary white spaces. Simply justify left.


http://gerrit.cloudera.org:8080/#/c/21031/34//COMMIT_MSG@63
PS34, Line 63: Following new events are added:
nit: It will be nice to have a shared interface for derived event kind so we 
can easily track them. For example:

  public interface DerivedMetastoreEvent {
    public MetastoreEvent getActualEvent();
  }


http://gerrit.cloudera.org:8080/#/c/21031/33/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/21031/33/be/src/catalog/catalog-server.cc@237
PS33, Line 237: min_event_processor_idle_m
> Have changed it to time period now. min_event_processor_idle_ms is the mini
Done


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java
File fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java:

http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@136
PS34, Line 136:   public void incrExpectedProceedCount() {
Please revisit the access level for some of these methods and turn them into 
protected if they are not currently needed by outside package.

For example, I think only DBProcessor and TableProcessors should be the only 
one eligible to call incrExpectedProceedCount() and decrExpectedProceedCount().


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java
File fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@53
PS33, Line 53:
> Yes. Updated the description.
Ack


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@71
PS33, Line 71:
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@73
PS33, Line 73:
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@91
PS33, Line 91:
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@125
PS33, Line 125:    * segregated from the input event queue as database events 
and table events. Table
              :    * events are dispatched to the a
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@231
PS33, Line 231:           }
              :         }
              :
> This the only place DBProcessor usable_ is set to false. IMO, we don't need
Ack


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@439
PS33, Line 439:         "Executor: {} might have been cleared. Not decrementing 
outstanding event count",
              :         name_);
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java
File fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@73
PS34, Line 73: // Executor name
nit: // Executor name, like "DBEventExecutor-0".


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@111
PS34, Line 111: Lock contention normally do not occur at most often happening 
call
nit: "Lock contention should not occur at most frequent call sites "


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@166
PS34, Line 166: Lock contention normally do not occur at most often happening 
calls
nit: "Lock contention should not occur at most frequent call sites "


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@229
PS34, Line 229:             barrierEvent.incrExpectedProceedCount();
              :             tableProcessor.enqueue(barrierEvent);
>From this code, I understand that DBBarrierEvent.expectedProceedCount_ can 
>dynamically increased and decreased as new TableProcessor is added.

I suggest adding a DBBarrierEvent check inside TableProcessor.enqueue(), and do 
the increment there. Thus, tableCount parameter in DBBarrierEvent constructor 
is not needed anymore.


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@244
PS34, Line 244:       synchronized (usableLock_) {
              :         if (!isUsable()) {
This is obtaining usableLock_ twice. Is that OK?
I think the old code in ps33 is fine.


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@438
PS34, Line 438:     LOG.debug(
              :         "Executor: {} might have been cleared. Not decrementing 
outstanding event count",
              :         name_);
This does not seem right to me.
If Executor is cleared, then the clear method should proactively decrement 
outstandingEventCount_ though this decrOutstandingEventCount().

outstandingEventCount_ must always be >= 0 at any point of time (thus, L433 is 
always True).
Otherwise, there is a bug happen somewhere.


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@515
PS34, Line 515:       outstandingEventCount_ = 0;
This should be Precondition at the end of method rather than assignment.

Precondition.checkState(outstandingEventCount_ == 0);


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/EventExecutorService.java
File 
fe/src/main/java/org/apache/impala/catalog/events/EventExecutorService.java:

http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/EventExecutorService.java@87
PS34, Line 87:   public void shutdown(boolean graceful) {
I think process() should drop any given MetastoreEvent once shutdown is called.

Maybe having is_active() or is_shutting_down() method is also convenient so 
that caller can react accordingly (ie., CatalogD should stop calling 
EventExecutorService.process()).


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@588
PS34, Line 588: protected static final String CLUSTER_WIDE_TARGET = 
"CLUSTER_WIDE";
Add comment for this. What kind of event target CLUSTER_WIDE?


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@661
PS33, Line 661:               .build());
              :
> Done. Have added EventExecutorService. Have added the EventExecutorServiceT
Done


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@779
PS34, Line 779: Gets the count of events pending to process when hierarchical 
mode is enabled.
Adds:

Always return 0 if hierarchical mode is disabled.


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java
File 
fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java:

http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java@79
PS34, Line 79: pseudoEvent_ = pseudoEvent;
Add precondition to check that pseudoEvent is either of DROP_TABLE or 
CREATE_TABLE and nothing else.


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java@139
PS34, Line 139:  } else {
Add Precondition in this else block.

Precondition.checkState(MetastoreEventType.CREATE_TABLE.equals(pseudoEvent_.getEventType()));


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java
File fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@55
PS33, Line 55: 
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@57
PS33, Line 57:
> Done
Done


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java
File fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@56
PS34, Line 56:   // Executor name
nit: // Executor name, like "DBEventExecutor-0.TableEventExecutor-0".


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@312
PS34, Line 312:     LOG.debug(
              :         "Executor: {} might have been cleared. Not decrementing 
outstanding event count",
              :         name_);
This also does not seem right to me.
If Executor is cleared, then the clear method should proactively decrement 
outstandingEventCount_ though this decrOutstandingEventCount().

outstandingEventCount_ must always be >= 0 at any point of time (thus, L307 is 
always True).
Otherwise, there is a bug happen somewhere.


http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@333
PS34, Line 333:       outstandingEventCount_ = 0;
This should be Precondition at the end of method rather than assignment.

Precondition.checkState(outstandingEventCount_ == 0);


http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_event_processing_perf.py
File tests/custom_cluster/test_event_processing_perf.py:

http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_event_processing_perf.py@220
PS34, Line 220: run_queries
I suggest rename to __run_event_processing_tests to distinct from other method 
names at ImpalaTestSuite / CustomClusterTestSuite.


http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_events_custom_configs.py
File tests/custom_cluster/test_events_custom_configs.py:

http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_events_custom_configs.py@1399
PS34, Line 1399:     src_db = ImpalaTestSuite.get_random_name(unique_database) 
+ "_imp_src"
               :     target_db = 
ImpalaTestSuite.get_random_name(unique_database) + "_imp_target"
FYI, unique_database fixture has a way to declare/asking for more than 1 new 
unique database to create before test start.
https://github.com/apache/impala/blob/dc9066e7b29208e6dfdf8276a56c9045bd859637/tests/conftest.py#L319-L340

unique_database fixture is convenient because it cleans up the temporary 
databases once test is done, whether test is success or failed. So technically, 
you can annotate like this:

  @UniqueDatabase.parametrize(num_dbs=4, sync_ddl=True)
  def test_rename_table_hep_two_db_and_table_threads(self, unique_database):
    impala_src_db = unique_database
    impala_target_db = unique_database + '2'
    hive_src_db = unique_database + '3'
    hive_target_db = unique_database + '4'

Same comment for test_rename_table_hep_one_db_and_two_table_threads.


http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_events_custom_configs.py@1445
PS34, Line 1445: rename_with_hierarchical_event_process
I think 'rename_table_with_hierarchical_event_process' is more accurate?



--
To view, visit http://gerrit.cloudera.org:8080/21031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Gerrit-Change-Number: 21031
Gerrit-PatchSet: 34
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Tue, 21 Jan 2025 22:34:12 +0000
Gerrit-HasComments: Yes

Reply via email to