Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21031 )
Change subject: IMPALA-12709: Add support for hierarchical metastore event processing ...................................................................... Patch Set 34: (34 comments) http://gerrit.cloudera.org:8080/#/c/21031/33//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21031/33//COMMIT_MSG@20 PS33, Line 20: Existing metastore event processing is turned into multi-level : event processing with enable_hierarchical_event_processing flag. > Added Thank you. http://gerrit.cloudera.org:8080/#/c/21031/34//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21031/34//COMMIT_MSG@42 PS34, Line 42: nit: Unnecessary white spaces. Simply justify left. http://gerrit.cloudera.org:8080/#/c/21031/34//COMMIT_MSG@54 PS34, Line 54: nit: Unnecessary white spaces. Simply justify left. http://gerrit.cloudera.org:8080/#/c/21031/34//COMMIT_MSG@63 PS34, Line 63: Following new events are added: nit: It will be nice to have a shared interface for derived event kind so we can easily track them. For example: public interface DerivedMetastoreEvent { public MetastoreEvent getActualEvent(); } http://gerrit.cloudera.org:8080/#/c/21031/33/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/21031/33/be/src/catalog/catalog-server.cc@237 PS33, Line 237: min_event_processor_idle_m > Have changed it to time period now. min_event_processor_idle_ms is the mini Done http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java File fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java: http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBBarrierEvent.java@136 PS34, Line 136: public void incrExpectedProceedCount() { Please revisit the access level for some of these methods and turn them into protected if they are not currently needed by outside package. For example, I think only DBProcessor and TableProcessors should be the only one eligible to call incrExpectedProceedCount() and decrExpectedProceedCount(). http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java File fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java: http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@53 PS33, Line 53: > Yes. Updated the description. Ack http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@71 PS33, Line 71: > Done Done http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@73 PS33, Line 73: > Done Done http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@91 PS33, Line 91: > Done Done http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@125 PS33, Line 125: * segregated from the input event queue as database events and table events. Table : * events are dispatched to the a > Done Done http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@231 PS33, Line 231: } : } : > This the only place DBProcessor usable_ is set to false. IMO, we don't need Ack http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@439 PS33, Line 439: "Executor: {} might have been cleared. Not decrementing outstanding event count", : name_); > Done Done http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java File fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java: http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@73 PS34, Line 73: // Executor name nit: // Executor name, like "DBEventExecutor-0". http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@111 PS34, Line 111: Lock contention normally do not occur at most often happening call nit: "Lock contention should not occur at most frequent call sites " http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@166 PS34, Line 166: Lock contention normally do not occur at most often happening calls nit: "Lock contention should not occur at most frequent call sites " http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@229 PS34, Line 229: barrierEvent.incrExpectedProceedCount(); : tableProcessor.enqueue(barrierEvent); >From this code, I understand that DBBarrierEvent.expectedProceedCount_ can >dynamically increased and decreased as new TableProcessor is added. I suggest adding a DBBarrierEvent check inside TableProcessor.enqueue(), and do the increment there. Thus, tableCount parameter in DBBarrierEvent constructor is not needed anymore. http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@244 PS34, Line 244: synchronized (usableLock_) { : if (!isUsable()) { This is obtaining usableLock_ twice. Is that OK? I think the old code in ps33 is fine. http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@438 PS34, Line 438: LOG.debug( : "Executor: {} might have been cleared. Not decrementing outstanding event count", : name_); This does not seem right to me. If Executor is cleared, then the clear method should proactively decrement outstandingEventCount_ though this decrOutstandingEventCount(). outstandingEventCount_ must always be >= 0 at any point of time (thus, L433 is always True). Otherwise, there is a bug happen somewhere. http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/DBEventExecutor.java@515 PS34, Line 515: outstandingEventCount_ = 0; This should be Precondition at the end of method rather than assignment. Precondition.checkState(outstandingEventCount_ == 0); http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/EventExecutorService.java File fe/src/main/java/org/apache/impala/catalog/events/EventExecutorService.java: http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/EventExecutorService.java@87 PS34, Line 87: public void shutdown(boolean graceful) { I think process() should drop any given MetastoreEvent once shutdown is called. Maybe having is_active() or is_shutting_down() method is also convenient so that caller can react accordingly (ie., CatalogD should stop calling EventExecutorService.process()). http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@588 PS34, Line 588: protected static final String CLUSTER_WIDE_TARGET = "CLUSTER_WIDE"; Add comment for this. What kind of event target CLUSTER_WIDE? http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@661 PS33, Line 661: .build()); : > Done. Have added EventExecutorService. Have added the EventExecutorServiceT Done http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@779 PS34, Line 779: Gets the count of events pending to process when hierarchical mode is enabled. Adds: Always return 0 if hierarchical mode is disabled. http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java File fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java: http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java@79 PS34, Line 79: pseudoEvent_ = pseudoEvent; Add precondition to check that pseudoEvent is either of DROP_TABLE or CREATE_TABLE and nothing else. http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java@139 PS34, Line 139: } else { Add Precondition in this else block. Precondition.checkState(MetastoreEventType.CREATE_TABLE.equals(pseudoEvent_.getEventType())); http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java File fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java: http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@55 PS33, Line 55: > Done Done http://gerrit.cloudera.org:8080/#/c/21031/33/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@57 PS33, Line 57: > Done Done http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java File fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java: http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@56 PS34, Line 56: // Executor name nit: // Executor name, like "DBEventExecutor-0.TableEventExecutor-0". http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@312 PS34, Line 312: LOG.debug( : "Executor: {} might have been cleared. Not decrementing outstanding event count", : name_); This also does not seem right to me. If Executor is cleared, then the clear method should proactively decrement outstandingEventCount_ though this decrOutstandingEventCount(). outstandingEventCount_ must always be >= 0 at any point of time (thus, L307 is always True). Otherwise, there is a bug happen somewhere. http://gerrit.cloudera.org:8080/#/c/21031/34/fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java@333 PS34, Line 333: outstandingEventCount_ = 0; This should be Precondition at the end of method rather than assignment. Precondition.checkState(outstandingEventCount_ == 0); http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_event_processing_perf.py File tests/custom_cluster/test_event_processing_perf.py: http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_event_processing_perf.py@220 PS34, Line 220: run_queries I suggest rename to __run_event_processing_tests to distinct from other method names at ImpalaTestSuite / CustomClusterTestSuite. http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_events_custom_configs.py@1399 PS34, Line 1399: src_db = ImpalaTestSuite.get_random_name(unique_database) + "_imp_src" : target_db = ImpalaTestSuite.get_random_name(unique_database) + "_imp_target" FYI, unique_database fixture has a way to declare/asking for more than 1 new unique database to create before test start. https://github.com/apache/impala/blob/dc9066e7b29208e6dfdf8276a56c9045bd859637/tests/conftest.py#L319-L340 unique_database fixture is convenient because it cleans up the temporary databases once test is done, whether test is success or failed. So technically, you can annotate like this: @UniqueDatabase.parametrize(num_dbs=4, sync_ddl=True) def test_rename_table_hep_two_db_and_table_threads(self, unique_database): impala_src_db = unique_database impala_target_db = unique_database + '2' hive_src_db = unique_database + '3' hive_target_db = unique_database + '4' Same comment for test_rename_table_hep_one_db_and_two_table_threads. http://gerrit.cloudera.org:8080/#/c/21031/34/tests/custom_cluster/test_events_custom_configs.py@1445 PS34, Line 1445: rename_with_hierarchical_event_process I think 'rename_table_with_hierarchical_event_process' is more accurate? -- To view, visit http://gerrit.cloudera.org:8080/21031 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6 Gerrit-Change-Number: 21031 Gerrit-PatchSet: 34 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Comment-Date: Tue, 21 Jan 2025 22:34:12 +0000 Gerrit-HasComments: Yes
