Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/22640 )
Change subject: IMPALA-13850 (part 2): Implement in-place reset for CatalogD ...................................................................... Patch Set 8: (7 comments) http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/catalog/Catalog.java@90 PS8, Line 90: Uses an AtomicReference so reset() : // operations can atomically swap dbCache_ references. > nit: stale comment Ack. Will remove. http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1402 PS8, Line 1402: (db1, db2) -> db1.getName().compareTo(db2.getName()) > nit: can be simplified to Ack. Will do. http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1420 PS8, Line 1420: waitOngoingResetMetadata(nextDb.getName()); > If reset() runs long, getCatalogDelta() still waits that long and coordinat Not necessarily true. Remember that this Iterable initialized from snapshot list obtained from getAllDbs(). If the very first reset() happen to populate dbCache_ just from Db 'default' to Db 'functional', then temporarily release lock, then followed by getCatalogDelta() taking snapshot, that snapshot will only contain ['default' ... 'functional']. The first getCatalogDelta() will not iterate beyond Db 'functional'. If the at second reset(), it managed to populate dbCache_ up to Db 'tpcds', then the second getCatalogDelta() will work over ['default' ... 'tpcds'] snapshot only. Granted, it might not see Db 'zzzz' until the whole reset() complete, and the subsequent getCatalogDelta() after full reset will wait the longest (because it will see full snapshot). But it won't block initial catalog topic update for too long. You can observe this in active CatalogD log by running test_catalogd_ha_with_two_catalogd and lower --reset_metadata_lock_duration_ms. http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2466 PS8, Line 2466: resetMetadataCondition_.signalAll(); > The current thread is still holding the write lock at this point. Will this This should be OK, as long as signal() is followed by either await() or unlock(). Here is the official example: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/locks/Condition.html I choose to unlock here because I want to evaluate a debug action in-between when not holding the lock. http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@4400 PS8, Line 4400: resp.catalog_info.db_names = ImmutableList.copyOf(dbCache_.keySet()); > Some dbs might be missing here if reset() runs long and hasn't added them i True. But similar like initial getCatalogDelta() scenario, how is it different from situation where Db is being added/removed after this method return? http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/util/DebugUtils.java File fe/src/main/java/org/apache/impala/util/DebugUtils.java: http://gerrit.cloudera.org:8080/#/c/22640/8/fe/src/main/java/org/apache/impala/util/DebugUtils.java@111 PS8, Line 111: lable > nit: "label" Ack. Will edit. http://gerrit.cloudera.org:8080/#/c/22640/8/tests/custom_cluster/test_catalogd_ha.py File tests/custom_cluster/test_catalogd_ha.py: http://gerrit.cloudera.org:8080/#/c/22640/8/tests/custom_cluster/test_catalogd_ha.py@285 PS8, Line 285: @UniqueDatabase.parametrize(name_prefix='aa_test_catalogd_auto_failover_slow') > Can we add another test using a db name that is the last one to be loaded? Ack. Will do. -- To view, visit http://gerrit.cloudera.org:8080/22640 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib4ae2154612746b34484391c5950e74b61f85c9d Gerrit-Change-Number: 22640 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Comment-Date: Tue, 01 Apr 2025 03:09:00 +0000 Gerrit-HasComments: Yes
