Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23155 )

Change subject: IMPALA-14074: Warmup metadata cache in catalogd for critical 
tables
......................................................................


Patch Set 4:

(12 comments)

Addressed the comments and add support for

* wildcards like "tpch.*".
* ignore comments in config file
* schedule table loading based on their order in the config file.

http://gerrit.cloudera.org:8080/#/c/23155/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23155/2//COMMIT_MSG@34
PS2, Line 34: The flag is --warmup_tables_config_file.
Yeah, table list could be huge so using a config file is better. I'll add the 
wildcard support to specify all tables in a db.

> It looks like we're only supporting loading tables, what about functions?

Functions are always loaded and there is no way to invalidate a function from 
the cache.


http://gerrit.cloudera.org:8080/#/c/23155/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/23155/2/be/src/catalog/catalog-server.cc@298
PS2, Line 298: keeps_warmup_tables_loaded
> Create validator for this flag against invalidate_tables_on_memory_pressure
It depends on the heap size and how large of the memory the warmup tables need. 
If users are confident that JVM heap size is large enough, they can turn this 
on.


http://gerrit.cloudera.org:8080/#/c/23155/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/23155/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3325
PS2, Line 3325:           msDb = msClient.getHiveClient().getDatabase(dbName);
              :           catalogTimeline.markEvent(FETCHED_HMS_DB);
              :           Preconditions.checkNotNull(msDb);
              :           addDb(dbName, msDb);
              :
> Create common function for both invalidateTable() and invalidateTableIfExis
Done


http://gerrit.cloudera.org:8080/#/c/23155/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/23155/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2613
PS3, Line 2613:         LOG.info("Scheduled warmup on table {}.{}", 
tTblName.db_name,
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/23155/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3351
PS3, Line 3351:       // The database should always have a lower catalog 
version than the table because
> line too long (113 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/23155/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3390
PS3, Line 3390:     if (loadInBackground_
> line too long (113 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/23155/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/23155/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@1281
PS2, Line 1281: line = line.trim();
> skip a line starting with '#' so that user could temporarily comment out a
Good point! Done.


http://gerrit.cloudera.org:8080/#/c/23155/2/fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
File fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java:

http://gerrit.cloudera.org:8080/#/c/23155/2/fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java@243
PS2, Line 243: for (String table : tables) {
> could we read table names from warmup_table_list.txt? otherwise this tables
We need to parse the config file which is actually the code we are testing 
here. So I think it's OK to hard-code the results.


http://gerrit.cloudera.org:8080/#/c/23155/2/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

http://gerrit.cloudera.org:8080/#/c/23155/2/testdata/bin/create-load-data.sh@229
PS2, Line 229:
This should be ${TMP_DIR}/warmup_table_list.txt. The wrong patch causes test 
failures in https://jenkins.impala.io/job/gerrit-verify-dryrun/12138/


http://gerrit.cloudera.org:8080/#/c/23155/2/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/23155/2/tests/common/impala_test_suite.py@612
PS2, Line 612:   def get_var_current_val(self, var):
             :     """Returns the current value of a given Impalad flag 
variable."""
             :     # Parse the /varz endpoint to get the flag information.
             :     varz = self.get_debug_page(VARZ_URL)
             :     assert 'flags' in varz.keys()
             :     filtered_varz = [flag for flag in varz['flags'] if 
flag['name'] == var]
             :     assert len(filtered_varz) == 1
             :     assert 'current' in filtered_varz[0].keys()
             :     return filtered_varz[0]['current'].strip()
> Can you embed this into CatalogdService in impala_cluster.py?
Good point! Done.


http://gerrit.cloudera.org:8080/#/c/23155/2/tests/custom_cluster/test_catalogd_ha.py
File tests/custom_cluster/test_catalogd_ha.py:

http://gerrit.cloudera.org:8080/#/c/23155/2/tests/custom_cluster/test_catalogd_ha.py@533
PS2, Line 533: "
> use FILESYSTEM_PREFIX ?
Good catch!


http://gerrit.cloudera.org:8080/#/c/23155/2/tests/custom_cluster/test_restart_services.py
File tests/custom_cluster/test_restart_services.py:

http://gerrit.cloudera.org:8080/#/c/23155/2/tests/custom_cluster/test_restart_services.py@1132
PS2, Line 1132:
> What if test environment is against Ozone or S3?
Replaced this with FILESYSTEM_PREFIX.



--
To view, visit http://gerrit.cloudera.org:8080/23155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d09eae1f12a8acd2de945984d956d11eeee1ab6
Gerrit-Change-Number: 23155
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Fri, 11 Jul 2025 12:03:00 +0000
Gerrit-HasComments: Yes

Reply via email to