Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20574 )
Change subject: IMPALA-7131: Support external data sources in LocalCatalog mode ...................................................................... Patch Set 5: (8 comments) http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3664 PS5, Line 3664: Lists.newArrayList() nit: use Collections.emptyList() to avoid creating new array objects. http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3675 PS5, Line 3675: Lists.newArrayList() nit: use Collections.emptyList() http://gerrit.cloudera.org:8080/#/c/20574/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/20574/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@671 PS3, Line 671: if (dsName == null) dsName = ""; : req.object_desc.setData_source(new TDataSource(dsName, "", "", "")); : return req; : } : : @Override : public Database loadDb(final String dbName) throws TException { : return loadWithCaching("database metadata for " + dbName, : DB_METADATA_STATS_CATEGORY, > Changed code as suggested. Ack http://gerrit.cloudera.org:8080/#/c/20574/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1233 PS3, Line 1233: TGetPartialCatalogObjectResponse resp = sendRequest(req); > It's unused now. Plan to add more fields in TDataSource object after making I see. I think the DataSource name list could be inconsistent with the DataSource object list since they are fetched at different time. E.g. the cache might have a name list of 3 items but the object list is stale and just have 2 items. We might need additional invalidation across them, i.e. when fetched the name list, invalidate the object list, and vice versus. If we won't add too many large fields in TDataSource object, maybe using one list is simpler. E.g. just like how we cache functions, we cache the function name list of each db and cache each individual function object. http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java: http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@232 PS5, Line 232: LOG.info("Unable to load DataSource objects, " + e.getMessage()); nit: the stacktrace might be helpful, so let's log it as LOG.info("Unable to load DataSource objects", e); http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@244 PS5, Line 244: LOG.info("DataSource not found: " + dsName + ", " + e.getMessage()); nit: log the stacktrace LOG.info("DataSource not found: " + dsName, e); http://gerrit.cloudera.org:8080/#/c/20574/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test File testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test: http://gerrit.cloudera.org:8080/#/c/20574/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test@5 PS5, Line 5: CREATE DATA SOURCE AllTypesDataSource I see this is created in testdata/bin/create-ext-data-source-table.sql. Are we re-creating it to test creating data sources? https://github.com/apache/impala/blob/c244aadcf367360e52807a84e7fba8b6237651fd/testdata/bin/create-ext-data-source-table.sql#L24 http://gerrit.cloudera.org:8080/#/c/20574/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test@184 PS5, Line 184: DROP DATA SOURCE AllTypesDataSource; Wouldn't this impact other tests that depend on this data source? Or is this the only test that use it? -- To view, visit http://gerrit.cloudera.org:8080/20574 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I40841c9be9064ac67771c4d3f5acbb3b552a2e55 Gerrit-Change-Number: 20574 Gerrit-PatchSet: 5 Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Anonymous Coward <gsi...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Comment-Date: Wed, 25 Oct 2023 06:45:21 +0000 Gerrit-HasComments: Yes