Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 )
Change subject: IMPALA-10613: Standup HMS thrift server in Catalog ...................................................................... Patch Set 3: (23 comments) http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/3/fe/src/main/java/org/apache/impala/catalog/CatalogHMSAPIHelper.java@141 PS3, Line 141: deepCopy > I think this won't have significant impact on performance. But if we do so, I think what you are saying makes sense. May be a better way for now is to just get rid of the deepCopy and add a comment here to make sure in case this assumption changes in the future, we should make a copy of the table/partition before modifying it here. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@91 PS4, Line 91: > I am curious about this value..should this depend on the number of files fo Yes, ideally it should be proportional to the number of directories. We don't have good heuristics on how much do we typically take to load a given number of directories. I guess it also would be different for different filesystems. Right now this is only for informational purposes. I can add a TODO for this as of now if that is okay with you. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@91 PS4, Line 91: > nit: usually we append _MS if the time units is ms. It is done for the exte renamed to _MS http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@126 PS4, Line 126: > The comment above says to check the processor engine is set to Impala but t Added a check to make sure that the engine is set to Impala. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@135 PS4, Line 135: > For transactional tables, should we have a precondition that the ValidWrite Good point. However, at this point we don't know if the table is transactional or not. I added a check later down below to make sure that if the table is transactional, we have a ValidWriteIdList for it in the request. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@205 PS4, Line 205: > nit: the defaultCatalog is missing in the params. Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@225 PS4, Line 225: > nit: move this TODO to line 275. This method is different way to get a list of partitions which match a given partition expression. In case of getPartitionsByNames the client is expected to provide list of names while in this case the client provides a expression like partKey != null && partKey > 10. The method first retrieves all the partitions and then returns the ones which match the expression below. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@299 PS4, Line 299: : : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@312 PS4, Line 312: : : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@365 PS4, Line 365: > Need to update this as well. Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@414 PS4, Line 414: : : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@491 PS4, Line 491: : : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3318 PS4, Line 3318: } > nit: keep the blank line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java@50 PS4, Line 50: > nit: remove 'the' Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@57 PS4, Line 57: //TODO add > It says TODO here but if --enable_catalogd_hms_cache is true, this patch do Thanks for catching this. The comment was stale and I updated it. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@105 PS4, Line 105: public void preServe() { : : } > nit: one line Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@182 PS4, Line 182: TODO Add SASL and ssl support > File an upstream JIRA for this? Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@205 PS4, Line 205: //TODO add config for this? : boolean useCompactProtocol = false; > Will setting this to true help on large tables? Do you plan to file an upst Yes, I can't find documentation of how much improvement TCompactProtocol makes over TBinaryProtocol. I can see some references where users say there is 20-25% reduction in the payload sizes but it might be very workload dependent. Nevertheless, having it configurable makes sense. Created IMPALA-10639 http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@250 PS4, Line 250: metaserver > nit: At other places you have used 'metastore server' or 'metastore service Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@265 PS4, Line 265: * @return > nit: remove this Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@146 PS4, Line 146: import org.apache.hadoop.hive.metastore.api.MaxAllocatedTableWriteIdRequest; > This doesn't do any logging of the request (unlike the get_partition_by_exp Thanks for catching that. Added similar logging. http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1342 PS4, Line 1342: getPartitionsByNamesRequest.getTbl_name(), > line too long (93 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java File fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java: http://gerrit.cloudera.org:8080/#/c/17244/4/fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java@25 PS4, Line 25: > line too long (94 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Comment-Date: Tue, 06 Apr 2021 21:19:09 +0000 Gerrit-HasComments: Yes