Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11053 )
Change subject: IMPALA-7203. Support UDFs in LocalCatalog ...................................................................... Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java File fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java: http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java@75 PS3, Line 75: If the value is 'null', indicates that the function exists but : * has not been loaded yet. seems to have been replaced by the bit in FunctionOverloads that specifies if loading is needed. perhaps stale? http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java@190 PS3, Line 190: private void loadFunctionNames() { lemme see if I've got this correct. each query has its own LocalCatalog, therefore its own LocalDb. the provider has a lifetime that spans all LocalCatalogs. As a result, any query that uses any udf will wind up fetching all function names. a caching provider can help with this. http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java@64 PS3, Line 64: Load Retrieve? Unclear if this has a side-effect. http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java@69 PS3, Line 69: from the HMS remove since its specific for the provider implementation? http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java@71 PS3, Line 71: org.apache.hadoop.hive.metastore.api. remove the fully qualified name for consistency with the other api's here. http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/util/FunctionUtils.java File fe/src/main/java/org/apache/impala/util/FunctionUtils.java: http://gerrit.cloudera.org:8080/#/c/11053/3/fe/src/main/java/org/apache/impala/util/FunctionUtils.java@84 PS3, Line 84: // TODO(todd): cache these jars based on the mtime and file ID of the : // remote JAR? Can we share a cache with the backend? > agreed, we should re-use FeSupport CacheJar and related methods. however, a clarification.... the code that was refactored seems to be used on the catalogd end. the localdb otoh lives in the impalad. in that case, there's a lib-cache available so we should use it so that we avoid copying jars around multiple times to the same host when functions are used in the frontend and the backend. this redundancy will only be there when an impalad is both a coordinator and an executor. while this should be less likely, the combo is still supported. -- To view, visit http://gerrit.cloudera.org:8080/11053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6130d07b9c641525382a618a9f8da048c7ae75ed Gerrit-Change-Number: 11053 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Comment-Date: Fri, 03 Aug 2018 16:41:51 +0000 Gerrit-HasComments: Yes