[Impala-ASF-CR] IMPALA-10637: Fixes bug in ValidWriteIdList comparison
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17276 ) Change subject: IMPALA-10637: Fixes bug in ValidWriteIdList comparison .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idaa4bcdbda1757a6451122efc505d1d483c879cc Gerrit-Change-Number: 17276 Gerrit-PatchSet: 4 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anonymous Coward (646) Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 06:08:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10643: Allow the inclusion of jetty-client
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17287 ) Change subject: IMPALA-10643: Allow the inclusion of jetty-client .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8523/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9087b7e6866f1500c66f42a74b3f8619e82c3bda Gerrit-Change-Number: 17287 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 04:17:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10643: Allow the inclusion of jetty-client
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17287 ) Change subject: IMPALA-10643: Allow the inclusion of jetty-client .. Patch Set 1: Hi Joe, please review this patch and let me know if there is any suggestion or comment. Thank you very much! -- To view, visit http://gerrit.cloudera.org:8080/17287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9087b7e6866f1500c66f42a74b3f8619e82c3bda Gerrit-Change-Number: 17287 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 03:57:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10643: Allow the inclusion of jetty-client
Fang-Yu Rao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17287 Change subject: IMPALA-10643: Allow the inclusion of jetty-client .. IMPALA-10643: Allow the inclusion of jetty-client We excluded all artifacts under the group of org.eclipse.jetty for ranger-plugins-audit when bumping up CDP_BUILD_NUMBER to 4493826, which is too stringent in that the artifact of jetty-client is actually required by ranger-plugins-audit. Excluding jetty-client would thus result in a NoClassDefFoundError at runtime. This patch removes the block that excluded all artifacts of org.eclipse.jetty when the dependency of ranger-plugins-audit is added, which allows ranger-plugins-audit to pull in jetty-client, which in turn pulls in jetty-http and jetty-io. In this regard, we also add these three artifacts as allowed dependencies because all artifacts under org.eclipse.jetty are banned in the section of bannedDependencies. Testing: - Verified in a local development environment that Impala could build and that jetty-client-9.4.31.v20200723.jar is indeed on the class path in fe/target/build-classpath.txt. Change-Id: I9087b7e6866f1500c66f42a74b3f8619e82c3bda --- M fe/pom.xml 1 file changed, 8 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/17287/1 -- To view, visit http://gerrit.cloudera.org:8080/17287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9087b7e6866f1500c66f42a74b3f8619e82c3bda Gerrit-Change-Number: 17287 Gerrit-PatchSet: 1 Gerrit-Owner: Fang-Yu Rao
[Impala-ASF-CR] IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17254 ) Change subject: IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8522/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17254 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I858f82f773023bd0aea14543f18bd74071758468 Gerrit-Change-Number: 17254 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 03:32:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17254 ) Change subject: IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7053/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17254 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I858f82f773023bd0aea14543f18bd74071758468 Gerrit-Change-Number: 17254 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 03:13:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17254 ) Change subject: IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions .. Patch Set 5: Code-Review+2 Went back to the toolchain without JWT. Carrying +2 -- To view, visit http://gerrit.cloudera.org:8080/17254 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I858f82f773023bd0aea14543f18bd74071758468 Gerrit-Change-Number: 17254 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 03:12:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10629: Fix parquet compression codecs for data load scripts
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17259 ) Change subject: IMPALA-10629: Fix parquet compression codecs for data load scripts .. Patch Set 7: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/17259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a346de3e5c4e38328e5a8ce8162697b7dd6553a Gerrit-Change-Number: 17259 Gerrit-PatchSet: 7 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 03:12:20 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17277 ) Change subject: IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions .. IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions This updates several compression libraries to their latest versions: - Bzip2 1.0.8 - LZ4 1.9.3 - Snappy 1.1.8 - Zlib 1.2.11 - ZStd 1.4.9 Several of these have minor performance improvements. Change-Id: I8db549dd5d5778c39e8a48ad378c652a81f968ca Reviewed-on: http://gerrit.cloudera.org:8080/17277 Reviewed-by: Joe McDonnell Tested-by: Joe McDonnell --- M buildall.sh A source/bzip2/bzip2-1.0.8-patches/001-adjust-makefile-to-env-variables.diff A source/bzip2/bzip2-1.0.8-patches/002-directoryless-executable-symlinks.diff M source/lz4/build.sh M source/snappy/build.sh M source/zstd/build.sh 6 files changed, 125 insertions(+), 17 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I8db549dd5d5778c39e8a48ad378c652a81f968ca Gerrit-Change-Number: 17277 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Joe McDonnell
[native-toolchain-CR] IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17277 ) Change subject: IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8db549dd5d5778c39e8a48ad378c652a81f968ca Gerrit-Change-Number: 17277 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 03:12:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17254 to look at the new patch set (#5). Change subject: IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions .. IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions This updates several compression libraries to their latest versions: - Bzip2 1.0.8 - LZ4 1.9.3 - Snappy 1.1.8 - Zlib 1.2.11 - ZStd 1.4.9 Several of these claim minor performance improvements. Testing: - Ran release exhaustive job and debug core job - Ran TPC-H scale 42 with Parquet/Snappy and Parquet/ZSTD. (ZSTD tests ran with default compression level.) Parquet/Snappy was unchanged. Parquet/ZSTD improved:
+----------+------------------------+---------+------------+------------+----------------+
| Workload | File Format            | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+------------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / zstd / block | 8.50    | -2.10%     | 5.46       | -2.63%         |
+----------+------------------------+---------+------------+------------+----------------+
Change-Id: I858f82f773023bd0aea14543f18bd74071758468 --- M be/src/util/compress.cc M be/src/util/decompress.cc M bin/impala-config.sh 3 files changed, 6 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/17254/5 -- To view, visit http://gerrit.cloudera.org:8080/17254 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I858f82f773023bd0aea14543f18bd74071758468 Gerrit-Change-Number: 17254 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell
[native-toolchain-CR] IMPALA-10488: Add jwt-cpp 0.5.0 to the toolchain
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17118 ) Change subject: IMPALA-10488: Add jwt-cpp 0.5.0 to the toolchain .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17118/3/source/jwt-cpp/build.sh File source/jwt-cpp/build.sh: http://gerrit.cloudera.org:8080/#/c/17118/3/source/jwt-cpp/build.sh@33 PS3, Line 33: # jwt-cpp is currently header-only, so it really is only copying files around > Should we add header only libraries to native toolchain? I think it is fairly harmless for native toolchain to have header only libraries. There are some nice properties about doing this through the native toolchain: - native toolchain builds on all the Linux versions. If the header-only projects have validations through CMake, then those validations confirm that there isn't anything incompatible with all the Linux versions we support. - The version of the library is clear. Any patches we do on top of it are kept separately. I think I prefer this to the be/src/thirdparty approach. I think there are options that may get the best of both. There are some CMake extensions that can go fetch a project as of a particular revision and build it (see https://cmake.org/cmake/help/latest/module/ExternalProject.html ). If we had something like that working in the Impala build system, I would use it. I'm going to split this off from the stack with the compression changes since I don't need it yet. -- To view, visit http://gerrit.cloudera.org:8080/17118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77aa3b36b45e8ef3c2d7873327948197c2c65d11 Gerrit-Change-Number: 17118 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 03:10:22 + Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17277 ) Change subject: IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions .. Patch Set 2: Code-Review+2 Pulled this change out from the stack that had the JWT change. Nothing else changed, carrying +2 -- To view, visit http://gerrit.cloudera.org:8080/17277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8db549dd5d5778c39e8a48ad378c652a81f968ca Gerrit-Change-Number: 17277 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Thu, 08 Apr 2021 02:50:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17284 ) Change subject: IMPALA-10645: Log catalogd HMS API metrics .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8521/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287 Gerrit-Change-Number: 17284 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 08 Apr 2021 01:32:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17284 ) Change subject: IMPALA-10645: Log catalogd HMS API metrics .. Patch Set 2: (11 comments)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java:
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@142 PS2, Line 142: catalog, reqBuilder.build(), dbName, tblName, HmsApiNameEnum.GET_TABLE_REQ.apiName()); line too long (94 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@245 PS2, Line 245: catalogReq.build(), dbName, tblName, HmsApiNameEnum.GET_PARTITION_BY_EXPR.apiName()); line too long (93 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@367 PS2, Line 367: requestBuilder.build(), dbName, tblName, HmsApiNameEnum.GET_PARTITION_BY_NAMES.apiName()); line too long (98 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2221 PS2, Line 2221: .getCounter(String.format(CatalogMetastoreServer.CATALOGD_CACHE_API_HIT_METRIC, reason)) line too long (106 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2231 PS2, Line 2231: // Update the cache miss metric, as the valid write id list did not match and we have to reload the table. line too long (114 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2233 PS2, Line 2233: .getCounter(String.format(CatalogMetastoreServer.CATALOGD_CACHE_API_MISS_METRIC, reason)) line too long (105 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2067 PS2, Line 2067: Counter misses = CatalogMonitor.INSTANCE.getCatalogdHmsCacheMetrics().getCounter(FILEMETADATA_CACHE_MISS_METRIC); line too long (117 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2068 PS2, Line 2068: Counter hits = CatalogMonitor.INSTANCE.getCatalogdHmsCacheMetrics().getCounter(FILEMETADATA_CACHE_HIT_METRIC); line too long (114 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@122 PS2, Line 122: LOG.info(String.format(HMS_FALLBACK_MSG_FORMAT, HmsApiNameEnum.GET_PARTITION_BY_EXPR.apiName(), tblName)); line too long (110 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java File fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java:
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java@21 PS2, Line 21: * HmsApiNameEnum has list of names of HMS APIs that will be served from CatalogD HMS cache, line too long (92 > 90)
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:
http://gerrit.cloudera.org:8080/#/c/17284/2/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1304 PS2, Line 1304: LOG.info(String.format(HMS_FALLBACK_MSG_FORMAT, HmsApiNameEnum.GET_PARTITION_BY_NAMES.apiName(), tblName)); line too long (111 > 90)
-- To view, visit http://gerrit.cloudera.org:8080/17284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287 Gerrit-Change-Number: 17284 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 08 Apr 2021 01:13:32 + Gerrit-HasComments: Yes
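The "line too long (N > 90)" findings in the message above can be reproduced locally with a few lines of Python. This is only a minimal sketch of such a check, not the script the Jenkins job actually runs: the 90-character limit is taken from the findings themselves, while the names `find_long_lines` and `format_finding` are hypothetical.

```python
# Limit implied by the "(94 > 90)" style findings quoted above.
MAX_LINE_LENGTH = 90


def find_long_lines(text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for every line longer than `limit`."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        length = len(line)
        if length > limit:
            findings.append((number, length))
    return findings


def format_finding(length, limit=MAX_LINE_LENGTH):
    """Render a finding in the same wording the review comments use."""
    return "line too long ({} > {})".format(length, limit)
```

Running `find_long_lines` over the text of a changed file before uploading a patch set would flag the same kind of lines, assuming the real check counts characters the same way.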
[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics
Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17284 Change subject: IMPALA-10645: Log catalogd HMS API metrics .. IMPALA-10645: Log catalogd HMS API metrics Expose rpc duration, cache hit ratio, etc for Catalogd HMS APIs. The metrics currently are only logged at trace level. A followup will be done separately to expose them to the debug UI. This patch was originally contributed by Kishen Das. Testing: 1. Deployed the catalogd's metastore server and made sure that the metrics are logged in the catalogd.INFO logs. Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287 --- M common/thrift/JniCatalog.thrift M common/thrift/metrics.json M fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java M fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java M fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/catalog/monitor/CatalogMonitor.java 12 files changed, 481 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/17284/2 -- To view, visit http://gerrit.cloudera.org:8080/17284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287 Gerrit-Change-Number: 17284 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar
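The cache hit ratio mentioned in the commit message above is presumably derived from a pair of hit and miss counters, as the metric names in the review comments suggest. A minimal sketch of that arithmetic, assuming a simple in-memory counter pair; the class `CacheMetrics` and its methods are hypothetical and are not Impala's `CatalogMonitor` API:

```python
class CacheMetrics:
    """Hypothetical hit/miss counter pair; not part of Impala."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        # Count one lookup as either a cache hit or a cache miss.
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    def hit_ratio(self):
        # Fraction of lookups served from the cache; 0.0 when nothing was recorded.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```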
[Impala-ASF-CR] IMPALA-10637: Fixes bug in ValidWriteIdList comparison
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17276 ) Change subject: IMPALA-10637: Fixes bug in ValidWriteIdList comparison .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idaa4bcdbda1757a6451122efc505d1d483c879cc Gerrit-Change-Number: 17276 Gerrit-PatchSet: 4 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anonymous Coward (646) Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 01:01:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts an HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service, except for the following 3 HMS APIs, which can be used to request a table and its partitions: 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add support for SASL. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch don't break Catalog's HMS service. Testing: 1.
Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Reviewed-on: http://gerrit.cloudera.org:8080/17244 Reviewed-by: Quanlong Huang Tested-by: Vihang Karajgaonkar --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M 
fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A tests/custom_cluster/test_metastore_service.py 24 files changed, 5,398 insertions(+), 22 deletions(-) Approvals: Quanlong Huang: Looks good to me, approved Vihang Karajgaonkar: Verified -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 11 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
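The interplay of the two flags described in the commit message above (enable_catalogd_hms_cache and fallback_to_hms_on_errors) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in CatalogMetastoreServiceHandler: the function `serve_metadata`, the exception `CatalogCacheError`, and the two callables are hypothetical, and the behaviour when the fallback is disabled (the error is propagated) is an assumption.

```python
class CatalogCacheError(Exception):
    """Hypothetical error for a failed catalogd-side lookup (not an Impala class)."""


def serve_metadata(from_catalogd, from_hms, *, enable_catalogd_hms_cache,
                   fallback_to_hms_on_errors):
    """Answer a request from catalogd when allowed, otherwise from the backing HMS.

    `from_catalogd` and `from_hms` are zero-argument callables standing in for
    the two ways of producing the response.
    """
    # With the cache disabled, the request is passed straight through to HMS.
    if not enable_catalogd_hms_cache:
        return from_hms()
    try:
        return from_catalogd()
    except CatalogCacheError:
        # Assumption: without the fallback flag the error reaches the caller.
        if not fallback_to_hms_on_errors:
            raise
        return from_hms()
```

Passing the two sources as callables keeps the sketch independent of any thrift types; only the three APIs listed in the commit message would take this path, with everything else being pass-through.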
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 10: Verified+1 Carrying forward the Verified +1 vote from PS9 since there was only a new comment added to a test between PS9 and PS10. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 01:01:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 9 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 00:46:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 00:39:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10637: Fixes bug in ValidWriteIdList comparison
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17276 ) Change subject: IMPALA-10637: Fixes bug in ValidWriteIdList comparison .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7052/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idaa4bcdbda1757a6451122efc505d1d483c879cc Gerrit-Change-Number: 17276 Gerrit-PatchSet: 4 Gerrit-Owner: Sourabh Goyal Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anonymous Coward (646) Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sourabh Goyal Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 00:22:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10637: Fixes bug in ValidWriteIdList comparison
soura...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/17276 ) Change subject: IMPALA-10637: Fixes bug in ValidWriteIdList comparison .. Patch Set 4: Code-Review+1 Carrying forward +1 from Kishen. -- To view, visit http://gerrit.cloudera.org:8080/17276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idaa4bcdbda1757a6451122efc505d1d483c879cc Gerrit-Change-Number: 17276 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anonymous Coward (646) Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Thu, 08 Apr 2021 00:11:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10619: Minor refactoring of analytic function methods
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17237 ) Change subject: IMPALA-10619: Minor refactoring of analytic function methods .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8520/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 Gerrit-Change-Number: 17237 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 07 Apr 2021 23:03:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10619: Minor refactoring of analytic function methods
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17237 ) Change subject: IMPALA-10619: Minor refactoring of analytic function methods .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17237/2/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java: http://gerrit.cloudera.org:8080/#/c/17237/2/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@673 PS2, Line 673: > line has trailing whitespace Done -- To view, visit http://gerrit.cloudera.org:8080/17237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 Gerrit-Change-Number: 17237 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 07 Apr 2021 22:42:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10619: Minor refactoring of analytic function methods
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17237 to look at the new patch set (#3). Change subject: IMPALA-10619: Minor refactoring of analytic function methods .. IMPALA-10619: Minor refactoring of analytic function methods The FIRST_VALUE and LAST_VALUE functions go through a standardization process in AnalyticExpr, where they may be rewritten with a different number of parameters or with a different window frame. In order for an external FE to leverage this standardization, this patch creates a wrapper method for FunctionCallExpr creation and does minor refactoring. It also adds accessor methods to AnalyticEvalNode and changes the visibility of a couple of methods in PlanNode for use by an external FE. Testing: Ran PlannerTests. No new tests are added since this does not change the existing behavior. Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 --- M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java 3 files changed, 18 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/17237/3 -- To view, visit http://gerrit.cloudera.org:8080/17237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 Gerrit-Change-Number: 17237 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-10619: Minor refactoring of analytic function methods
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17237 ) Change subject: IMPALA-10619: Minor refactoring of analytic function methods .. Patch Set 2: (2 comments) Responses below. Also, in PS 2, I added 3 accessor methods and changed modifiers which were needed for analytic functions. http://gerrit.cloudera.org:8080/#/c/17237/1/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java File fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java: http://gerrit.cloudera.org:8080/#/c/17237/1/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java@280 PS1, Line 280: new FunctionCallExpr("if", ifParams) > Should this be wrapped as well? Pls see related response below. http://gerrit.cloudera.org:8080/#/c/17237/1/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java@833 PS1, Line 833: protected FunctionCallExpr createRewrittenFunction(FunctionName funcName, > I'm ok if we don't have a better way. Just concerns that whether future cha Sorry, was distracted by some other issues. I did look into wrapping those 3 invocations you mentioned. The problem is those are static methods whereas the createRewrittenFunction() needs to be an instance method. Your point about future changes causing some breakage in the external frontend is valid. In the short term, the expectation is that we will run the tests ourselves to detect potential issues on a regular basis but in the medium term we do want to address it in a general manner such that code compilation itself would catch the issue instead of testing. 
-- To view, visit http://gerrit.cloudera.org:8080/17237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 Gerrit-Change-Number: 17237 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 07 Apr 2021 22:40:31 + Gerrit-HasComments: Yes
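The static-versus-instance point in the reply above can be illustrated with a small sketch. It is written in Python rather than Java, and every class and method name in it is hypothetical, not taken from AnalyticExpr; it only shows why a rewrite performed inside a static helper cannot be redirected by a subclass, whereas one routed through an instance method can.

```python
class AnalyticExprSketch(object):
    """Hypothetical stand-in for an expression class with a rewrite step."""

    def create_rewritten_function(self, name, params):
        # Overridable hook: an external frontend can subclass and replace this.
        return ("builtin", name, tuple(params))

    def standardize(self):
        # Instance code path: dispatches to whichever override is in effect.
        return self.create_rewritten_function("last_value", ["x"])

    @staticmethod
    def static_rewrite(name, params):
        # Static code path: there is no 'self', so the hook above is
        # unreachable and a subclass cannot change what gets built here.
        return ("builtin", name, tuple(params))


class ExternalFrontendExpr(AnalyticExprSketch):
    """Hypothetical external-frontend subclass overriding the hook."""

    def create_rewritten_function(self, name, params):
        return ("external", name, tuple(params))


expr = ExternalFrontendExpr()
instance_result = expr.standardize()
static_result = ExternalFrontendExpr.static_rewrite("if", ["a", "b", "c"])
```

Here `instance_result` is built by the subclass hook while `static_result` is not, which is the kind of silent divergence the reviewer was concerned about.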
[Impala-ASF-CR] IMPALA-10619: Minor refactoring of analytic function methods
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17237 ) Change subject: IMPALA-10619: Minor refactoring of analytic function methods .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8519/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 Gerrit-Change-Number: 17237 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 07 Apr 2021 22:33:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10619: Minor refactoring of analytic function methods
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17237 ) Change subject: IMPALA-10619: Minor refactoring of analytic function methods .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/17237/2/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java: http://gerrit.cloudera.org:8080/#/c/17237/2/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@673 PS2, Line 673: line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/17237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 Gerrit-Change-Number: 17237 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 07 Apr 2021 22:14:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10619: Minor refactoring of analytic function methods
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17237 to look at the new patch set (#2). Change subject: IMPALA-10619: Minor refactoring of analytic function methods .. IMPALA-10619: Minor refactoring of analytic function methods The FIRST_VALUE and LAST_VALUE functions go through a standardization process in AnalyticExpr, where they may be rewritten with a different number of parameters or with a different window frame. In order for an external FE to leverage this standardization, this patch creates a wrapper method for FunctionCallExpr creation and does minor refactoring. It also adds accessor methods to AnalyticEvalNode and changes the visibility of a couple of methods in PlanNode for use by an external FE. Testing: Ran PlannerTests. No new tests are added since this does not change the existing behavior. Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 --- M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java 3 files changed, 18 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/17237/2 -- To view, visit http://gerrit.cloudera.org:8080/17237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2 Gerrit-Change-Number: 17237 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-10644: RangerAuthorizationFactory cannot be instantiated
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17282 ) Change subject: IMPALA-10644: RangerAuthorizationFactory cannot be instantiated .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8518/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b6953b84fd28bb75f97516a3b7f40cd0a12af41 Gerrit-Change-Number: 17282 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 07 Apr 2021 21:41:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10644: RangerAuthorizationFactory cannot be instantiated
Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17282 ) Change subject: IMPALA-10644: RangerAuthorizationFactory cannot be instantiated .. IMPALA-10644: RangerAuthorizationFactory cannot be instantiated Earlier, when the GBN was bumped up to 11920537 in commit 1ab1143, some of the Solr dependencies were excluded. This causes initialization errors in RangerAuthorizationFactory. This patch reverts the dependency exclusion to fix the problem. Testing: [WIP] Change-Id: I1b6953b84fd28bb75f97516a3b7f40cd0a12af41 --- M fe/pom.xml 1 file changed, 0 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/17282/1 -- To view, visit http://gerrit.cloudera.org:8080/17282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1b6953b84fd28bb75f97516a3b7f40cd0a12af41 Gerrit-Change-Number: 17282 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8517/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 20:13:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/17244/10/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/10/tests/custom_cluster/test_metastore_service.py@215 PS10, Line 215: e flake8: E722 do not use bare except' -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 19:53:50 + Gerrit-HasComments: Yes
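For readers unfamiliar with flake8's E722: a bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, which is rarely what a test intends. The following is a minimal sketch of the usual remedy, assuming a hypothetical cleanup helper; the actual code at line 215 of test_metastore_service.py is not shown in this thread and may look different.

```python
def close_quietly(resource):
    """Close a resource, tolerating ordinary errors only.

    'close_quietly' and 'resource' are hypothetical names for illustration.
    """
    try:
        resource.close()
    except Exception as e:
        # Naming Exception (instead of a bare 'except:') lets
        # KeyboardInterrupt and SystemExit propagate to the caller.
        print("Ignoring error while closing: %s" % e)
        return False
    return True
```

Catching the narrowest exception type that the call can actually raise is better still when that type is known.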
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts an HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service, except for the following 3 HMS APIs, which can be used to request a table and its partitions: 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req, if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to the HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch do not break Catalog's HMS service. Testing: 1.
Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A 
tests/custom_cluster/test_metastore_service.py 24 files changed, 5,398 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/10 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 10 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/17244/8/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/8/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@157 PS8, Line 157: + " is transactional but it was requested without providing validWriteIdList"); > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/17244/6/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/6/tests/custom_cluster/test_metastore_service.py@277 PS6, Line 277: catalog_hms_client.create_table(self.__get_test_tbl(new_db_name, new_tbl_name, > It'd be helpful to leave a comment that "this won't trigger table metadata Done. Yes, DDL handling needs more thought. I think it would be cleaner to rely on the events to update catalogd instead of out-of-band addition and removal of tables, which could cause failures when the same table is added or removed at a high frequency. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 19:52:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8516/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 19:14:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8515/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 19:10:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7051/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 9 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 18:54:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/17244/8/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/8/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@157 PS8, Line 157: + " is transactional but it was requested without providing validWriteIdList"); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/8/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/8/tests/custom_cluster/test_metastore_service.py@215 PS8, Line 215: e flake8: E722 do not use bare except' -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 18:54:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 8: Latest patch fixes the test failure which was due to unavailable port to start the catalogd's HMS endpoint. The test was modified to use a free port instead of a hard-coded one. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 18:54:31 + Gerrit-HasComments: No
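One common way to pick a free port instead of a hard-coded one is to bind to port 0 and let the operating system choose. The sketch below shows that technique only; `find_free_port` is a hypothetical helper, and the patch may obtain its port differently.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Return a TCP port that was unused at the time of the call."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 asks the OS to assign any available ephemeral port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_free_port()
```

The port is released before the caller uses it, so another process could in principle claim it in between; for a test cluster that small race is usually acceptable, but it is a limitation of this approach.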
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts an HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service, except for the following 3 HMS APIs, which can be used to request a table and its partitions: 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req, if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to the HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch do not break Catalog's HMS service. Testing: 1.
Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A 
tests/custom_cluster/test_metastore_service.py 24 files changed, 5,395 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/8 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/17244/7/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java File fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java: http://gerrit.cloudera.org:8080/#/c/17244/7/fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java@157 PS7, Line 157: + " is transactional but it was requested without providing validWriteIdList"); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17244/7/tests/custom_cluster/test_metastore_service.py File tests/custom_cluster/test_metastore_service.py: http://gerrit.cloudera.org:8080/#/c/17244/7/tests/custom_cluster/test_metastore_service.py@215 PS7, Line 215: e flake8: E722 do not use bare except' -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 18:50:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. IMPALA-10613: Standup HMS thrift server in Catalog This change adds the basic infrastructure to start the HMS server in Catalog. It introduces a new configuration (--start_hms_server) along with a config for the port and starts an HMS thrift server in the CatalogServiceCatalog instance. Currently, all the HMS APIs are "pass-through" to the backing HMS service, except for the following 3 HMS APIs, which can be used to request a table and its partitions: 1. get_table_req 2. get_partitions_by_expr 3. get_partitions_by_names Additionally, there is another flag (--enable_catalogd_hms_cache) which can be used to disable the usage of catalogd for providing the table and partition metadata. This contribution was done by Kishen Das. In case of get_partitions_by_expr we need the hive-exec jar to be present in the classpath since it needs to load the PartitionExpressionProxy to push down the partition predicates to the HMS database. In case of get_table_req if column statistics are requested, we return the table level statistics. Additionally, this patch adds a new configuration fallback_to_hms_on_errors for the catalog which is used to determine if the Catalog falls back to HMS service in case of errors while executing the API. This is useful for testing purposes. In order to expose the file-metadata for the tables and partitions, HMS API changes were made to add the filemetadata fields to table and partitions. In case of transactional tables, the file-metadata which is returned is consistent with the provided ValidWriteIdList in the API call. There are a few TODOs which will be done in follow up tasks: 1. Add SASL support. 2. Pin the hive_metastore.thrift in the code so that any changes to HMS APIs in the hive branch don't break Catalog's HMS service. Testing: 1. 
Added a new end-to-end test which starts the HMS service in Catalog and runs some basic HMS APIs against it. 2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and confirmed most tests are working. There were some test failures but they are unrelated since the test assumes an empty warehouse whereas we run against the actual HMS service running in the mini-cluster. Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 --- M be/src/catalog/catalog-server.cc M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/CatalogService.thrift M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java M tests/common/impala_test_suite.py A 
tests/custom_cluster/test_metastore_service.py 24 files changed, 5,395 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17244/7 -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
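The routing described in the IMPALA-10613 commit message can be sketched as a toy model. This is only an illustration under stated assumptions: `MetastoreDispatcher`, its constructor arguments, and the callables passed in are hypothetical names, not the classes added by the patch. The two boolean arguments stand in for --enable_catalogd_hms_cache and fallback_to_hms_on_errors.

```python
# The three APIs the commit message says are served from catalogd.
CACHED_APIS = {
    "get_table_req",
    "get_partitions_by_expr",
    "get_partitions_by_names",
}


class MetastoreDispatcher:
    """Toy model: route an HMS API call to catalogd or to the backing HMS."""

    def __init__(self, catalog, hms, enable_cache=True, fallback_on_errors=False):
        self.catalog = catalog  # callable(api, request) -> response
        self.hms = hms          # callable(api, request) -> response
        self.enable_cache = enable_cache
        self.fallback_on_errors = fallback_on_errors

    def call(self, api, request):
        # Everything except the three cached APIs is a plain pass-through.
        if not self.enable_cache or api not in CACHED_APIS:
            return self.hms(api, request)
        try:
            return self.catalog(api, request)
        except Exception:
            # Optionally retry against the backing HMS instead of failing.
            if self.fallback_on_errors:
                return self.hms(api, request)
            raise
```

With the fallback turned off, an error from the catalog path surfaces to the caller, which is the behaviour a test would want when checking that the catalog path itself works.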
[Impala-ASF-CR] IMPALA-10637: Fixes bug in ValidWriteIdList comparison
soura...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/17276 ) Change subject: IMPALA-10637: Fixes bug in ValidWriteIdList comparison .. Patch Set 4: > Patch Set 3: Code-Review+1 > > "Because of a bug in ValidWriteIdList comparison, catalogd was not populating > table > cache with more recent changes." > nit in the commit message : "catalogD was not refreshing table metadata in > the cache with more recent changes." . Populating sounds like it's putting > the contents for the first time. Thanks for the review @kishendas I have updated the commit message. -- To view, visit http://gerrit.cloudera.org:8080/17276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idaa4bcdbda1757a6451122efc505d1d483c879cc Gerrit-Change-Number: 17276 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anonymous Coward (646) Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 17:56:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10637: Fixes bug in ValidWriteIdList comparison
Hello Aman Sinha, Vihang Karajgaonkar, Anonymous Coward (646), Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17276 to look at the new patch set (#4). Change subject: IMPALA-10637: Fixes bug in ValidWriteIdList comparison .. IMPALA-10637: Fixes bug in ValidWriteIdList comparison For a transactional table, catalogd compares the previous and current ValidWriteIdList to determine the more recent version of the two and reloads the table cache accordingly. Because of a bug in ValidWriteIdList comparison, catalogd was not refreshing table metadata in the cache with more recent changes. As a result, we were seeing inconsistencies in read after write into the table. Tested by 1. Adding a unit test to compare WriteIdLists. Change-Id: Idaa4bcdbda1757a6451122efc505d1d483c879cc --- M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java 2 files changed, 16 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/17276/4 -- To view, visit http://gerrit.cloudera.org:8080/17276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idaa4bcdbda1757a6451122efc505d1d483c879cc Gerrit-Change-Number: 17276 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anonymous Coward (646) Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar
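To make "more recent" concrete, here is a simplified sketch of ordering two write ID snapshots. It is not the logic in AcidUtils.java: `WriteIdSnapshot`, `compare_snapshots`, and the ordering rule (a higher high watermark wins, and with equal watermarks the snapshot with fewer still-open write IDs wins) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteIdSnapshot:
    """Hypothetical stand-in for a ValidWriteIdList."""
    high_watermark: int
    open_ids: frozenset = frozenset()


def compare_snapshots(a, b):
    """Return 1 if a is newer than b, -1 if older, 0 if equivalent."""
    if a.high_watermark != b.high_watermark:
        return 1 if a.high_watermark > b.high_watermark else -1
    if a.open_ids == b.open_ids:
        return 0
    # Same watermark: assume writes only ever leave the open set, so the
    # snapshot whose open set is a strict subset of the other is newer.
    if a.open_ids < b.open_ids:
        return 1
    if b.open_ids < a.open_ids:
        return -1
    raise ValueError("snapshots do not come from the same write history")
```

A cache would reload the table only when the incoming snapshot compares as newer than the cached one. Getting the sign of this comparison wrong produces the kind of stale read-after-write result the commit message describes.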
[Impala-ASF-CR] IMPALA-10637: Fixes bug in ValidWriteIdList comparison
Anonymous Coward (646) has posted comments on this change. ( http://gerrit.cloudera.org:8080/17276 ) Change subject: IMPALA-10637: Fixes bug in ValidWriteIdList comparison .. Patch Set 3: Code-Review+1 "Because of a bug in ValidWriteIdList comparison, catalogd was not populating table cache with more recent changes." nit in the commit message : "catalogD was not refreshing table metadata in the cache with more recent changes." . Populating sounds like it's putting the contents for the first time. -- To view, visit http://gerrit.cloudera.org:8080/17276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idaa4bcdbda1757a6451122efc505d1d483c879cc Gerrit-Change-Number: 17276 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Anonymous Coward (646) Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 17:44:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10613: Standup HMS thrift server in Catalog
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17244 ) Change subject: IMPALA-10613: Standup HMS thrift server in Catalog .. Patch Set 6: > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7047/ One of the tests which failed was fixed in https://gerrit.cloudera.org/#/c/17248/. I rebased onto that change. Looking into the other failure. -- To view, visit http://gerrit.cloudera.org:8080/17244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5 Gerrit-Change-Number: 17244 Gerrit-PatchSet: 6 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 07 Apr 2021 17:40:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8514/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 07 Apr 2021 16:27:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky
Qifan Chen has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky This change disables the overlap min/max filter test for hdfs in erasure coding, due to the query plan change (from 3-node scan to 2-node scan) which splits the row groups among scan nodes differently. The change also improves how a coordinator behaves when a just arriving min/max filter is the last one to arrive or is always true. Previously, the coordinator disabled the corresponding filter representation by setting it to Always True, which made it impossible to differentiate a true AlwaysTrue filter (say, set in the hash join building step) from the one being disabled. A dedicated Boolean variable minmaxDisabled_ is introduced to record the disabled state. The Always True state of a filter is never altered. The enhancement improves the display of the min and max columns in "Filter routing table" and "Final filter table" in profile. These two columns now display the following possible values. 1. 'PartialUpdates' - The min and the max are partially updated; 2. 'AlwaysTrue' - The filter is always true; 3. 'AlwaysFalse' - The filter is always false; 4. Real values - The filter is neither always true nor always false, and is fully updated with the min/max real values. A third change introduced is to record, in profile for scan node, the arrival time of min/max filters (in elapsed time since the system was booted, obtained by calling MonotonicMillis()). It can help the diagnosis of late arrival of filters, when compared with the elapsed time when a row group is filtered with these filters. Testing: 1. Ran unit tests; 2. Ran core tests. 
Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M tests/query_test/test_runtime_filters.py 7 files changed, 112 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/17252/10 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins
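The separation of "disabled" from "always true", together with the four display values listed in the commit message, can be sketched as follows. This is a Python illustration of the idea only, not the C++ in coordinator-filter-state.h. The field names, the `pending_updates` counter, and the order in which the states are checked are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MinMaxFilterState:
    """Hypothetical coordinator-side view of one min/max filter."""
    always_true: bool = False
    always_false: bool = False
    disabled: bool = False      # plays the role of minmaxDisabled_
    pending_updates: int = 0    # updates still expected from backends
    min_value: Optional[int] = None
    max_value: Optional[int] = None

    def disable(self):
        # Record the disabled state on its own flag; always_true is left
        # untouched so a genuine AlwaysTrue filter stays distinguishable.
        self.disabled = True


def display_min_max(state):
    """Return the (min, max) strings shown in the filter tables."""
    if state.always_true:
        return ("AlwaysTrue", "AlwaysTrue")
    if state.always_false:
        return ("AlwaysFalse", "AlwaysFalse")
    if state.pending_updates > 0:
        return ("PartialUpdates", "PartialUpdates")
    return (str(state.min_value), str(state.max_value))
```

Because `disable()` never touches `always_true`, a disabled filter that has real bounds still displays those bounds rather than "AlwaysTrue".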
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. Patch Set 9: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/8513/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 07 Apr 2021 15:33:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky
Qifan Chen has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky This change disables the overlap min/max filter test for hdfs in erasure coding, due to the query plan change (from 3-node scan to 2-node scan) which splits the row groups among scan nodes differently. The change also improves how a coordinator behaves when a just arriving min/max filter is the last one to arrive or is always true. Previously, the coordinator disabled the corresponding filter representation by setting it to Always True, which made it impossible to differentiate a true AlwaysTrue filter (say, set in the hash join building step) from the one being disabled. A dedicated Boolean variable minmaxDisabled_ is introduced to record the disabled state. The Always True state of a filter is never altered. The enhancement improves the display of the min and max columns in "Filter routing table" and "Final filter table" in profile. These two columns now display the following possible values. 1. 'PartialUpdates' - The min and the max are partially updated; 2. 'AlwaysTrue' - The filter is always true; 3. 'AlwaysFalse' - The filter is always false; 4. Real values - The filter is neither always true nor always false, and is fully updated with the min/max real values. A third change introduced is to record, in profile for scan node, the arrival time of min/max filters (in elapsed time since the system was booted, obtained by calling MonotonicMillis()). It can help the diagnosis of late arrival of filters, when compared with the elapsed time when a row group is filtered with these filters. Testing: 1. Ran unit tests; 2. Ran core tests. 
Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M tests/query_test/test_runtime_filters.py 7 files changed, 112 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/17252/9 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most common types .. Patch Set 24: (9 comments) http://gerrit.cloudera.org:8080/#/c/17026/24//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17026/24//COMMIT_MSG@7 PS24, Line 7: IMPALA-10640: Support reading Parquet Bloom filters - most common types Do we filter fields in complex types as well, e.g. elements of an array? Or only top-level columns? http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/exec/parquet/hdfs-parquet-scanner.cc@887 PS24, Line 887: Status bloom_filter_status = ProcessBloomFilter(row_group, We have query options to disable row group/page index filtering. Probably we should add a query option for bloom filtering as well, so in case there's a bug in the code the users will be able to disable it. http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/exec/parquet/hdfs-parquet-scanner.cc@1483 PS24, Line 1483: ReadToBuffer nit: thanks for adding this member function. Can we use it at ParquetPageIndex::ReadAll() as well? http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/exec/parquet/hdfs-parquet-scanner.cc@1497 PS24, Line 1497: cache_options)); nit: fits previous line http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/exec/parquet/hdfs-parquet-scanner.cc@1502 PS24, Line 1502: nit: +2 indent http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/exec/parquet/hdfs-parquet-scanner.cc@1716 PS24, Line 1716: if (!bloom_filter.Find(hash)) { nit: could you please add some logging at VLOG(3)? E.g. which conjunct filtered which row group in which file? 
http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/exec/parquet/parquet-bloom-filter-util.h File be/src/exec/parquet/parquet-bloom-filter-util.h: http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/exec/parquet/parquet-bloom-filter-util.h@37 PS24, Line 37: May or may not use nit: could you add some details please? When storage is used, and when it isn't? And when is 'ptr' used? http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/kudu/util/block_bloom_filter.cc File be/src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/kudu/util/block_bloom_filter.cc@135 PS24, Line 135: nit: too much indentation http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h File be/src/thirdparty/xxhash/xxhash.h: http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@70 PS24, Line 70: https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html?showComment=1552696407071#c3490092340461170735 > line too long (112 > 90) Probably you could extend EXCLUDE_FILE_PATTERNS to cover thirdparty files in bin/jenkins/critique-gerrit-review.py. -- To view, visit http://gerrit.cloudera.org:8080/17026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 Gerrit-Change-Number: 17026 Gerrit-PatchSet: 24 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 07 Apr 2021 13:29:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10632: Update the Theta sketch serialization interface
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17261 ) Change subject: IMPALA-10632: Update the Theta sketch serialization interface .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17261 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80470863097a4836ee07fe44babaef0c852f3051 Gerrit-Change-Number: 17261 Gerrit-PatchSet: 3 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 07 Apr 2021 12:45:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10632: Update the Theta sketch serialization interface
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17261 ) Change subject: IMPALA-10632: Update the Theta sketch serialization interface .. IMPALA-10632: Update the Theta sketch serialization interface DataSketches 3.0.0 removes the serialization of Update Theta sketch, and uses Compact Theta sketch to serialize for backward compatibility. tests: -Ran the tests from tests/query_test/test_datasketches.py Change-Id: I80470863097a4836ee07fe44babaef0c852f3051 Reviewed-on: http://gerrit.cloudera.org:8080/17261 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/datasketches-common.cc M be/src/exprs/datasketches-functions-ir.cc 3 files changed, 48 insertions(+), 42 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17261 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I80470863097a4836ee07fe44babaef0c852f3051 Gerrit-Change-Number: 17261 Gerrit-PatchSet: 4 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17254 ) Change subject: IMPALA-9997/IMPALA-9998: Upgrade compression libraries to latest versions .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17254 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I858f82f773023bd0aea14543f18bd74071758468 Gerrit-Change-Number: 17254 Gerrit-PatchSet: 4 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 07 Apr 2021 10:41:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most common types .. Patch Set 24: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8512/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 Gerrit-Change-Number: 17026 Gerrit-PatchSet: 24 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 07 Apr 2021 10:39:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most common types .. Patch Set 24: (182 comments) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h File be/src/thirdparty/xxhash/xxhash.h: http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@70 PS24, Line 70: https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html?showComment=1552696407071#c3490092340461170735 line too long (112 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@92 PS24, Line 92: * https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@113 PS24, Line 113: # elif defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */) line too long (104 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@243 PS24, Line 243: # define XXH3_64bits_reset_withSecret XXH_NAME2(XXH_NAMESPACE, XXH3_64bits_reset_withSecret) line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@253 PS24, Line 253: # define XXH3_128bits_reset_withSeed XXH_NAME2(XXH_NAMESPACE, XXH3_128bits_reset_withSeed) line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@254 PS24, Line 254: # define XXH3_128bits_reset_withSecret XXH_NAME2(XXH_NAMESPACE, XXH3_128bits_reset_withSecret) line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@270 PS24, Line 270: #define XXH_VERSION_NUMBER (XXH_VERSION_MAJOR *100*100 + XXH_VERSION_MINOR *100 + XXH_VERSION_RELEASE) line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@429 PS24, Line 429: * @param 
statePtr A pointer to an @ref XXH32_state_t allocated with @ref XXH32_createState(). line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@441 PS24, Line 441: XXH_PUBLIC_API void XXH32_copyState(XXH32_state_t* dst_state, const XXH32_state_t* src_state); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@476 PS24, Line 476: XXH_PUBLIC_API XXH_errorcode XXH32_update (XXH32_state_t* statePtr, const void* input, size_t length); line too long (102 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@628 PS24, Line 628: XXH_PUBLIC_API void XXH64_copyState(XXH64_state_t* dst_state, const XXH64_state_t* src_state); line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@631 PS24, Line 631: XXH_PUBLIC_API XXH_errorcode XXH64_update (XXH64_state_t* statePtr, const void* input, size_t length); line too long (102 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@700 PS24, Line 700: XXH_PUBLIC_API XXH64_hash_t XXH3_64bits_withSeed(const void* data, size_t len, XXH64_hash_t seed); line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@724 PS24, Line 724: XXH_PUBLIC_API XXH64_hash_t XXH3_64bits_withSecret(const void* data, size_t len, const void* secret, size_t secretSize); line too long (120 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@743 PS24, Line 743: XXH_PUBLIC_API void XXH3_copyState(XXH3_state_t* dst_state, const XXH3_state_t* src_state); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@756 PS24, Line 756: XXH_PUBLIC_API XXH_errorcode XXH3_64bits_reset_withSeed(XXH3_state_t* statePtr, XXH64_hash_t seed); line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@766 PS24, Line 
766: XXH_PUBLIC_API XXH_errorcode XXH3_64bits_reset_withSecret(XXH3_state_t* statePtr, const void* secret, size_t secretSize); line too long (121 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@768 PS24, Line 768: XXH_PUBLIC_API XXH_errorcode XXH3_64bits_update (XXH3_state_t* statePtr, const void* input, size_t length); line too long (107 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@791 PS24, Line 791: XXH_PUBLIC_API XXH128_hash_t XXH3_128bits_withSeed(const void* data, size_t len, XXH64_hash_t seed); line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/17026/24/be/src/thirdparty/xxhash/xxhash.h@792 PS24, Line 792: XXH_PUBLIC_API XXH128_hash_t XXH3_128bits_withSecret(const void* data, size_t len, const void* secret, size_t secretSize); line too long (122 > 90)
[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types
Daniel Becker has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most common types .. IMPALA-10640: Support reading Parquet Bloom filters - most common types This change adds read support for Parquet Bloom filters for types that can reasonably be supported in Impala. Other types, such as CHAR(N), would be very difficult to support because the length may be different in Parquet and in Impala which results in truncation or padding, and that changes the hash which makes using the Bloom filter impossible. Write support will be added in a later change. The supported Parquet type - Impala type pairs are the following:
|---------------------------------------|
|Parquet type | Impala type             |
|-------------|-------------------------|
|INT32        | TINYINT, SMALLINT, INT  |
|INT64        | BIGINT                  |
|FLOAT        | FLOAT                   |
|DOUBLE       | DOUBLE                  |
|BYTE_ARRAY   | STRING                  |
|---------------------------------------|
The following types are not supported for the given reasons:
|----------------------------------------------------------------|
|Impala type | Problem                                           |
|------------|---------------------------------------------------|
|VARCHAR(N)  | truncation can change hash                        |
|CHAR(N)     | padding / truncation can change hash              |
|DECIMAL     | multiple encodings supported                      |
|TIMESTAMP   | multiple encodings supported, timezone conversion |
|DATE        | not considered yet                                |
|----------------------------------------------------------------|
Support may be added for these types later, see IMPALA-10641. If a Bloom filter is available for a column that is fully dictionary encoded, the Bloom filter is not used as the dictionary can give exact results in filtering. Testing: - Added tests/query_test/test_parquet_bloom_filter.py that tests whether Parquet Bloom filtering works for the supported types and that we do not incorrectly discard row groups for the unsupported type VARCHAR. The Parquet file used in the test was generated with an external tool. 
- Added unit tests for ParquetBloomFilter in file be/src/util/parquet-bloom-filter-test.cc - A minor, unrelated change was done in be/src/util/bloom-filter-test.cc: the MakeRandom() function had return type uint64_t, the documentation claimed it returned a 64 bit random number, but the actual number of random bits is 32, which is what is intended in the tests. The return type and documentation have been corrected to use 32 bits. Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 --- M LICENSE.txt M be/src/exec/parquet/CMakeLists.txt M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h A be/src/exec/parquet/parquet-bloom-filter-util.cc A be/src/exec/parquet/parquet-bloom-filter-util.h M be/src/exprs/expr-value.h M be/src/exprs/literal.cc M be/src/exprs/literal.h M be/src/kudu/util/block_bloom_filter.cc M be/src/kudu/util/block_bloom_filter.h M be/src/kudu/util/block_bloom_filter_avx2.cc M be/src/runtime/bufferpool/buffer-pool-internal.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h A be/src/thirdparty/xxhash/README.md A be/src/thirdparty/xxhash/xxhash.h M be/src/util/CMakeLists.txt M be/src/util/bloom-filter-test.cc M be/src/util/bloom-filter.cc M be/src/util/bloom-filter.h A be/src/util/impala-bloom-filter-buffer-allocator.cc A be/src/util/impala-bloom-filter-buffer-allocator.h A be/src/util/parquet-bloom-filter-test.cc A be/src/util/parquet-bloom-filter.cc A be/src/util/parquet-bloom-filter.h M bin/rat_exclude_files.txt M bin/run_clang_tidy.sh M common/thrift/parquet.thrift M testdata/data/README A testdata/data/parquet-bloom-filtering.parquet A testdata/workloads/functional-query/queries/QueryTest/parquet-bloom-filter.test A tests/query_test/test_parquet_bloom_filter.py 33 files changed, 7,133 insertions(+), 139 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/17026/24 -- To view, visit http://gerrit.cloudera.org:8080/17026 To unsubscribe, visit 
http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 Gerrit-Change-Number: 17026 Gerrit-PatchSet: 24 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
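
The type-support rules and the dictionary-encoding rule from the IMPALA-10640 commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the enums and the functions `IsSupportedPair` and `ShouldUseBloomFilter` are hypothetical names introduced here, and `FIXED_LEN_BYTE_ARRAY` is included only as an example of a Parquet type that does not appear in the supported list.

```cpp
// Hypothetical enums for illustration; these are not Impala's real type definitions.
enum class ParquetType { INT32, INT64, FLOAT, DOUBLE, BYTE_ARRAY, FIXED_LEN_BYTE_ARRAY };
enum class ImpalaType {
  TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, STRING,
  VARCHAR, CHAR, DECIMAL, TIMESTAMP, DATE
};

// Returns true only for the Parquet type / Impala type pairs that the commit
// message lists as supported for Bloom filter reads.
bool IsSupportedPair(ParquetType parquet_type, ImpalaType impala_type) {
  switch (parquet_type) {
    case ParquetType::INT32:
      return impala_type == ImpalaType::TINYINT
          || impala_type == ImpalaType::SMALLINT
          || impala_type == ImpalaType::INT;
    case ParquetType::INT64:
      return impala_type == ImpalaType::BIGINT;
    case ParquetType::FLOAT:
      return impala_type == ImpalaType::FLOAT;
    case ParquetType::DOUBLE:
      return impala_type == ImpalaType::DOUBLE;
    case ParquetType::BYTE_ARRAY:
      // STRING only: VARCHAR(N) and CHAR(N) are excluded because truncation
      // or padding can change the hash.
      return impala_type == ImpalaType::STRING;
    default:
      return false;
  }
}

// Decides whether a Bloom filter should be consulted for a column.
// A fully dictionary-encoded column skips the Bloom filter because the
// dictionary can give exact filtering results.
bool ShouldUseBloomFilter(ParquetType parquet_type, ImpalaType impala_type,
                          bool has_bloom_filter, bool fully_dict_encoded) {
  if (!has_bloom_filter) return false;
  if (fully_dict_encoded) return false;
  return IsSupportedPair(parquet_type, impala_type);
}
```

Under these assumptions, `ShouldUseBloomFilter(ParquetType::INT64, ImpalaType::BIGINT, true, false)` returns true, while the same call with `fully_dict_encoded` set to true returns false, and any VARCHAR column is rejected regardless of the other flags.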
[native-toolchain-CR] IMPALA-10488: Add jwt-cpp 0.5.0 to the toolchain
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17118 ) Change subject: IMPALA-10488: Add jwt-cpp 0.5.0 to the toolchain .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17118/3/source/jwt-cpp/build.sh File source/jwt-cpp/build.sh: http://gerrit.cloudera.org:8080/#/c/17118/3/source/jwt-cpp/build.sh@33 PS3, Line 33: # jwt-cpp is currently header-only, so it really is only copying files around Should we add header-only libraries to the native toolchain? We had this dilemma when adding data sketches, and in the end we simply copied the header files to https://github.com/apache/impala/tree/master/be/src/thirdparty I don't have a clear preference here; I'm just curious about your opinion. -- To view, visit http://gerrit.cloudera.org:8080/17118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77aa3b36b45e8ef3c2d7873327948197c2c65d11 Gerrit-Change-Number: 17118 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 07 Apr 2021 07:10:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10632: Update the Theta sketch serialization interface
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17261 ) Change subject: IMPALA-10632: Update the Theta sketch serialization interface .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7050/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17261 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80470863097a4836ee07fe44babaef0c852f3051 Gerrit-Change-Number: 17261 Gerrit-PatchSet: 3 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 07 Apr 2021 07:00:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10632: Update the Theta sketch serialization interface
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17261 ) Change subject: IMPALA-10632: Update the Theta sketch serialization interface .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17261 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80470863097a4836ee07fe44babaef0c852f3051 Gerrit-Change-Number: 17261 Gerrit-PatchSet: 3 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 07 Apr 2021 07:00:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10632: Update the Theta sketch serialization interface
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17261 ) Change subject: IMPALA-10632: Update the Theta sketch serialization interface .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17261 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80470863097a4836ee07fe44babaef0c852f3051 Gerrit-Change-Number: 17261 Gerrit-PatchSet: 2 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 07 Apr 2021 07:00:09 + Gerrit-HasComments: No