[Impala-ASF-CR] IMPALA-10489: Implement JWT support
Wenzhe Zhou has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/17435 ) Change subject: IMPALA-10489: Implement JWT support .. IMPALA-10489: Implement JWT support This patch added JWT support with following functionality: * Load and parse JWKS from pre-installed JSON file. * Read the JWT token from the HTTP Header. * Verify the JWT's signature with puclic key in JWKS. * Get the username out of the payload of JWT token. We use third party library jwt-cpp to verify JWT token. jwt-cpp is a headers only C++ library. It was added to native-toolchain. This patch modified bootstrap_toolchian.py to download jwt-cpp from toolchain s3 bucket, and modified makefiles to add jwt-cpp/include in the include path. Added BE unit-tests for loading JWKS file and verifying JWT token. Also added FE custom cluster test for JWT authentication. Testing: - Passed core run. Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c --- M CMakeLists.txt M be/CMakeLists.txt M be/src/rpc/authentication.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M be/src/util/CMakeLists.txt A be/src/util/jwt-util-test.cc A be/src/util/jwt-util.cc A be/src/util/jwt-util.h M be/src/util/webserver.cc M be/src/util/webserver.h M bin/bootstrap_toolchain.py M bin/impala-config.sh M bin/rat_exclude_files.txt A cmake_modules/FindJwtCpp.cmake M common/thrift/generate_error_codes.py M common/thrift/metrics.json M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java A testdata/jwt/jwks_rs256.json 22 files changed, 1,502 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/17435/9 -- To view, visit http://gerrit.cloudera.org:8080/17435 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Gerrit-Change-Number: 17435 Gerrit-PatchSet: 9 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10489: Implement JWT support
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17435 ) Change subject: IMPALA-10489: Implement JWT support .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/transport/THttpServer.cpp File be/src/transport/THttpServer.cpp: http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/transport/THttpServer.cpp@286 PS6, Line 286: resetAuthState(); : returnUnauthorized(); : throw TTransportException("HTTP auth failed."); : } : } : : if (!authorized && use_jwt_token_ && !auth_value_.empty() : && auth_value_.find('.') != string::npos) { : // Check Authorization header with the Bearer authentication scheme as: : // Authorization: Bearer : // A well-formed JWT consists of three concatenated Base64url-encoded strings, : // se > Thanks for capturing this case. SAML2 token is base64 encoded XML and shoul Discussed with Vihang. SAML Bearer token is generated by Impala code after the SAML auth flow is completed. Technically we could generate a JWT instead of our current implementation of the token at the end of SAML flow. Since Impala supports multiple auth mechanism in parallel, in theory we can have SAML and JWT configured simultaneously. So it's better to fall back to JWT verification after SAML verification fails. -- To view, visit http://gerrit.cloudera.org:8080/17435 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Gerrit-Change-Number: 17435 Gerrit-PatchSet: 8 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sat, 22 May 2021 01:01:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10613: (addendum) Fix test on S3 builds
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17483 ) Change subject: IMPALA-10613: (addendum) Fix test on S3 builds .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17483 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ac291529dc0661abdfc2d4f48924a2c4b807c40 Gerrit-Change-Number: 17483 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 22 May 2021 00:59:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10681: Improve inner join cardinality estimates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17387 ) Change subject: IMPALA-10681: Improve inner join cardinality estimates .. Patch Set 7: Code-Review+1 Carry Zoltan's +1 -- To view, visit http://gerrit.cloudera.org:8080/17387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc Gerrit-Change-Number: 17387 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 22 May 2021 00:57:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10681: Improve inner join cardinality estimates
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17387 ) Change subject: IMPALA-10681: Improve inner join cardinality estimates .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java File fe/src/main/java/org/apache/impala/planner/JoinNode.java: http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@430 PS3, Line 430: he NDVs on both sides t > I will try to refactor these two methods such that a common internal method I refactored both getGenericJoinCardinality() and getOtherJoinCardinality() by having them call a common internal utility method that computes the generic join cardinality. The difference between the 2 callers is the manner in which the parameters are supplied. -- To view, visit http://gerrit.cloudera.org:8080/17387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc Gerrit-Change-Number: 17387 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 22 May 2021 00:56:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10681: Improve inner join cardinality estimates
Hello Qifan Chen, Zoltan Borok-Nagy, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17387 to look at the new patch set (#7). Change subject: IMPALA-10681: Improve inner join cardinality estimates .. IMPALA-10681: Improve inner join cardinality estimates During cardinality estimation for inner joins, if the join conjunct involves a scan slot on left side and a function (e.g MAX) on the right, currently we determine that the NDV stats of either side is not useful and return the left side's cardinality even though it may be a significant over-estimate. In this patch, we handle join conjuncts of such types by keeping them in an 'other' eligible conjuncts list as long as the NDV for expressions on both sides of the join and the input row count is available. For example, in the following cases the NDV is available but was not being used for inner joins since the previous logic was only looking for scan slots: (a) int_col = MAX(int_col) and the right input does not have a group-by, so right NDV = 1 can be used. (b) if it has a group-by and the group-by columns already have associated NDV, the combined NDV is also available. Other such examples exist. An auxiliary struct is introduced to keep track of the ndv and row count. Once these 'other' eligible conjuncts are populated, we do the join cardinality estimation in a manner similar to the normal join conjuncts by fetching the stats from the auxiliary struct. Testing: - Added new planner tests for inner join cardinality - Modified expected plans for certains tests including TPC-DS queries and ran end-to-end TPC-DS queries - Since TPC-DS plans are complex, I did a check of the cardinality changes for some of the hash joins but not the changes in the shape of a plan (e.g whether the join order changed). - Preliminary tests with a TPC-DS 10 GB scale factor on a single node showed between 5-15% performance improvements for 4 of the 6 queries whose plans changed. Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/partition-key-scans-default.test M testdata/workloads/functional-planner/queries/PlannerTest/partition-key-scans.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/views.test 15 files changed, 3,719 insertions(+), 3,349 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/17387/7 -- To view, visit http://gerrit.cloudera.org:8080/17387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc Gerrit-Change-Number: 17387 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] [PROTOTYPE] IMPALA-6784: Upgrade gperftools to 2.8.1
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17485 Change subject: [PROTOTYPE] IMPALA-6784: Upgrade gperftools to 2.8.1 .. [PROTOTYPE] IMPALA-6784: Upgrade gperftools to 2.8.1 This upgrades gperftools to 2.8.1 to get various improvements over gperftools 2.5. One improvement is that PageHeap::AllocLarge() has been improved from O(n) complexity to O(log n). Testing: - Core job Change-Id: Ib0974b276db908c1ef733114656ba5276c302fdc --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/17485/1 -- To view, visit http://gerrit.cloudera.org:8080/17485 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib0974b276db908c1ef733114656ba5276c302fdc Gerrit-Change-Number: 17485 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR] IMPALA-10709: Min/max filters should be enabled for joins into sorted columns in Parquet tables
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17478 ) Change subject: IMPALA-10709: Min/max filters should be enabled for joins into sorted columns in Parquet tables .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/17478/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17478/7//COMMIT_MSG@7 PS7, Line 7: enabled for joins into sorted nit: maybe say 'enabled only for joins on sorted..' http://gerrit.cloudera.org:8080/#/c/17478/7//COMMIT_MSG@10 PS7, Line 10: addvantage nit: spelling http://gerrit.cloudera.org:8080/#/c/17478/7//COMMIT_MSG@10 PS7, Line 10: the nit: extra 'the' http://gerrit.cloudera.org:8080/#/c/17478/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/17478/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2776 PS7, Line 2776: sort_by_columns_ = For non-parquet tables or parquet tables that have not been created with the SORT BY clause, the getParameters().get("sort.columns") would return NULL, so suggest adding a null check for that. Also, getParameters() itself may return NULL so both checks would be needed. http://gerrit.cloudera.org:8080/#/c/17478/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/17478/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@750 PS7, Line 750: if (table != null && table instanceof HdfsTable) { I think this should also check for ((HdfsTable) table).isParquetTable() . A more general thought on this is when we talk about the sort by column, here we are limiting it to the case when the table itself was created with the SORT BY clause. As opposed to the ColumnOrder property contained in the Parquet footer: optional list column_orders; <-- from the FileMetadata portion of the footer So if an external software (e.g Spark) populated the above in the footer, we won't end up using it. When we talk about min-max filters for Parquet, we always think about the Parquet footer so this would be worth clarifying in the commit message. In the future we could be smarter about this e.g by looking at the footers of a small sample of files. -- To view, visit http://gerrit.cloudera.org:8080/17478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28c19c4b39b01ffa7d275fb245be85c28e9b2963 Gerrit-Change-Number: 17478 Gerrit-PatchSet: 7 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Fri, 21 May 2021 23:43:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8737: Switch to gperftool with O(log n) PageHeap::AllocLarge()
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17484 Change subject: IMPALA-8737: Switch to gperftool with O(log n) PageHeap::AllocLarge() .. IMPALA-8737: Switch to gperftool with O(log n) PageHeap::AllocLarge() In gperftools 2.5, PageHeap::AllocLarge() has O(n) behavior. This switches to a patched version of gperftools 2.5 that changes the behavior to O(log n). This corresponds to these commits: https://github.com/gperftools/gperftools/commit/06c9414ec423ffe442c047b2560555f9d5847b1d https://github.com/gperftools/gperftools/commit/f1d3fe4a21e339a3fd6e4592ee784a7b92dc Testing: - Core job Change-Id: I6377f7087111cf10ae35ce102beabcafb505579f --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/17484/1 -- To view, visit http://gerrit.cloudera.org:8080/17484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I6377f7087111cf10ae35ce102beabcafb505579f Gerrit-Change-Number: 17484 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR] IMPALA-10613: (addendum) Fix test on S3 builds
Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17483 Change subject: IMPALA-10613: (addendum) Fix test on S3 builds .. IMPALA-10613: (addendum) Fix test on S3 builds The test_metastore_service.py fails on S3 builds because it expects the filemetadata's object dictionary to be present. However, if the table is located on S3 then there are no file-blocks in the returned file-metadata and hence the length of obj_dict will be 0. This patch fixes the test in S3 builds by not asserting the check on S3, ADLS, GCS builds. Testing Done: 1. Ran the test locally to confirm it is working. 2. [WIP] Ran the test on a S3 environment where tables are created on S3. Change-Id: I6ac291529dc0661abdfc2d4f48924a2c4b807c40 --- M tests/custom_cluster/test_metastore_service.py 1 file changed, 5 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/17483/3 -- To view, visit http://gerrit.cloudera.org:8080/17483 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I6ac291529dc0661abdfc2d4f48924a2c4b807c40 Gerrit-Change-Number: 17483 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar
[native-toolchain-CR] IMPALA-8737: Patch gperftools to implement O(log n) searching for PageHeap::AllocLarge()
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17482 Change subject: IMPALA-8737: Patch gperftools to implement O(log n) searching for PageHeap::AllocLarge() .. IMPALA-8737: Patch gperftools to implement O(log n) searching for PageHeap::AllocLarge() In gperftools 2.5, PageHeap::AllocLarge() has O(n) behavior. This patches gperftool 2.5 to incorporate the patches that change PageHeap::AllocLarge() to have O(log n) searching: https://github.com/gperftools/gperftools/commit/06c9414ec423ffe442c047b2560555f9d5847b1d https://github.com/gperftools/gperftools/commit/f1d3fe4a21e339a3fd6e4592ee784a7b92dc This also adds a build for gperftools 2.8.1 to allow future development/testing for actually upgrading gperftools (IMPALA-6784). Testing: - Ran a toolchain build with this - Ran Impala locally with the resulting gperftools-2.5-p4 Change-Id: I7f41f2d296e26e160fd9bcc8be29085d1e9a8bc2 --- M buildall.sh A source/gperftools/gperftools-2.5-patches/0003-Implemented-O-log-n-searching-among-large-spans.patch A source/gperftools/gperftools-2.5-patches/0004-refactored-handling-of-reverse-span-set-iterator-for.patch A source/gperftools/gperftools-2.8.1-patches/0001-Port-gperftools-to-adapt-aarch64.patch 4 files changed, 873 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/82/17482/1 -- To view, visit http://gerrit.cloudera.org:8080/17482 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7f41f2d296e26e160fd9bcc8be29085d1e9a8bc2 Gerrit-Change-Number: 17482 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@545 PS6, Line 545: std::string res; > Original input string in StringToFloatInternal() is not null-terminated str Results: 3 Runs with 20 length string: With std::string Fetched 110 row(s) in 30.54s Fetched 110 row(s) in 30.68s Fetched 110 row(s) in 30.62s With VLA: Fetched 110 row(s) in 30.60s Fetched 110 row(s) in 30.64s Fetched 110 row(s) in 30.56s 3 Runs with 30 length string: - With std::string Fetched 110 row(s) in 33.85s Fetched 110 row(s) in 33.89s Fetched 110 row(s) in 34.19s With VLA: Fetched 110 row(s) in 33.78s Fetched 110 row(s) in 33.71s Fetched 110 row(s) in 33.57s Summary: I think we are fine with performance. In real system hopefully we don't see too many tcmalloc contention due to this. Couple of things will help us avoid malloc in current scenario: 1. Strings less than 16 length due to Small string optimisation. 2. Based on Qifan's suggestion, we can avoid creating copy of string for some cases where we have non-integer terminated string as original input. Other alternative (also suggested by Zoltan) is to do VLA for string of less than 256 bytes and above that we can use strtod. I am ok with either approach. -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 21 May 2021 21:37:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10709: Min/max filters should be enabled for joins into sorted columns in Parquet tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17478 ) Change subject: IMPALA-10709: Min/max filters should be enabled for joins into sorted columns in Parquet tables .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8771/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I28c19c4b39b01ffa7d275fb245be85c28e9b2963 Gerrit-Change-Number: 17478 Gerrit-PatchSet: 7 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Fri, 21 May 2021 21:07:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10709: Min/max filters should be enabled for joins into sorted columns in Parquet tables
Qifan Chen has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17478 ) Change subject: IMPALA-10709: Min/max filters should be enabled for joins into sorted columns in Parquet tables .. IMPALA-10709: Min/max filters should be enabled for joins into sorted columns in Parquet tables This patch enables min/max filters for equi-joins into sort by columns in a Parquet table by default. This is to take the addvantage of the min/max values being fully sorted in each data file for the table. When there are multiple sort by columns in the table, only the leading column will be assigned a min/max filter. The control knob is minmax_filter_sorted_columns, default to true. When query option minmax_filter_sorted_columns is true and query option minmax_filter_threshold is 0, the patch automatically assigns a reasonable value for the threshhold, and selects PAGE to be the filtering level as normally specified via option minmax_filtering_level. When minmax_filter_threshold is greater than 0, then no adjustment will be made to both options (minmax_filter_threshold and minmax_filtering_level). When minmax_filter_sorted_columns is set to false, no min/max filters will be specifically assigned to the leading sort by columns. Testing: 1). Added two new tests in overlap_min_max_filters.test to verify a) min/max filters are only created for leading sort by column; b) query option minmax_filter_sorted_columns works. 2). Core [TBD] 3). Performance [TBD] Change-Id: I28c19c4b39b01ffa7d275fb245be85c28e9b2963 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test 10 files changed, 125 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/17478/7 -- To view, visit http://gerrit.cloudera.org:8080/17478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I28c19c4b39b01ffa7d275fb245be85c28e9b2963 Gerrit-Change-Number: 17478 Gerrit-PatchSet: 7 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen
[Impala-ASF-CR] IMPALA-10489: Implement JWT support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17435 ) Change subject: IMPALA-10489: Implement JWT support .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8770/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17435 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Gerrit-Change-Number: 17435 Gerrit-PatchSet: 8 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 21 May 2021 20:03:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10489: Implement JWT support
Wenzhe Zhou has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17435 ) Change subject: IMPALA-10489: Implement JWT support .. IMPALA-10489: Implement JWT support This patch added JWT support with following functionality: * Load and parse JWKS from pre-installed JSON file. * Read the JWT token from the HTTP Header. * Verify the JWT's signature with puclic key in JWKS. * Get the username out of the payload of JWT token. We use third party library jwt-cpp to verify JWT token. jwt-cpp is a headers only C++ library. It was added to native-toolchain. This patch modified bootstrap_toolchian.py to download jwt-cpp from toolchain s3 bucket, and modified makefiles to add jwt-cpp/include in the include path. Added BE unit-tests for loading JWKS file and verifying JWT token. Also added FE custom cluster test for JWT authentication. Testing: - Passed core run. Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c --- M CMakeLists.txt M be/CMakeLists.txt M be/src/rpc/authentication.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M be/src/util/CMakeLists.txt A be/src/util/jwt-util-test.cc A be/src/util/jwt-util.cc A be/src/util/jwt-util.h M be/src/util/webserver.cc M be/src/util/webserver.h M bin/bootstrap_toolchain.py M bin/impala-config.sh M bin/rat_exclude_files.txt A cmake_modules/FindJwtCpp.cmake M common/thrift/generate_error_codes.py M common/thrift/metrics.json M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java A testdata/jwt/jwks_rs256.json 22 files changed, 1,494 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/17435/8 -- To view, visit http://gerrit.cloudera.org:8080/17435 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Gerrit-Change-Number: 17435 Gerrit-PatchSet: 8 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10489: Implement JWT support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17435 ) Change subject: IMPALA-10489: Implement JWT support .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8769/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17435 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Gerrit-Change-Number: 17435 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 21 May 2021 19:22:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10489: Implement JWT support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17435 ) Change subject: IMPALA-10489: Implement JWT support .. Patch Set 7: (3 comments) http://gerrit.cloudera.org:8080/#/c/17435/7/fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java File fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java: http://gerrit.cloudera.org:8080/#/c/17435/7/fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java@433 PS7, Line 433: "--jwt_token_auth=true --jwt_validate_signature=true --jwks_file_path=%s --jwt_allow_without_tls=true", line too long (111 > 90) http://gerrit.cloudera.org:8080/#/c/17435/7/fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java@494 PS7, Line 494: "--jwt_token_auth=true --jwt_validate_signature=false --jwt_allow_without_tls=true"); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/17435/7/fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java File fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java: http://gerrit.cloudera.org:8080/#/c/17435/7/fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java@289 PS7, Line 289: "--jwt_token_auth=true --jwt_validate_signature=true --jwks_file_path=%s --jwt_allow_without_tls=true", line too long (115 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17435 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Gerrit-Change-Number: 17435 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 21 May 2021 19:02:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10489: Implement JWT support
Wenzhe Zhou has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17435 ) Change subject: IMPALA-10489: Implement JWT support .. IMPALA-10489: Implement JWT support This patch added JWT support with following functionality: * Load and parse JWKS from pre-installed JSON file. * Read the JWT token from the HTTP Header. * Verify the JWT's signature with puclic key in JWKS. * Get the username out of the payload of JWT token. We use third party library jwt-cpp to verify JWT token. jwt-cpp is a headers only C++ library. It was added to native-toolchain. This patch modified bootstrap_toolchian.py to download jwt-cpp from toolchain s3 bucket, and modified makefiles to add jwt-cpp/include in the include path. Added BE unit-tests for loading JWKS file and verifying JWT token. Also added FE custom cluster test for JWT authentication. Testing: - Passed core run. Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c --- M CMakeLists.txt M be/CMakeLists.txt M be/src/rpc/authentication.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M be/src/util/CMakeLists.txt A be/src/util/jwt-util-test.cc A be/src/util/jwt-util.cc A be/src/util/jwt-util.h M be/src/util/webserver.cc M be/src/util/webserver.h M bin/bootstrap_toolchain.py M bin/impala-config.sh M bin/rat_exclude_files.txt A cmake_modules/FindJwtCpp.cmake M common/thrift/generate_error_codes.py M common/thrift/metrics.json M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java A testdata/jwt/jwks_rs256.json 22 files changed, 1,493 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/17435/7 -- To view, visit http://gerrit.cloudera.org:8080/17435 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Gerrit-Change-Number: 17435 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17295 ) Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early .. Patch Set 32: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8768/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183 Gerrit-Change-Number: 17295 Gerrit-PatchSet: 32 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 21 May 2021 18:55:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17295 ) Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early .. Patch Set 32: Here is the clean sql execution after the fix. Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 603d1a8fd0946b32e76807bb4c505171f87f7881) Query: drop table if exists min_max_filter_large_strings1 +-+ | summary | +-+ | Table has been dropped. | +-+ Fetched 1 row(s) in 4.94s Query: drop table if exists min_max_filter_large_strings2 +-+ | summary | +-+ | Table has been dropped. | +-+ Fetched 1 row(s) in 3.99s Query: create table min_max_filter_large_strings1 (string_col string primary key) stored as kudu +-+ | summary | +-+ | Table has been created. | +-+ WARNINGS: Unpartitioned Kudu tables are inefficient for large data sizes. Fetched 1 row(s) in 0.56s Query: insert into min_max_filter_large_strings1 values (''), ('bbbc'), (''), ('ff
[Impala-ASF-CR] IMPALA-10489: Implement JWT support
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17435 ) Change subject: IMPALA-10489: Implement JWT support .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/rpc/authentication.cc@135 PS6, Line 135: DEFINE_bool_hidden(saml2_allow_without_tls_debug_only, false, : "When this configuration is set to true, Impala allows SAML2 authentication " : "on unsecure channel. This should be only enabled for development / testing."); > I think we should have a similar parameter for JWTs, because JWTs are only Done http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/rpc/authentication.cc@155 PS6, Line 155: // Authorization type for JWT token based authorization in HTTP header, for example : // "Authorization: Bearer ", or “Authorization: Bearer;token: ”. : DEFINE_string( : jwt_auth_type, "Bearer", "Authorization type for JWT token based authorization"); : // JWT token identifier in HTTP header. for example “token=”. : DEFINE_string(jwt_token_identifier, "token", "JWT token identifier"); > We definitely want to support this standard authorization header: I saw following statement in design doc "SSO Token based authentication in CDW": "This token is then used in the JDBC URL with authentication type set to “auth=token” and “token=” (exact semantics are TBD Action: for hive/impala teams)". I was thinking we got to support different formats. Did a search, it's standard to set JWT auth as "Authorization: Bearer ". Removed jwt_auth_type and jwt_token_identifier. http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/rpc/authentication.cc@568 PS6, Line 568: // Create a cookie to return. : connection_context->return_headers.push_back( : Substitute("Set-Cookie: $0", GenerateCookie(username, hash))); > I'm wondering if we really need to exchange the JWT for a cookie. We defini Agree, it's common to JWT with each request. When discussed with Naveen last time, it's not clear if JWT will accompany each request. I will confirm with him. http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/transport/THttpServer.cpp File be/src/transport/THttpServer.cpp: http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/transport/THttpServer.cpp@170 PS6, Line 170: #if 0 : } else if (use_jwt_token_ : && THRIFT_strncasecmp(header, FLAGS_jwt_token_identifier.c_str(), sz) == 0 : && THRIFT_strncasecmp( : auth_value_.c_str(), FLAGS_jwt_auth_type.c_str(), auth_value_.length()) : == 0) { : // JWT token in HTTP header as “Authorization=Bearer;token=” : jwt_token_ = string(value); : #endif > If this code isn't needed, let's remove it. Done http://gerrit.cloudera.org:8080/#/c/17435/6/be/src/transport/THttpServer.cpp@286 PS6, Line 286: if (got_bearer_auth) { : fallback_to_other_auths = false; : // Final SAML message in browser mode, check bearer and replace it with a cookie. : DCHECK(wrapped_request_ != nullptr); : wrapped_request_->headers["Authorization"] = auth_value_; : if (callbacks_.validate_saml2_bearer_fn()) { : // During EE tests it makes things easier to return 401-Unauthorized here. : // This hack can be removed once there is a Python client that : // supports SAML (IMPALA-10496). : if (!FLAGS_saml2_ee_test_mode) authorized = true; : if (metrics_enabled_) http_metrics_->total_saml_auth_success_->Increment(1); : } > One complication is that both SAML and JWT use the "Authorization Bearer http://gerrit.cloudera.org:8080/17435 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c Gerrit-Change-Number: 17435 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Fri, 21 May 2021 18:38:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early
Qifan Chen has uploaded a new patch set (#32). ( http://gerrit.cloudera.org:8080/17295 ) Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early .. IMPALA-10650: Bailout min/max filters in hash join builder early This change set addresses the weakness in population min/max filters in the hash join builder by periodically measuring the usefulness of each filter and set the 'always_true_' flag accordingly. Once set to true, the insertion to such a filter completely skips the steps from the evaluation of the value from a row to the verification of the value in the min/max range. This optimization is LLVM-enabled. In addition, a new flag 'is_min_max_value_present' is added to TRuntimeFilterTargetDesc to indicate whether the min/max column stats is present in the query plan. The flag eliminates the need to check the presence of min/max stats for every row in back-end. Early bail out improves the HJ builder step in general. For example, the step for join node #11 in TPCDS Q8 improves 13%, and the step for join node #8 in TPCDS Q16 improves 3.2%. The Insert() methods are optimized with branch prediction compiler hints which yield the following improvement when tested with the insertion of 1 randomly generated items. Small Integers: 7.0% Integers: 4.1% Big Integers: 4.3% Strings:5.6% Dates: 4.4% Timestamps:10.7% Decimals(4): 10.4% Decimals(8):9.1% In addition, the min/max stats for pages are read in batches with a fast track version for column types of int32_t, int64_t, float, double and date that have identical storage format as Parquet. For a row group, the page locations are read only once, instead of once for every page skipped, resulting in 100x speedup when a subset of 199 pages are skipped. Testing: 1. Ran core test successfully; 2. Ran TPCDS performance tests. Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/filter-context.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/parquet/parquet-column-stats.inline.h M be/src/exec/parquet/parquet-common.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/runtime/runtime-filter-ir.cc M be/src/util/min-max-filter-ir.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/main/java/org/apache/impala/util/TColumnValueUtil.java 17 files changed, 984 insertions(+), 297 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17295/32 -- To view, visit http://gerrit.cloudera.org:8080/17295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183 Gerrit-Change-Number: 17295 Gerrit-PatchSet: 32 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17466 ) Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8767/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead Gerrit-Change-Number: 17466 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 21 May 2021 17:03:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17466 ) Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8766/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead Gerrit-Change-Number: 17466 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 21 May 2021 16:53:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17466 ) Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API .. Patch Set 4: PS4 is a rebase. -- To view, visit http://gerrit.cloudera.org:8080/17466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead Gerrit-Change-Number: 17466 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 21 May 2021 16:36:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API
Hello wangsheng, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17466 to look at the new patch set (#4). Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API .. IMPALA-10626: Add support for Iceberg's Catalogs API Iceberg recently switched to use its Catalogs class to define catalog and table properties. Catalog information is stored in a configuration file such as hive-site.xml. And the table properties contain information about which catalog is being used and what is the Iceberg table id. E.g. in the Hive conf we can have the following properties to define catalogs: iceberg.catalog..type = hadoop iceberg.catalog..warehouse = somelocation or iceberg.catalog..type = hive And at the table level we can have the following: iceberg.catalog = name = Table property 'iceberg.catalog' refers to a Catalog defined in the configuration file. This is in contradiction with Impala's current behavior where we are already using 'iceberg.catalog', and it can have the following values: * hive.catalog for HiveCatalog * hadoop.catalog for HadoopCatalog * hadoop.tables for HadoopTables To be backward-compatible and also support the new Catalogs properties Impala still recognizes the above special values. But, from now Impala doesn't define 'iceberg.catalog' by default. 'iceberg.catalog' being NULL means HiveCatalog for both Impala and Iceberg's Catalogs API, hence for Hive and Spark as well. If 'iceberg.catalog' has a different value than the special values it indicates that Iceberg's Catalogs API is being used, so Impala will try to look up the catalog configuration from the Hive config file. Testing: * added SHOW CREATE TABLE tests * added e2e tests that create/insert/drop Iceberg tables with Catalogs * manually tested interop behavior with Hive Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/resources/hive-site.xml.py A testdata/workloads/functional-query/queries/QueryTest/iceberg-catalogs.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/query_test/test_iceberg.py 15 files changed, 468 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17466/4 -- To view, visit http://gerrit.cloudera.org:8080/17466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead Gerrit-Change-Number: 17466 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17466 ) Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API .. Patch Set 2: (4 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/17466/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17466/2//COMMIT_MSG@12 PS2, Line 12: contiain > contain? Done http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@677 PS2, Line 677: throws AnalysisException > Exception never thrown in this method. Done http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java: http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java@58 PS2, Line 58: properties > Maybe we should alse add some comments for this new parameter. Done http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java: http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java@114 PS2, Line 114: properties > Unused variable. Done -- To view, visit http://gerrit.cloudera.org:8080/17466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead Gerrit-Change-Number: 17466 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 21 May 2021 16:33:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API
Hello wangsheng, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17466 to look at the new patch set (#3). Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API .. IMPALA-10626: Add support for Iceberg's Catalogs API Iceberg recently switched to use its Catalogs class to define catalog and table properties. Catalog information is stored in a configuration file such as hive-site.xml. And the table properties contain information about which catalog is being used and what is the Iceberg table id. E.g. in the Hive conf we can have the following properties to define catalogs: iceberg.catalog..type = hadoop iceberg.catalog..warehouse = somelocation or iceberg.catalog..type = hive And at the table level we can have the following: iceberg.catalog = name = Table property 'iceberg.catalog' refers to a Catalog defined in the configuration file. This is in contradiction with Impala's current behavior where we are already using 'iceberg.catalog', and it can have the following values: * hive.catalog for HiveCatalog * hadoop.catalog for HadoopCatalog * hadoop.tables for HadoopTables To be backward-compatible and also support the new Catalogs properties Impala still recognizes the above special values. But, from now Impala doesn't define 'iceberg.catalog' by default. 'iceberg.catalog' being NULL means HiveCatalog for both Impala and Iceberg's Catalogs API, hence for Hive and Spark as well. If 'iceberg.catalog' has a different value than the special values it indicates that Iceberg's Catalogs API is being used, so Impala will try to look up the catalog configuration from the Hive config file. Testing: * added SHOW CREATE TABLE tests * added e2e tests that create/insert/drop Iceberg tables with Catalogs * manually tested interop behavior with Hive Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/resources/hive-site.xml.py A testdata/workloads/functional-query/queries/QueryTest/iceberg-catalogs.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/query_test/test_iceberg.py 15 files changed, 468 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17466/3 -- To view, visit http://gerrit.cloudera.org:8080/17466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead Gerrit-Change-Number: 17466 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@545 PS6, Line 545: std::string res; > Stack overflow is sensitive to the depth of the stack and the size of the c Original input string in StringToFloatInternal() is not null-terminated string and was figured out during testing. That is also one of the reason for the copying things into null-terminated string. I am checking perf penalty with a million casts requiring heap allocation. if we see perf issues, we can use VLA for shorter strings and heap for longer ones. -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 21 May 2021 16:02:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@545 PS6, Line 545: std::string res; > Yeah, I think it'd be good to see measurements for longer float literals. W Stack overflow is sensitive to the depth of the stack and the size of the current frame. Assume in most cases, the input string conforms to the format that the fast double parser library can handle properly, then we may not need to build the string object at all. Thus, we can use the following form of signature: inline bool stringToFloatPreprocess(char*, int offset, int len, string* preprocessed_result); We return string (in preprocessed_result) only when the input does not conform. In good case, we do not construct a string object and we can feed the original input string in StringToFloatInternal() to the fast parser library, if the input string is null-terminated. In seems this is the case as the only use of the StringToFloatInternal() is called indirectly here throughout Impala BE. 343 static bool ParseProbability(const string& prob_str, bool* should_execute) { 344 StringParser::ParseResult parse_result; 345 double probability = StringParser::StringToFloat( 346 prob_str.c_str(), prob_str.size(), &parse_result); 347 if (parse_result != StringParser::PARSE_SUCCESS || 348 probability < 0.0 || probability > 1.0) { 349 return false; 350 } 351 // +1L ensures probability of 0.0 and 1.0 work as expected. 352 *should_execute = rand() < probability * (RAND_MAX + 1L); 353 return true; 354 } -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 21 May 2021 15:27:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@545 PS6, Line 545: std::string res; > I had checked Small String Optimization and had read till 24 bytes it would Yeah, I think it'd be good to see measurements for longer float literals. We are using TCMalloc in Impala which should be fairly performant so it's possible that the perf is not too bad. We can also have an 'if' in the code. If strlen > 1 KiB use strtod(), otherwise use fast_double_parser with a Variable-length array, or a fixed-sized 1 KiB array? -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 21 May 2021 11:24:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@545 PS6, Line 545: std::string res; > I wonder if the performance numbers are still the same, given that now we n I had checked Small String Optimization and had read till 24 bytes it would not do allocation which appeared reasonable. But i rechecked after your comment and 24 bytes seem to be only for clang. I wrote a small program to check for our g++ and it avoids allocation till 15 bytes only. Program: #include #include #include // replace operator new and delete to log allocations void* operator new(std::size_t n) { std::cout << "[Allocating " << n << " bytes]"; return malloc(n); } void operator delete(void* p) throw() { free(p); } int main() { for (size_t i = 0; i < 30; ++i) { std::cout << i << ": " << std::string(i, '=') << std::endl; } } is 15 good enough length ? otherwise dynamic allocation will make things slow. My assumption is input won't be long enough to cause stack overflow. I am not sure if we have some limitation on the string being passed to StringToFloat. -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 21 May 2021 10:49:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 6: (2 comments) Thanks for the changes! http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@545 PS6, Line 545: std::string res; I wonder if the performance numbers are still the same, given that now we need to dynamically allocate memory from the heap. (Though for short strings the standard library implementation being used avoids allocation by using Small String Optimization) http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@552 PS6, Line 552: // string has `move` defined, which would be used instead of copy. nit: I don't think we need this comment. And I think the situation is even better as in this case Named Return Value Optimization kicks in which even avoids moving: https://en.cppreference.com/w/cpp/language/copy_elision -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 21 May 2021 09:29:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17295 ) Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early .. Patch Set 31: Seems like TestMinMaxFilters.test_large_strings timed out: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/13947/testReport/ https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/4238/testReport/ -- To view, visit http://gerrit.cloudera.org:8080/17295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183 Gerrit-Change-Number: 17295 Gerrit-PatchSet: 31 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 21 May 2021 08:59:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/17466 ) Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API .. Patch Set 2: (4 comments) Thanks for new feature! http://gerrit.cloudera.org:8080/#/c/17466/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17466/2//COMMIT_MSG@12 PS2, Line 12: contiain contain? http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@677 PS2, Line 677: throws AnalysisException Exception never thrown in this method. http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java: http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java@58 PS2, Line 58: properties Maybe we should alse add some comments for this new parameter. http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java: http://gerrit.cloudera.org:8080/#/c/17466/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java@114 PS2, Line 114: properties Unused variable. -- To view, visit http://gerrit.cloudera.org:8080/17466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead Gerrit-Change-Number: 17466 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 21 May 2021 07:35:21 + Gerrit-HasComments: Yes