[Impala-ASF-CR] IMPALA-10389: impala-profile-tool container
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17015 ) Change subject: IMPALA-10389: impala-profile-tool container .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8061/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36915cd686ab930dcc934bc0c81bff8c16d46714 Gerrit-Change-Number: 17015 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 02 Feb 2021 01:32:25 + Gerrit-HasComments: No
[native-toolchain-CR] [config] bump toolchain build id for Kudu 1.14
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17016 Change subject: [config] bump toolchain build id for Kudu 1.14 .. [config] bump toolchain build id for Kudu 1.14 The motivation for this version patch is two-fold: * Update the version of Kudu client to reflect the recently released Kudu 1.14 (see https://kudu.apache.org/releases/1.14.0/) * Be able to pick up https://gerrit.cloudera.org/#/c/16705 change (control of Kudu client connection negotiation timeout for impalad) Change-Id: I5e75ca996670a7abf161f0c5e7751031391fd959 --- M buildall.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/16/17016/1 -- To view, visit http://gerrit.cloudera.org:8080/17016 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I5e75ca996670a7abf161f0c5e7751031391fd959 Gerrit-Change-Number: 17016 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] [config] bump toolchain build id for Kudu 1.14
Alexey Serbin has abandoned this change. ( http://gerrit.cloudera.org:8080/17014 ) Change subject: [config] bump toolchain build id for Kudu 1.14 .. Abandoned It seems this change should be done automatically. I posted corresponding change for the native-toolchain project: https://gerrit.cloudera.org/#/c/17016/ -- To view, visit http://gerrit.cloudera.org:8080/17014 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439 Gerrit-Change-Number: 17014 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10389: impala-profile-tool container
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17015 Change subject: IMPALA-10389: impala-profile-tool container .. IMPALA-10389: impala-profile-tool container Add a build step for an impala-profile-tool docker image that makes it easy to run the binary on any system. This container is automatically built as part of the docker build. This sets up a new build context that doesn't pull in all of the same dependencies or depend on the Java build Testing: cat logs/cluster/profiles/* | \ docker run -i impala_profile_tool I uploaded a build of the container to dockerhub too: timgarmstrong/impala_profile_tool Change-Id: I36915cd686ab930dcc934bc0c81bff8c16d46714 --- M docker/CMakeLists.txt A docker/impala_profile_tool/Dockerfile M docker/setup_build_context.py A docker/utility_entrypoint.sh 4 files changed, 200 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/17015/1 -- To view, visit http://gerrit.cloudera.org:8080/17015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I36915cd686ab930dcc934bc0c81bff8c16d46714 Gerrit-Change-Number: 17015 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] [config] bump toolchain build id for Kudu 1.14
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17014 ) Change subject: [config] bump toolchain build id for Kudu 1.14 .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/8060/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/17014 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439 Gerrit-Change-Number: 17014 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 02 Feb 2021 01:02:20 + Gerrit-HasComments: No
[Impala-ASF-CR] [config] bump toolchain build id for Kudu 1.14
Hello Thomas Tauber-Marshall, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17014 to look at the new patch set (#2). Change subject: [config] bump toolchain build id for Kudu 1.14 .. [config] bump toolchain build id for Kudu 1.14 The motivation for this version patch is two-fold: * Update the version of Kudu client to reflect the recently released Kudu 1.14 (see https://kudu.apache.org/releases/1.14.0/) * Be able to pick up https://gerrit.cloudera.org/#/c/16705 change (control of Kudu client connection negotiation timeout for impalad) Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439 --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/17014/2 -- To view, visit http://gerrit.cloudera.org:8080/17014 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439 Gerrit-Change-Number: 17014 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] [config] bump toolchain build id for Kudu 1.14
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17014 Change subject: [config] bump toolchain build id for Kudu 1.14 .. [config] bump toolchain build id for Kudu 1.14 The motivation for this version patch is two-fold: * Update the version of Kudu client to reflect the recently released Kudu 1.14 (see https://kudu.apache.org/releases/1.14.0/) * Be able to pick up https://gerrit.cloudera.org/#/c/16705 change (control of Kudu client connection negotiation timeout for impalad) Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439 --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/17014/1 -- To view, visit http://gerrit.cloudera.org:8080/17014 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439 Gerrit-Change-Number: 17014 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9979: part 2: partitioned top-n
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16242 ) Change subject: IMPALA-9979: part 2: partitioned top-n .. Patch Set 28: (7 comments) http://gerrit.cloudera.org:8080/#/c/16242/28//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16242/28//COMMIT_MSG@59 PS28, Line 59: and the tie-handling : semantics required by rank() predicates nit: I think this was really implemented in your previous patch? http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.h File be/src/exec/topn-node.h: http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.h@64 PS28, Line 64: int64_t limit = is_partitioned() ? per_partition_limit() : What's the relationship between 'include_ties' and 'is_partitioned', i.e. why does 'include_ties' here matter for the unpartitioned case but not the partitioned case? http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc File be/src/exec/topn-node.cc: http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@244 PS28, Line 244: U typo http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@399 PS28, Line 399: RETURN_IF_ERROR(QueryMaintenance(state)); This results in two calls to QueryMaintenance() in quick succession, here and in GetNext(), might be better to avoid that http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@566 PS28, Line 566: be typo http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@666 PS28, Line 666: vector> rematerialized_heaps; : for (auto& entry : partition_heaps_) { : RETURN_IF_ERROR(entry.second->RematerializeTuples(this, state, temp_pool.get())); : DCHECK(entry.second->DCheckConsistency()); : // The key references memory in 'tuple_pool_'. Replace it with a rematerialized tuple. 
: rematerialized_heaps.push_back(move(entry.second)); : } : partition_heaps_.clear(); : for (auto& heap_ptr : rematerialized_heaps) { : const Tuple* key_tuple = heap_ptr->top(); : partition_heaps_.emplace(key_tuple, move(heap_ptr)); : } I think this can be put in an 'else' with the above 'if (heap_ != nullptr)' to make the partitioned vs. unpartitioned handling clearer http://gerrit.cloudera.org:8080/#/c/16242/28/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/16242/28/common/thrift/ImpalaService.thrift@625 PS28, Line 625: // If > 0, the rank()/row_number() pushdown into pre-analytic sorts is enabled Maybe note the default value, and briefly the issues with setting it higher. -- To view, visit http://gerrit.cloudera.org:8080/16242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5 Gerrit-Change-Number: 16242 Gerrit-PatchSet: 28 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 02 Feb 2021 00:25:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10161: User LDAP Search bind support
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16717 ) Change subject: IMPALA-10161: User LDAP Search bind support .. Patch Set 7: (10 comments) http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc File be/src/util/ldap-search-bind.cc: http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@46 PS7, Line 46: std::string Here and at other places: are you intentionally not using "using namespace std" or "using std::string"? Is there some kind of ambiguity in that case? http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@53 PS7, Line 53: Status ldapBaseValidateStatus = ImpalaLdap::ValidateFlags(); : if (!ldapBaseValidateStatus.ok()) return ldapBaseValidateStatus; We generally use the RETURN_IF_ERROR macro for this pattern. http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@71 PS7, Line 71: Status ldapBaseInitStatus = ImpalaLdap::Init(user_filter, group_filter); : if (!ldapBaseInitStatus.ok()) return ldapBaseInitStatus; RETURN_IF_ERROR http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@100 PS7, Line 100: std::string not needed, user_filter_ is already a string http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@137 PS7, Line 137: group_filter_ I think it would be a bit clearer to call find on group_filter instead of group_filter_. The result should be the same.
http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@142 PS7, Line 142: if (user_dn.empty()) return false; ldap_unbind_ext is not called if we return here http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-simple-bind.cc File be/src/util/ldap-simple-bind.cc: http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-simple-bind.cc@56 PS7, Line 56: Status ldapBaseValidateStatus = ImpalaLdap::ValidateFlags(); : if (!ldapBaseValidateStatus.ok()) return ldapBaseValidateStatus; RETURN_IF_ERROR http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-simple-bind.cc@80 PS7, Line 80: Status ldapBaseInitStatus = ImpalaLdap::Init(user_filter, group_filter); : if (!ldapBaseInitStatus.ok()) return ldapBaseInitStatus; RETURN_IF_ERROR http://gerrit.cloudera.org:8080/#/c/16717/7/fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java File fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java: http://gerrit.cloudera.org:8080/#/c/16717/7/fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java@46 PS7, Line 46: LdapSearchBindImpalaShellTest I think that a lot of duplication could be potentially avoided with LdapSimpleBindImpalaShellTest, e.g. by creating a common base class. If you agree, even if you don't want to deal with this in the current review a followup jira could be created. http://gerrit.cloudera.org:8080/#/c/16717/7/fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java@291 PS7, Line 291: testLdapSearchBind Can you make the name more descriptive? 
The whole file seems to be about testLdapSearchBind -- To view, visit http://gerrit.cloudera.org:8080/16717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I978744ad05d9ef408328d1e4dd2d18c329f4d3b7 Gerrit-Change-Number: 16717 Gerrit-PatchSet: 7 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 01 Feb 2021 23:21:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17012 ) Change subject: IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8059/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17012 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idf4acfb54cf62a3f3b2e8db9d04044580151299c Gerrit-Change-Number: 17012 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 01 Feb 2021 17:06:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9867: Add Support for Spilling to S3: Milestone 1
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16318 ) Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 .. Patch Set 33: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8058/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89 Gerrit-Change-Number: 16318 Gerrit-PatchSet: 33 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Mon, 01 Feb 2021 17:05:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16720 ) Change subject: IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate .. Patch Set 59: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/8057/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 Gerrit-Change-Number: 16720 Gerrit-PatchSet: 59 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 01 Feb 2021 16:57:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16993 ) Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg manifests .. IMPALA-10460: Impala should write normalized paths in Iceberg manifests Currently Impala writes double slashes in the paths of datafiles for non-partitioned Iceberg tables. Unnormalized paths can cause problems later. This patch removes the redundant slashes. Testing: * Tested manually by inspecting the manifest files of the Iceberg tables. Used both non-partitioned and partitioned tables. Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 Reviewed-on: http://gerrit.cloudera.org:8080/16993 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exec/hdfs-table-sink.cc 1 file changed, 8 insertions(+), 3 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16993 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 Gerrit-Change-Number: 16993 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
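The kind of normalization this change describes can be illustrated with a minimal sketch. The helper name normalize_path and the sample path are hypothetical, and this is not the code from be/src/exec/hdfs-table-sink.cc; it only shows collapsing redundant slashes while leaving the "://" after a URI scheme intact.

```python
import re


def normalize_path(path: str) -> str:
    """Collapse runs of '/' in the path part, keeping the '://' after a scheme."""
    scheme, sep, rest = path.partition("://")
    if not sep:
        # No URI scheme present: treat the whole string as a plain path.
        return re.sub(r"/{2,}", "/", path)
    # Only the part after the scheme separator is normalized.
    return scheme + sep + re.sub(r"/{2,}", "/", rest)


# Hypothetical data file path with a doubled slash before the file name.
print(normalize_path("hdfs://nn:8020/warehouse/ice_tbl/data//file.parq"))
```

Splitting off the scheme first avoids turning "hdfs://" into "hdfs:/", which a naive global replace would do.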
[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16993 ) Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg manifests .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16993 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 Gerrit-Change-Number: 16993 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 01 Feb 2021 16:55:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17012 Change subject: IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables .. IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables This patch adds support for INSERT OVERWRITE statements for Iceberg tables. We use Iceberg's ReplacePartitions interface for this. This interface provides consistent behavior with INSERT OVERWRITEs against regular tables. It's also consistent with other engines dynamic overwrites, e.g. Spark. INSERT OVERWRITE for partitioned tables replaces the partitions affected by the INSERT, while keeping the other partitions untouched. INSERT OVERWRITE is prohibited for tables that use the BUCKET partition transform because it would randomly overwrite table data. Testing * added e2e test Change-Id: Idf4acfb54cf62a3f3b2e8db9d04044580151299c --- M be/src/service/client-request-state.cc M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test A testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test M tests/query_test/test_iceberg.py 7 files changed, 245 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17012/1 -- To view, visit http://gerrit.cloudera.org:8080/17012 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Idf4acfb54cf62a3f3b2e8db9d04044580151299c Gerrit-Change-Number: 17012 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy
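The partition-replacement semantics described in this commit message can be modelled with a small sketch. It is an in-memory illustration only: the function insert_overwrite, the dict-based table, and the "month" column are assumptions made for the example and do not reflect Iceberg's ReplacePartitions interface or Impala's implementation.

```python
from collections import defaultdict


def insert_overwrite(table, new_rows, partition_key):
    """Return a new table in which every partition touched by new_rows is replaced.

    'table' maps a partition value to its list of rows (a toy model of a
    partitioned table). Partitions not present in new_rows are kept untouched.
    """
    incoming = defaultdict(list)
    for row in new_rows:
        incoming[row[partition_key]].append(row)

    result = {part: list(rows) for part, rows in table.items()}
    for part, rows in incoming.items():
        # Affected partitions are replaced wholesale, not appended to.
        result[part] = rows
    return result


table = {
    "2021-01": [{"month": "2021-01", "v": 1}, {"month": "2021-01", "v": 2}],
    "2021-02": [{"month": "2021-02", "v": 3}],
}
updated = insert_overwrite(table, [{"month": "2021-01", "v": 9}], "month")
print(updated)
```

In this model the set of replaced partitions is determined entirely by the partition values of the inserted rows, which is why a hash-based transform such as BUCKET would make the overwritten data hard to predict.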
[Impala-ASF-CR] IMPALA-9867: Add Support for Spilling to S3: Milestone 1
Yida Wu has uploaded a new patch set (#33). ( http://gerrit.cloudera.org:8080/16318 ) Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 .. IMPALA-9867: Add Support for Spilling to S3: Milestone 1 Major Features 1) Local files as buffers for spilling to S3. 2) Async Upload for remote files. 3) Sync remote files deletion after query ends. 4) Local buffer files management. 5) Compatibility of spilling to local and remote. 6) All the errors from hdfs/s3 should terminate the query. Changes on TmpFile: * TmpFile is separated into two types of implementation, TmpFileLocal and TmpFileRemote. TmpFileLocal is used for Spilling to local file system. TmpFileRemote is a new type for Spilling to the remote. It contains two DiskFiles, one for local buffer, the other for the remote file. * The DiskFile is an object that contains the information of a physical file for passing to the DiskIOMgr to execute the IO operations on that specific file. The DiskFile also contains status information of the file, including DiskFileStatus::INWRITING/PERSISTED/DELETED. When the DiskFile is initialized, it is in INWRITING status. If the file is persisted into the file system, it would become PERSISTED status. If the file is deleted, for example when the local buffer is evicted, the DiskFile status of the buffer file would become DELETED. After that, if the file is fetched from the remote, the DiskFile status of the buffer file would become INWRITING, and then PERSISTED if the fetching finishes successfully. Implementation Details: 1) A new enum type is added to specify the disk type of files, indicating where the file is physically located. The types include DiskFileType::LOCAL/LOCAL_BUFFER/DFS/S3. DiskFileType::LOCAL indicates the file is in the local file system. DiskFileType::LOCAL_BUFFER indicates the file is in the local file system, and it is the buffer of a remote scratch file. DiskFileType::DFS/S3 indicates the file is in the HDFS/S3.
The local buffer allows the buffer pool to pin(read), but mainly for remote files, buffer pool would pin(read) the page from the remote file system. 2) Two disk queues have been added to do the file operation jobs. Queue name: RemoteS3DiskFileOper/RemoteDfsDiskFileOper. File operations on the remote disk like upload and fetch should be done in these queues. The purpose of the queues is to isolate the file operations from normal read/write IO operations in different queues. It could increase the efficiency of the file operations by not being interrupted during a relatively long execution time, and also provide more accurate control over the number of threads working on file operation jobs. RemoteOperRange is the new type to carry the file operation jobs. Previously, we had request types of READ and WRITE. Now FILE_FETCH/FILE_UPLOAD are added. 3) The tmp files are physically deleted when the tmp file group is destructed. For remote files, the entire directory would be deleted. 4) The local buffer files management is to control the total size of local buffer files and evict files if needed. A local buffer file can be evicted if the temporary file has uploaded a copy to the remote disk or the query ends. There are two modes to decide the sequence of choosing files to be evicted first. Default is LIFO, the other is FIFO. It can be controlled by startup option remote_tmp_files_avail_pool_lifo. Also, a thread TmpFileSpaceReserveThreadLoop in TmpFileMgr runs to reserve buffer file space asynchronously, to avoid deadlocks. Startup option allow_spill_to_hdfs is added. By default the HDFS path is not allowed, but for testcases, the option can be set true to allow the use of HDFS path as scratch space for testing only. 5) Spilling to local has higher priority than spilling to remote. If no local scratch space is available, temporary data will be spilled to remote.
The first available local directory is used for the local buffer for spilling to remote if any remote directory is configured. If remote directory is configured without any available local scratch space, an error will be returned during initialization. The purpose of the design is to simplify the implementation in milestone 1 with fewer changes to the configuration. Example (setting remote scratch space): Assume that the directories we have for scratch space: * Local dir: /tmp/local_buffer, /tmp/local, /tmp/local_sec * Remote dir: s3a://tmp/remote The scratch space path is configured in the startup options, and could have three types of configurations: 1. Pure local scratch space --scratch_dirs="/tmp/local" 2. Pure remote scratch space --scratch_dirs="s3a://tmp/remote,/tmp/local_buffer:16GB" 3. Mixed local and remote scratch space --scratch_dirs="s3a://tmp/romote:200GB,/tmp/local_buffer:1GB,
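The directory-selection rule in the commit message above (the first local directory becomes the local buffer when a remote directory is configured, and a remote directory without any local scratch space is an error) can be sketched as follows. This is a hedged illustration, not Impala's parser: parse_scratch_dirs, plan_scratch_space, the set of size suffixes, and the URI prefixes used to recognise remote paths are all assumptions, and the sketch does not check whether a local directory is actually available.

```python
import re

# Assumption: limits look like "16GB" or "200GB"; the other suffixes are guesses.
SIZE_RE = re.compile(r"^\d+(?:B|KB|MB|GB|TB)$", re.IGNORECASE)
# Assumption: remote scratch paths are recognised by these URI prefixes.
REMOTE_PREFIXES = ("s3a://", "hdfs://")


def parse_scratch_dirs(value):
    """Split a comma-separated value into (path, limit, is_remote) tuples."""
    entries = []
    for item in (s.strip() for s in value.split(",")):
        if not item:
            continue
        path, limit = item, None
        head, sep, tail = item.rpartition(":")
        # Treat the suffix as a limit only if it looks like a size, so the
        # colon in "s3a://" is never mistaken for the limit separator.
        if sep and SIZE_RE.match(tail):
            path, limit = head, tail
        entries.append((path, limit, path.startswith(REMOTE_PREFIXES)))
    return entries


def plan_scratch_space(entries):
    """Pick the local buffer directory according to the rule described above."""
    remote = [e for e in entries if e[2]]
    local = [e for e in entries if not e[2]]
    if not remote:
        return {"local_buffer": None, "local": local, "remote": []}
    if not local:
        raise ValueError("remote scratch dir configured without local scratch space")
    # The first local directory listed serves as the buffer for remote spilling.
    return {"local_buffer": local[0], "local": local[1:], "remote": remote}


plan = plan_scratch_space(
    parse_scratch_dirs("s3a://tmp/remote,/tmp/local_buffer:16GB"))
print(plan["local_buffer"])
```

The example input is the "pure remote" configuration quoted in the message; with it, no ordinary local scratch directory remains after the buffer has been chosen.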
[Impala-ASF-CR] IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate
Qifan Chen has uploaded a new patch set (#59). ( http://gerrit.cloudera.org:8080/16720 ) Change subject: IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate .. IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate This patch adds a new class of predicates called overlap predicates to aid in the determination of whether a Parquet row group or a page overlaps with a range computed from an equi hash join. If not, then the entire row group or page is skipped. When a row survives this way, it can be subjected to the row-level overlapping test against the same overlap predicate. For the following query, the min and max in the overlap predicate are computed with the values from the join column from table 'b'. To evaluate the overlap predicate, these two values are compared against the min/max of each row group or page at the scan node for 'a'. select straight_join count(*) from lineitem a join [SHUFFLE] lineitem b where a.l_shipdate = b.l_receiptdate and b.l_commitdate = "1992-01-31"; An overlap predicate associated with the column type J (in hash table) and scan column type S will be formed when one of the following is true: both J and S are booleans; both J and S are integers (tinyint, smallint, int, or bigint); both J and S are approximate numeric (float or double); both J and S are decimals with the same precision and scale; both J and S are strings (STRING, CHAR or VARCHAR); both J and S are date; both J and S are timestamp. The overlap predicate is implemented as a min/max filter. Unlike existing min/max filters, MAX_NUM_RUNTIME_FILTERS query option does not apply to min/max filters created for overlap predicates. An overlap predicate will be evaluated as long as the overlap ratio is less than a threshold specified in a new query option 'minmax_filter_threshold'.
Setting the threshold to its minimal value 0.0 disables the feature, and setting it to the maximal value 1.0 applies the filtering in all cases. A second query option, disable_row_minmax_filtering, can be used to disable row level filtering with overlap predicates. In addition, two new run-time profile counters are added to report the number of row groups or pages filtered out via the overlap predicates respectively: 1. NumRuntimeFilteredRowGroups 2. NumRuntimeFilteredPages Testing: 1. Unit tested on various column types with TPCH and TPCDS tables. Benefits were significant when the join column on the outer table is sorted and there exist many row groups or pages not overlapping with the implementing min/max filters; 2. Added the following new tests: a. in min_max_filters.test to demonstrate the number of filtered out pages and row groups with the two new profile counters; b. in runtime-filter-propagation.test to demonstrate that the overlap predicates work with different column types; c. data type specific overlap method tests in min-max-filter-test.cc; 3. Core testing; 4. Performance measurement. To do in follow-up JIRAs: 1. Improve filtering efficiency; 2. Apply the overlap predicate on partition columns; 3. IR code-gen for various MinMaxFilter::EvalOverlap methods.
Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691 --- M be/src/exec/exec-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner-ir.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/date-value.cc M be/src/runtime/date-value.h M be/src/runtime/raw-value.h M be/src/runtime/runtime-filter-ir.cc M be/src/runtime/string-value-test.cc M be/src/runtime/string-value.cc M be/src/runtime/string-value.h M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/debug-util.cc M be/src/util/debug-util.h M be/src/util/min-max-filter-ir.cc M be/src/util/min-max-filter-test.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/Predicate.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.te
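The core test behind the overlap predicate described in this commit message, comparing a filter's min/max against the min/max statistics of each row group or page, can be sketched as a plain interval-overlap check. The function names and the integer sample statistics are hypothetical; the sketch does not model the overlap ratio, the minmax_filter_threshold option, or the MinMaxFilter::EvalOverlap methods.

```python
def overlaps(stat_min, stat_max, filter_min, filter_max):
    """True if [stat_min, stat_max] and [filter_min, filter_max] share a value."""
    return stat_min <= filter_max and filter_min <= stat_max


def surviving_row_groups(row_group_stats, filter_min, filter_max):
    """Return indexes of row groups whose min/max range overlaps the filter range.

    Row groups that do not overlap cannot contain a joining value and can be
    skipped entirely.
    """
    return [
        i
        for i, (lo, hi) in enumerate(row_group_stats)
        if overlaps(lo, hi, filter_min, filter_max)
    ]


# Hypothetical per-row-group (min, max) statistics for a sorted scan column.
stats = [(1, 10), (11, 20), (21, 30)]
print(surviving_row_groups(stats, 12, 18))
```

With sorted data the ranges are nearly disjoint, so a narrow filter range eliminates most row groups, which matches the observation in the testing notes that benefits were significant when the join column on the outer table is sorted.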
[Impala-ASF-CR] IMPALA-9588: Add extra logging to cancel tests
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/16985 ) Change subject: IMPALA-9588: Add extra logging to cancel tests .. Patch Set 1: (2 comments) Hi Gabor, found 2 nits, aside from those LGTM! http://gerrit.cloudera.org:8080/#/c/16985/1/tests/util/cancel_util.py File tests/util/cancel_util.py: http://gerrit.cloudera.org:8080/#/c/16985/1/tests/util/cancel_util.py@42 PS1, Line 42: occured nit: occurred http://gerrit.cloudera.org:8080/#/c/16985/1/tests/util/cancel_util.py@48 PS1, Line 48: "\n" nit: this could just go after the previous line, like error_msg += str(thread.fetch_results_error) + "\n" -- To view, visit http://gerrit.cloudera.org:8080/16985 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ied7100a9ea2e2f0611cf8e328e589b4c8e5d5100 Gerrit-Change-Number: 16985 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 01 Feb 2021 13:09:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10161: User LDAP Search bind support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16717 ) Change subject: IMPALA-10161: User LDAP Search bind support .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8056/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I978744ad05d9ef408328d1e4dd2d18c329f4d3b7 Gerrit-Change-Number: 16717 Gerrit-PatchSet: 7 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 01 Feb 2021 11:48:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16993 ) Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg manifests .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8055/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16993 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 Gerrit-Change-Number: 16993 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 01 Feb 2021 11:34:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10161: User LDAP Search bind support
Tamas Mate has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/16717 ) Change subject: IMPALA-10161: User LDAP Search bind support .. IMPALA-10161: User LDAP Search bind support This change adds user search bind support, configurable with LDAP filters, alongside simple bind. The group check was already done with an LDAP search; this change makes it configurable with options similar to those of the Hadoop library, that is, an LDAP filter with optional patterns. The '{0}' pattern will be replaced with the user name, while the '{1}' pattern will be replaced with the user dn. The following new flags have been added: --ldap_search_bind: a flag to change between simple and search bind --ldap_user_search_basedn: the base dn for the LDAP subtree to search --ldap_group_search_basedn: the base dn for the LDAP subtree to search Tested: - Custom cluster tests have been added Change-Id: I978744ad05d9ef408328d1e4dd2d18c329f4d3b7 --- M be/src/rpc/authentication.cc M be/src/util/CMakeLists.txt A be/src/util/ldap-search-bind.cc A be/src/util/ldap-search-bind.h A be/src/util/ldap-simple-bind.cc A be/src/util/ldap-simple-bind.h M be/src/util/ldap-util.cc M be/src/util/ldap-util.h M be/src/util/webserver.cc M be/src/util/webserver.h C fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java R fe/src/test/java/org/apache/impala/customcluster/LdapSimpleBindImpalaShellTest.java M fe/src/test/java/org/apache/impala/testutil/LdapUtil.java M fe/src/test/resources/users.ldif 14 files changed, 702 insertions(+), 298 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/16717/7 -- To view, visit http://gerrit.cloudera.org:8080/16717 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I978744ad05d9ef408328d1e4dd2d18c329f4d3b7 Gerrit-Change-Number: 16717 Gerrit-PatchSet: 7 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall
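The '{0}' / '{1}' substitution described in the commit message above might look roughly like the sketch below. This is an illustration only, not the C++ code in be/src/util/ldap-search-bind.cc: the function names, the example filter strings, and the decision to escape values per RFC 4515 are all assumptions made for the example.

```python
import re

# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\0": r"\00",
}


def escape_filter_value(value):
    """Escape characters that are special in LDAP search filters."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def expand_filter(template, user_name, user_dn):
    """Replace '{0}' with the user name and '{1}' with the user dn.

    The substitution is done in a single pass, so a user name that happens
    to contain the text '{1}' is not expanded a second time.
    """
    values = [escape_filter_value(user_name), escape_filter_value(user_dn)]
    return re.sub(r"\{([01])\}", lambda m: values[int(m.group(1))], template)


# Hypothetical filters; the real filter options are not named in the message.
user_filter = expand_filter(
    "(&(objectClass=person)(uid={0}))", "alice", "uid=alice,ou=users,dc=example,dc=com")
group_filter = expand_filter(
    "(&(cn=admins)(member={1}))", "alice", "uid=alice,ou=users,dc=example,dc=com")
```

Escaping the substituted values is a defensive choice in this sketch, since an unescaped '*' or parenthesis in a user name would change the meaning of the filter; whether and how the actual change escapes input is not stated in the message.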
[Impala-ASF-CR] IMPALA-10456: Implement TRUNCATE for Iceberg tables
wangsheng has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16987 ) Change subject: IMPALA-10456: Implement TRUNCATE for Iceberg tables .. IMPALA-10456: Implement TRUNCATE for Iceberg tables This patch adds support for the TRUNCATE statement for Iceberg tables. The TRUNCATE operation creates a new snapshot for the target table that doesn't have any data files. Table and column stats are also cleared. This patch also fixes a bug that prevented table/column stats from being propagated. Testing * added e2e tests for both partitioned and unpartitioned tables Change-Id: I6116c7c36aba871c0be79f499e0ac618072ca7b8 Reviewed-on: http://gerrit.cloudera.org:8080/16987 Tested-by: Impala Public Jenkins Reviewed-by: wangsheng --- M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test A testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test M tests/query_test/test_iceberg.py 8 files changed, 117 insertions(+), 17 deletions(-) Approvals: Impala Public Jenkins: Verified wangsheng: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/16987 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6116c7c36aba871c0be79f499e0ac618072ca7b8 Gerrit-Change-Number: 16987 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10456: Implement TRUNCATE for Iceberg tables
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16987 ) Change subject: IMPALA-10456: Implement TRUNCATE for Iceberg tables .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16987 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6116c7c36aba871c0be79f499e0ac618072ca7b8 Gerrit-Change-Number: 16987 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 01 Feb 2021 11:13:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16993 ) Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg manifests .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6865/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16993 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 Gerrit-Change-Number: 16993 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 01 Feb 2021 11:13:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16993 ) Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg manifests .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16993 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 Gerrit-Change-Number: 16993 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 01 Feb 2021 11:13:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16993 ) Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg manifests .. Patch Set 2: Code-Review+2 (1 comment) Carry +2 http://gerrit.cloudera.org:8080/#/c/16993/1/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/16993/1/be/src/exec/hdfs-table-sink.cc@264 PS1, Line 264: > optional: maybe it would be clearer to do two separate Substitute based on Done -- To view, visit http://gerrit.cloudera.org:8080/16993 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 Gerrit-Change-Number: 16993 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 01 Feb 2021 11:13:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests
Hello Gabor Kaszab, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16993 to look at the new patch set (#2). Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg manifests .. IMPALA-10460: Impala should write normalized paths in Iceberg manifests Currently Impala writes double slashes in the paths of datafiles for non-partitioned Iceberg tables. Unnormalized paths can cause problems later. This patch removes the redundant slashes. Testing: * Tested manually by inspecting the manifest files of the Iceberg tables. Used both non-partitioned and partitioned tables. Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 --- M be/src/exec/hdfs-table-sink.cc 1 file changed, 8 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/16993/2 -- To view, visit http://gerrit.cloudera.org:8080/16993 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748 Gerrit-Change-Number: 16993 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
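The effect of removing the redundant slashes described above can be sketched as follows. This is only an illustration of what a normalized path looks like; the actual patch changes how the path is built in be/src/exec/hdfs-table-sink.cc, and the normalize_path helper and the example paths are hypothetical.

```python
import re


def normalize_path(path):
    """Collapse runs of '/' into one, keeping the '://' after a URI scheme."""
    scheme, sep, rest = path.partition("://")
    if not sep:
        # No scheme present: treat the whole string as a plain path.
        return re.sub(r"/{2,}", "/", path)
    return scheme + sep + re.sub(r"/{2,}", "/", rest)


# Hypothetical data file locations for an Iceberg table.
unnormalized = "hdfs://localhost:20500/test-warehouse/ice_tbl/data//datafile.parq"
normalized = normalize_path(unnormalized)
```

Two strings that differ only by a doubled slash may name the same file yet fail a plain string comparison, which is one way unnormalized paths can cause problems later.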