[Impala-ASF-CR] IMPALA-10389: impala-profile-tool container

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17015 )

Change subject: IMPALA-10389: impala-profile-tool container
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8061/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36915cd686ab930dcc934bc0c81bff8c16d46714
Gerrit-Change-Number: 17015
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 02 Feb 2021 01:32:25 +
Gerrit-HasComments: No


[native-toolchain-CR] [config] bump toolchain build id for Kudu 1.14

2021-02-01 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17016 )


Change subject: [config] bump toolchain build id for Kudu 1.14
..

[config] bump toolchain build id for Kudu 1.14

The motivation for this version patch is two-fold:

  * Update the version of Kudu client to reflect the recently
released Kudu 1.14 (see https://kudu.apache.org/releases/1.14.0/)

  * Be able to pick up https://gerrit.cloudera.org/#/c/16705 change
(control of Kudu client connection negotiation timeout for impalad)

Change-Id: I5e75ca996670a7abf161f0c5e7751031391fd959
---
M buildall.sh
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/16/17016/1
--
To view, visit http://gerrit.cloudera.org:8080/17016

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5e75ca996670a7abf161f0c5e7751031391fd959
Gerrit-Change-Number: 17016
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] [config] bump toolchain build id for Kudu 1.14

2021-02-01 Thread Alexey Serbin (Code Review)
Alexey Serbin has abandoned this change. ( 
http://gerrit.cloudera.org:8080/17014 )

Change subject: [config] bump toolchain build id for Kudu 1.14
..


Abandoned

It seems this change should be done automatically.  I posted the corresponding 
change for the native-toolchain project: https://gerrit.cloudera.org/#/c/17016/
--
To view, visit http://gerrit.cloudera.org:8080/17014

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439
Gerrit-Change-Number: 17014
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10389: impala-profile-tool container

2021-02-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17015 )


Change subject: IMPALA-10389: impala-profile-tool container
..

IMPALA-10389: impala-profile-tool container

Add a build step for an impala-profile-tool docker image
that makes it easy to run the binary on any system.

This container is automatically built as part of the
docker build.

This sets up a new build context that doesn't pull in all of
the same dependencies or depend on the Java build.

Testing:

  cat logs/cluster/profiles/* | \
    docker run -i impala_profile_tool

I uploaded a build of the container to dockerhub too:

  timgarmstrong/impala_profile_tool

Change-Id: I36915cd686ab930dcc934bc0c81bff8c16d46714
---
M docker/CMakeLists.txt
A docker/impala_profile_tool/Dockerfile
M docker/setup_build_context.py
A docker/utility_entrypoint.sh
4 files changed, 200 insertions(+), 43 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/17015/1
--
To view, visit http://gerrit.cloudera.org:8080/17015

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I36915cd686ab930dcc934bc0c81bff8c16d46714
Gerrit-Change-Number: 17015
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] [config] bump toolchain build id for Kudu 1.14

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17014 )

Change subject: [config] bump toolchain build id for Kudu 1.14
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/8060/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/17014

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439
Gerrit-Change-Number: 17014
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 02 Feb 2021 01:02:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] [config] bump toolchain build id for Kudu 1.14

2021-02-01 Thread Alexey Serbin (Code Review)
Hello Thomas Tauber-Marshall, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17014

to look at the new patch set (#2).

Change subject: [config] bump toolchain build id for Kudu 1.14
..

[config] bump toolchain build id for Kudu 1.14

The motivation for this version patch is two-fold:

  * Update the version of Kudu client to reflect the recently
released Kudu 1.14 (see https://kudu.apache.org/releases/1.14.0/)

  * Be able to pick up https://gerrit.cloudera.org/#/c/16705 change
(control of Kudu client connection negotiation timeout for impalad)

Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/17014/2
--
To view, visit http://gerrit.cloudera.org:8080/17014

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439
Gerrit-Change-Number: 17014
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] [config] bump toolchain build id for Kudu 1.14

2021-02-01 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17014 )


Change subject: [config] bump toolchain build id for Kudu 1.14
..

[config] bump toolchain build id for Kudu 1.14

The motivation for this version patch is two-fold:

  * Update the version of Kudu client to reflect the recently
released Kudu 1.14 (see https://kudu.apache.org/releases/1.14.0/)

  * Be able to pick up https://gerrit.cloudera.org/#/c/16705 change
(control of Kudu client connection negotiation timeout for impalad)

Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/17014/1
--
To view, visit http://gerrit.cloudera.org:8080/17014

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icb8a8ba2660c6c7ffa03a9b0874d427c1fec3439
Gerrit-Change-Number: 17014
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9979: part 2: partitioned top-n

2021-02-01 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16242 )

Change subject: IMPALA-9979: part 2: partitioned top-n
..


Patch Set 28:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16242/28//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16242/28//COMMIT_MSG@59
PS28, Line 59: and the tie-handling
 : semantics required by rank() predicates
nit: I think this was really implemented in your previous patch?


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.h
File be/src/exec/topn-node.h:

http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.h@64
PS28, Line 64: int64_t limit = is_partitioned() ? per_partition_limit() :
What's the relationship between 'include_ties' and 'is_partitioned', i.e. why 
does 'include_ties' here matter for the unpartitioned case but not the 
partitioned case?


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@244
PS28, Line 244: U
typo


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@399
PS28, Line 399: RETURN_IF_ERROR(QueryMaintenance(state));
This results in two calls to QueryMaintenance() in quick succession, here and 
in GetNext(). It might be better to avoid that.


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@566
PS28, Line 566: be
typo


http://gerrit.cloudera.org:8080/#/c/16242/28/be/src/exec/topn-node.cc@666
PS28, Line 666: vector> rematerialized_heaps;
  :   for (auto& entry : partition_heaps_) {
  : RETURN_IF_ERROR(entry.second->RematerializeTuples(this, 
state, temp_pool.get()));
  : DCHECK(entry.second->DCheckConsistency());
  : // The key references memory in 'tuple_pool_'. Replace it 
with a rematerialized tuple.
  : rematerialized_heaps.push_back(move(entry.second));
  :   }
  :   partition_heaps_.clear();
  :   for (auto& heap_ptr : rematerialized_heaps) {
  : const Tuple* key_tuple = heap_ptr->top();
  : partition_heaps_.emplace(key_tuple, move(heap_ptr));
  :   }
I think this can be put in an 'else' with the above 'if (heap_ != nullptr)' to 
make the partitioned vs. unpartitioned handling clearer


http://gerrit.cloudera.org:8080/#/c/16242/28/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/16242/28/common/thrift/ImpalaService.thrift@625
PS28, Line 625:   // If > 0, the rank()/row_number() pushdown into pre-analytic 
sorts is enabled
Maybe note the default value, and briefly the issues with setting it higher.



--
To view, visit http://gerrit.cloudera.org:8080/16242

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Gerrit-Change-Number: 16242
Gerrit-PatchSet: 28
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 02 Feb 2021 00:25:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10161: User LDAP Search bind support

2021-02-01 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16717 )

Change subject: IMPALA-10161: User LDAP Search bind support
..


Patch Set 7:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc
File be/src/util/ldap-search-bind.cc:

http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@46
PS7, Line 46: std::string
Here and in other places: is it intentional that you don't use "using namespace 
std" or "using std::string"? Is there some kind of ambiguity in that case?


http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@53
PS7, Line 53:   Status ldapBaseValidateStatus = ImpalaLdap::ValidateFlags();
:   if (!ldapBaseValidateStatus.ok()) return ldapBaseValidateStatus;
We generally use the RETURN_IF_ERROR macro for this pattern.


http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@71
PS7, Line 71:   Status ldapBaseInitStatus = ImpalaLdap::Init(user_filter, 
group_filter);
:   if (!ldapBaseInitStatus.ok()) return ldapBaseInitStatus;
RETURN_IF_ERROR


http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@100
PS7, Line 100: std::string
not needed, user_filter_ is already a string


http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@137
PS7, Line 137: group_filter_
I think it would be a bit clearer to call find on group_filter instead of 
group_filter_. The result should be the same.


http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-search-bind.cc@142
PS7, Line 142: if (user_dn.empty()) return false;
ldap_unbind_ext is not called if we return here


http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-simple-bind.cc
File be/src/util/ldap-simple-bind.cc:

http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-simple-bind.cc@56
PS7, Line 56:   Status ldapBaseValidateStatus = ImpalaLdap::ValidateFlags();
:   if (!ldapBaseValidateStatus.ok()) return ldapBaseValidateStatus;
RETURN_IF_ERROR


http://gerrit.cloudera.org:8080/#/c/16717/7/be/src/util/ldap-simple-bind.cc@80
PS7, Line 80:   Status ldapBaseInitStatus = ImpalaLdap::Init(user_filter, 
group_filter);
:   if (!ldapBaseInitStatus.ok()) return ldapBaseInitStatus;
RETURN_IF_ERROR


http://gerrit.cloudera.org:8080/#/c/16717/7/fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java
File 
fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java:

http://gerrit.cloudera.org:8080/#/c/16717/7/fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java@46
PS7, Line 46: LdapSearchBindImpalaShellTest
I think a lot of duplication with LdapSimpleBindImpalaShellTest could be 
avoided, e.g. by creating a common base class. If you agree but don't want to 
deal with this in the current review, a follow-up Jira could be created.


http://gerrit.cloudera.org:8080/#/c/16717/7/fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java@291
PS7, Line 291: testLdapSearchBind
Can you make the name more descriptive? The whole file seems to be about 
testLdapSearchBind



--
To view, visit http://gerrit.cloudera.org:8080/16717

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I978744ad05d9ef408328d1e4dd2d18c329f4d3b7
Gerrit-Change-Number: 16717
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 01 Feb 2021 23:21:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17012 )

Change subject: IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8059/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17012

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idf4acfb54cf62a3f3b2e8db9d04044580151299c
Gerrit-Change-Number: 17012
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 01 Feb 2021 17:06:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9867: Add Support for Spilling to S3: Milestone 1

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16318 )

Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
..


Patch Set 33:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8058/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16318

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89
Gerrit-Change-Number: 16318
Gerrit-PatchSet: 33
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 01 Feb 2021 17:05:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16720 )

Change subject: IMPALA-10325: Parquet scan should use min/max statistics to 
skip pages based on equi-join predicate
..


Patch Set 59:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/8057/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/16720

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Gerrit-Change-Number: 16720
Gerrit-PatchSet: 59
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 01 Feb 2021 16:57:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16993 )

Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg 
manifests
..

IMPALA-10460: Impala should write normalized paths in Iceberg manifests

Currently Impala writes double slashes in the paths of datafiles
for non-partitioned Iceberg tables. Unnormalized paths can cause
problems later.

This patch removes the redundant slashes.
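
As an illustration only (the actual fix is in the C++ file
be/src/exec/hdfs-table-sink.cc and is not shown in this message), removing
redundant slashes while leaving a "scheme://" prefix untouched could look
roughly like the sketch below. The function name and the example paths are
hypothetical.

```python
import re


def normalize_path(path: str) -> str:
    """Collapse runs of '/' into a single '/', keeping a leading 'scheme://'.

    Hypothetical helper for illustration; not the code from the patch.
    """
    match = re.match(r"^([A-Za-z][A-Za-z0-9+.-]*://)(.*)$", path)
    if match:
        prefix, rest = match.group(1), match.group(2)
    else:
        prefix, rest = "", path
    # Only the part after the scheme is rewritten, so "hdfs://" survives.
    return prefix + re.sub(r"/{2,}", "/", rest)


if __name__ == "__main__":
    print(normalize_path("hdfs://localhost:20500/warehouse/tbl/data//f.parq"))
```

Keeping the scheme separate matters because a naive global replacement of
"//" would also damage the "hdfs://" or "s3a://" prefix.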

Testing:
 * Tested manually by inspecting the manifest files of the
   Iceberg tables. Used both non-partitioned and partitioned tables.

Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
Reviewed-on: http://gerrit.cloudera.org:8080/16993
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-table-sink.cc
1 file changed, 8 insertions(+), 3 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16993

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
Gerrit-Change-Number: 16993
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16993 )

Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg 
manifests
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16993

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
Gerrit-Change-Number: 16993
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 01 Feb 2021 16:55:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables

2021-02-01 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17012 )


Change subject: IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables
..

IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables

This patch adds support for INSERT OVERWRITE statements for
Iceberg tables. We use Iceberg's ReplacePartitions interface
for this. This interface provides consistent behavior with
INSERT OVERWRITEs against regular tables. It's also consistent
with other engines' dynamic overwrites, e.g. Spark.

INSERT OVERWRITE for partitioned tables replaces the partitions
affected by the INSERT, while keeping the other partitions
untouched.

INSERT OVERWRITE is prohibited for tables that use the BUCKET
partition transform because it would randomly overwrite table
data.
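
The partition-level semantics described above can be modelled with a small
sketch. This is a toy model, not Iceberg's ReplacePartitions API: the table
is assumed to be a dict from partition value to rows, and the names
insert_overwrite, Row and Table are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, int]         # (partition value, payload); hypothetical layout
Table = Dict[str, List[Row]]  # partition value -> rows stored in that partition


def insert_overwrite(table: Table, new_rows: List[Row]) -> Table:
    """Replace only the partitions that receive new rows; keep the rest."""
    result = {part: list(rows) for part, rows in table.items()}
    touched: Table = {}
    for row in new_rows:
        touched.setdefault(row[0], []).append(row)
    # Partitions present in 'touched' are replaced wholesale.
    result.update(touched)
    return result


if __name__ == "__main__":
    before = {"2021-01": [("2021-01", 1)], "2021-02": [("2021-02", 2)]}
    print(insert_overwrite(before, [("2021-02", 9)]))
```

In this model the set of replaced partitions is decided entirely by the
partition values of the inserted rows. With a BUCKET transform those values
would come from a hash of the data, which is presumably why the commit
message calls the resulting overwrite random.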

Testing
 * added e2e test

Change-Id: Idf4acfb54cf62a3f3b2e8db9d04044580151299c
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M tests/query_test/test_iceberg.py
7 files changed, 245 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17012/1
--
To view, visit http://gerrit.cloudera.org:8080/17012

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idf4acfb54cf62a3f3b2e8db9d04044580151299c
Gerrit-Change-Number: 17012
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9867: Add Support for Spilling to S3: Milestone 1

2021-02-01 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#33). ( 
http://gerrit.cloudera.org:8080/16318 )

Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
..

IMPALA-9867: Add Support for Spilling to S3: Milestone 1

Major Features
1) Local files as buffers for spilling to S3.
2) Async Upload for remote files.
3) Synchronous deletion of remote files after the query ends.
4) Local buffer files management.
5) Compatibility of spilling to local and remote.
6) All the errors from hdfs/s3 should terminate the query.

Changes on TmpFile:
* TmpFile is separated into two types of implementation, TmpFileLocal
  and TmpFileRemote.
  TmpFileLocal is used for Spilling to local file system.
  TmpFileRemote is a new type for Spilling to the remote. It contains
  two DiskFiles, one for local buffer, the other for the remote file.
* The DiskFile is an object that contains the information of a physical
  file for passing to the DiskIOMgr to execute the IO operations on
  that specific file. The DiskFile also contains the status of the
  file, one of DiskFileStatus::INWRITING/PERSISTED/DELETED.
  When the DiskFile is initialized, it is in INWRITING status. Once the
  file is persisted into the file system, its status becomes PERSISTED.
  If the file is deleted, for example when the local buffer is
  evicted, the DiskFile status of the buffer file becomes DELETED.
  After that, if the file is fetched from the remote, the DiskFile
  status of the buffer file becomes INWRITING again, and then
  PERSISTED if the fetch finishes successfully.
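
The status life cycle described above can be sketched as a small state
machine. The sketch is in Python rather than the C++ of the patch, the class
DiskFileModel is hypothetical, and only the transitions named in the text
are modelled; the real code may permit others.

```python
from enum import Enum


class DiskFileStatus(Enum):
    INWRITING = "INWRITING"
    PERSISTED = "PERSISTED"
    DELETED = "DELETED"


# Only the transitions named in the description; the real code may allow more.
_TRANSITIONS = {
    DiskFileStatus.INWRITING: {DiskFileStatus.PERSISTED},  # write or fetch done
    DiskFileStatus.PERSISTED: {DiskFileStatus.DELETED},    # e.g. buffer evicted
    DiskFileStatus.DELETED: {DiskFileStatus.INWRITING},    # fetch from remote
}


class DiskFileModel:
    """Hypothetical stand-in for a DiskFile that tracks its status only."""

    def __init__(self) -> None:
        self.status = DiskFileStatus.INWRITING  # initial status per the text

    def set_status(self, new_status: DiskFileStatus) -> None:
        if new_status not in _TRANSITIONS[self.status]:
            raise ValueError(
                f"{self.status.name} -> {new_status.name} is not modelled")
        self.status = new_status
```

A local buffer file that is written, evicted and later fetched back would
walk INWRITING, PERSISTED, DELETED, INWRITING, PERSISTED in this model.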

Implementation Details:
1) A new enum type is added to specify the disk type of files,
   indicating where the file is physically located.
   The types include DiskFileType::LOCAL/LOCAL_BUFFER/DFS/S3.
   DiskFileType::LOCAL indicates the file is in the local file system.
   DiskFileType::LOCAL_BUFFER indicates the file is in the local file
   system, and it is the buffer of a remote scratch file.
   DiskFileType::DFS/S3 indicates the file is in the HDFS/S3.
   The buffer pool can pin(read) pages from the local buffer, but for
   remote files it mainly pins(reads) pages from the remote file
   system.
2) Two disk queues have been added to do the file operation jobs.
   Queue name: RemoteS3DiskFileOper/RemoteDfsDiskFileOper
   File operations on the remote disk like upload and fetch should
   be done in these queues. The purpose of the queues is to isolate
   the file operations from normal read/write IO operations in different
   queues. This can make the file operations more efficient, since
   they are not interrupted during a relatively long execution time,
   and it also gives more precise control over the number of threads
   working on file operation jobs.
   RemoteOperRange is the new type to carry the file operation jobs.
   Previously, the request types were READ and WRITE.
   Now FILE_FETCH/FILE_UPLOAD are added.
3) The tmp files are physically deleted when the tmp file group is
   destructed. For remote files, the entire directory is
   deleted.
4) The local buffer files management is to control the total size
   of local buffer files and evict files if needed.
   A local buffer file can be evicted if the temporary file has uploaded
   a copy to the remote disk or the query ends.
   There are two modes that decide which files are evicted first.
   The default is LIFO, the other is FIFO. The mode is
   controlled by the startup option remote_tmp_files_avail_pool_lifo.
   Also, a thread running TmpFileSpaceReserveThreadLoop in TmpFileMgr
   reserves buffer file space asynchronously to
   avoid deadlocks.
   The startup option allow_spill_to_hdfs is added. By default an HDFS
   path is not allowed, but the option can be set to true in test cases
   to allow an HDFS path as scratch space, for testing only.
5) Spilling to local has higher priority than spilling to remote.
   If no local scratch space is available, temporary data will be
   spilled to remote.
   The first available local directory is used for the local buffer
   for spilling to remote if any remote directory is configured.
   If remote directory is configured without any available local
   scratch space, an error will be returned during initialization.
   The purpose of the design is to simplify the implementation in
   milestone 1 with fewer changes to the configuration.
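
The two eviction orders mentioned in item 4) can be illustrated with a
minimal sketch. It assumes the evictable local buffer files sit in a single
pool in the order they became evictable; the class AvailablePool and its
methods are hypothetical and say nothing about how TmpFileMgr is really
structured.

```python
from collections import deque
from typing import Deque, Optional


class AvailablePool:
    """Hypothetical pool of local buffer files that are safe to evict."""

    def __init__(self, lifo: bool = True) -> None:
        # 'lifo' mirrors the remote_tmp_files_avail_pool_lifo startup option,
        # whose default is described as LIFO.
        self._lifo = lifo
        self._files: Deque[str] = deque()

    def enqueue(self, name: str) -> None:
        """Record a buffer file once it is uploaded (or its query ended)."""
        self._files.append(name)

    def next_to_evict(self) -> Optional[str]:
        """Pick the newest entry in LIFO mode, the oldest in FIFO mode."""
        if not self._files:
            return None
        return self._files.pop() if self._lifo else self._files.popleft()
```

With files enqueued in the order a, b, c, the LIFO pool evicts c first and
the FIFO pool evicts a first.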

Example (setting remote scratch space):
Assume that the directories we have for scratch space:
* Local dir: /tmp/local_buffer, /tmp/local, /tmp/local_sec
* Remote dir: s3a://tmp/remote
The scratch space path is configured in the startup options, and could
have three types of configurations:
1. Pure local scratch space
  --scratch_dirs="/tmp/local"
2. Pure remote scratch space
  --scratch_dirs="s3a://tmp/remote,/tmp/local_buffer:16GB"
3. Mixed local and remote scratch space
  --scratch_dirs="s3a://tmp/remote:200GB,/tmp/local_buffer:1GB,

[Impala-ASF-CR] IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate

2021-02-01 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#59). ( 
http://gerrit.cloudera.org:8080/16720 )

Change subject: IMPALA-10325: Parquet scan should use min/max statistics to 
skip pages based on equi-join predicate
..

IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on 
equi-join predicate

This patch adds a new class of predicates called overlap predicates
to help determine whether a Parquet row group or page overlaps
with a range computed from an equi hash join. If not, then
the entire row group or page is skipped.  A row in a row group or page
that is not skipped can then be subjected to the row-level overlap test
against the same overlap predicate.

For the following query, the min and max in the overlap predicate are
computed with the values from the join column from table 'b'. To
evaluate the overlap predicate, these two values are compared against
the min/max of each row group or page at the scan node for 'a'.

  select straight_join count(*)
  from lineitem a join [SHUFFLE] lineitem b
  where a.l_shipdate = b.l_receiptdate
  and b.l_commitdate = "1992-01-31";
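
A rough sketch of the row-group-level test for this query, in Python rather
than the C++ of the patch. It assumes the filter is just the min and max of
the build-side join values and that each row group exposes (min, max)
statistics for the scanned column; the function names and the sample dates
are hypothetical.

```python
from typing import List, Sequence, Tuple


def ranges_overlap(filter_min, filter_max, stats_min, stats_max) -> bool:
    """True if the closed ranges [filter_min, filter_max] and
    [stats_min, stats_max] share at least one value."""
    return stats_min <= filter_max and filter_min <= stats_max


def row_groups_to_scan(build_values: Sequence[str],
                       row_group_stats: List[Tuple[str, str]]) -> List[int]:
    """Return indexes of row groups whose min/max overlaps the join range."""
    if not build_values:
        return []  # empty build side: nothing can match in an inner join
    lo, hi = min(build_values), max(build_values)
    return [i for i, (mn, mx) in enumerate(row_group_stats)
            if ranges_overlap(lo, hi, mn, mx)]


if __name__ == "__main__":
    # ISO-formatted dates compare correctly as strings.
    receipt_dates = ["1992-02-01", "1992-03-15"]   # join column from 'b'
    shipdate_stats = [("1992-01-01", "1992-01-31"),
                      ("1992-02-10", "1992-04-01"),
                      ("1992-05-01", "1992-06-01")]  # row groups of 'a'
    print(row_groups_to_scan(receipt_dates, shipdate_stats))
```

Only the second row group overlaps the range taken from 'b' in this example,
so the other two could be skipped without reading their pages.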

An overlap predicate associated with the column type J (in hash table)
and scan column type S will be formed when one of the following is true:
   Both J and S are booleans
   Both J and S are integers (tinyint, smallint, int, or bigint)
   Both J and S are approximate numeric (float or double)
   Both J and S are decimals with the same precision and scale
   Both J and S are strings (STRING, CHAR or VARCHAR)
   Both J and S are date
   Both J and S are timestamp
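
The compatibility rules listed above can be restated as a small check. This
is a sketch, not the planner code: the ColType class, the lower-case type
names and the function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColType:
    """Hypothetical column type; precision/scale only matter for decimals."""
    name: str
    precision: Optional[int] = None
    scale: Optional[int] = None


# Each set is one family from the list above; J and S must share a family.
_FAMILIES = [
    {"boolean"},
    {"tinyint", "smallint", "int", "bigint"},
    {"float", "double"},
    {"string", "char", "varchar"},
    {"date"},
    {"timestamp"},
]


def can_form_overlap_predicate(j: ColType, s: ColType) -> bool:
    """Apply the rules from the list: same family, or identical decimals."""
    if j.name == "decimal" and s.name == "decimal":
        return (j.precision, j.scale) == (s.precision, s.scale)
    return any(j.name in fam and s.name in fam for fam in _FAMILIES)
```

For example, an int join column paired with a bigint scan column qualifies,
while decimal(10,2) paired with decimal(12,2) does not.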

The overlap predicate is implemented as a min/max filter. Unlike
existing min/max filters, MAX_NUM_RUNTIME_FILTERS query option does
not apply to min/max filters created for overlap predicates. An overlap
predicate will be evaluated as long as the overlap ratio is less than a
threshold specified in a new query option 'minmax_filter_threshold'.
Setting the threshold to its minimal value 0.0 disables the feature,
and setting it to the maximal value 1.0 applies the filtering in all
cases. A second query option, disable_row_minmax_filtering, can be
used to disable row level filtering with overlap predicates.
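
How the threshold gates the filter can be sketched as follows. The commit
message does not define the overlap ratio, so the definition used here (the
fraction of the scanned column's [min, max] range covered by the filter's
range, for numeric values) is an assumption, as are the function names. The
end points 0.0 and 1.0 are special-cased to match the behaviour described
above.

```python
def overlap_ratio(filter_min: float, filter_max: float,
                  col_min: float, col_max: float) -> float:
    """Assumed definition: share of [col_min, col_max] covered by the filter."""
    width = col_max - col_min
    if width <= 0:
        return 1.0  # degenerate column range: treat as fully overlapping
    lo = max(filter_min, col_min)
    hi = min(filter_max, col_max)
    return max(0.0, hi - lo) / width


def should_apply_filter(ratio: float, threshold: float) -> bool:
    """Apply the filter when the overlap ratio is below the threshold."""
    if threshold <= 0.0:
        return False  # 0.0 disables the feature
    if threshold >= 1.0:
        return True   # 1.0 applies the filtering in all cases
    return ratio < threshold
```

A small ratio means the filter range covers little of the column range, so
most row groups or pages are likely to be skipped and evaluating the filter
is worthwhile.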

In addition, two new run-time profile counters are added to report the
number of row groups or pages filtered out via the overlap predicates
respectively:
  1. NumRuntimeFilteredRowGroups
  2. NumRuntimeFilteredPages

Testing:
1. Unit tested on various column types with TPCH and TPCDS tables.
   Benefits were significant when the join column on the outer table
   is sorted and there are many row groups or pages that do not overlap
   with the min/max filters implementing the overlap predicates;
2. Added following new tests:
a. in min_max_filters.test to demonstrate the number of filtered
   out pages and row groups with the two new profile counters;
b. in runtime-filter-propagation.test to demonstrate that the
   overlap predicates work with different column types;
c. data type specific overlap method tests in
   min-max-filter-test.cc;
3. Core testing;
4. Performance measurement.

To do in follow-up JIRAs:
1. Improve filtering efficiency;
2. Apply the overlap predicate on partition columns;
3. IR code-gen for various MinMaxFilter::EvalOverlap methods.

Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
---
M be/src/exec/exec-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/date-value.cc
M be/src/runtime/date-value.h
M be/src/runtime/raw-value.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/runtime/string-value-test.cc
M be/src/runtime/string-value.cc
M be/src/runtime/string-value.h
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter-test.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.te

[Impala-ASF-CR] IMPALA-9588: Add extra logging to cancel tests

2021-02-01 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16985 )

Change subject: IMPALA-9588: Add extra logging to cancel tests
..


Patch Set 1:

(2 comments)

Hi Gabor, found 2 nits, aside from those LGTM!

http://gerrit.cloudera.org:8080/#/c/16985/1/tests/util/cancel_util.py
File tests/util/cancel_util.py:

http://gerrit.cloudera.org:8080/#/c/16985/1/tests/util/cancel_util.py@42
PS1, Line 42: occured
nit: occurred


http://gerrit.cloudera.org:8080/#/c/16985/1/tests/util/cancel_util.py@48
PS1, Line 48: "\n"
nit: this could just go after the previous line, like
error_msg += str(thread.fetch_results_error) + "\n"



--
To view, visit http://gerrit.cloudera.org:8080/16985
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied7100a9ea2e2f0611cf8e328e589b4c8e5d5100
Gerrit-Change-Number: 16985
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 01 Feb 2021 13:09:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10161: User LDAP Search bind support

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16717 )

Change subject: IMPALA-10161: User LDAP Search bind support
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8056/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I978744ad05d9ef408328d1e4dd2d18c329f4d3b7
Gerrit-Change-Number: 16717
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 01 Feb 2021 11:48:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16993 )

Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg 
manifests
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8055/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
Gerrit-Change-Number: 16993
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 01 Feb 2021 11:34:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10161: User LDAP Search bind support

2021-02-01 Thread Tamas Mate (Code Review)
Tamas Mate has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/16717 )

Change subject: IMPALA-10161: User LDAP Search bind support
..

IMPALA-10161: User LDAP Search bind support

This change adds user search bind support alongside simple bind. Search
bind can be configured with LDAP filters. The group check was already
done with an LDAP search; this change makes it configurable with
Hadoop-library-like options, that is, an LDAP filter with optional
patterns. The '{0}' pattern is replaced with the user name and the
'{1}' pattern is replaced with the user DN.
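
The pattern substitution can be illustrated with a small sketch. It is not
the code added in ldap-search-bind.cc; the function names and the example
filters are hypothetical. Escaping the substituted values according to
RFC 4515 is an addition made here for safety, and the source does not say
whether the patch does the same.

```python
import re

# RFC 4515 escapes for characters that are special inside a filter value.
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\0": r"\00",
}


def escape_filter_value(value):
    """Escape a string so it is safe to embed in an LDAP search filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def expand_ldap_filter(template, user_name, user_dn):
    """Replace '{0}' with the user name and '{1}' with the user DN.

    A single regex pass is used so that a '{1}' occurring inside the
    user name is not expanded a second time.
    """
    values = {
        "{0}": escape_filter_value(user_name),
        "{1}": escape_filter_value(user_dn),
    }
    return re.sub(r"\{[01]\}", lambda m: values[m.group(0)], template)
```

With a hypothetical user filter "(&(objectClass=person)(uid={0}))" and the
user name "alice", the expanded filter is
"(&(objectClass=person)(uid=alice))".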

The following new flags have been added:
 --ldap_search_bind: a flag to change between simple and search bind
 --ldap_user_search_basedn: the base dn for the LDAP subtree to search
 --ldap_group_search_basedn: the base dn for the LDAP subtree to search

Tested:
 - Custom cluster tests have been added

Change-Id: I978744ad05d9ef408328d1e4dd2d18c329f4d3b7
---
M be/src/rpc/authentication.cc
M be/src/util/CMakeLists.txt
A be/src/util/ldap-search-bind.cc
A be/src/util/ldap-search-bind.h
A be/src/util/ldap-simple-bind.cc
A be/src/util/ldap-simple-bind.h
M be/src/util/ldap-util.cc
M be/src/util/ldap-util.h
M be/src/util/webserver.cc
M be/src/util/webserver.h
C fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java
R fe/src/test/java/org/apache/impala/customcluster/LdapSimpleBindImpalaShellTest.java
M fe/src/test/java/org/apache/impala/testutil/LdapUtil.java
M fe/src/test/resources/users.ldif
14 files changed, 702 insertions(+), 298 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/16717/7
--
To view, visit http://gerrit.cloudera.org:8080/16717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I978744ad05d9ef408328d1e4dd2d18c329f4d3b7
Gerrit-Change-Number: 16717
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-10456: Implement TRUNCATE for Iceberg tables

2021-02-01 Thread wangsheng (Code Review)
wangsheng has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16987 )

Change subject: IMPALA-10456: Implement TRUNCATE for Iceberg tables
..

IMPALA-10456: Implement TRUNCATE for Iceberg tables

This patch adds support for the TRUNCATE statement for
Iceberg tables.

The TRUNCATE operation creates a new snapshot for the target
table that doesn't have any data files. Table and column stats
are also cleared. This patch also fixes a bug that prevented
table/column stats from being propagated.

Testing
 * added e2e tests for both partitioned and unpartitioned tables

Change-Id: I6116c7c36aba871c0be79f499e0ac618072ca7b8
Reviewed-on: http://gerrit.cloudera.org:8080/16987
Tested-by: Impala Public Jenkins 
Reviewed-by: wangsheng 
---
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-truncate.test
M tests/query_test/test_iceberg.py
8 files changed, 117 insertions(+), 17 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  wangsheng: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/16987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6116c7c36aba871c0be79f499e0ac618072ca7b8
Gerrit-Change-Number: 16987
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10456: Implement TRUNCATE for Iceberg tables

2021-02-01 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16987 )

Change subject: IMPALA-10456: Implement TRUNCATE for Iceberg tables
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6116c7c36aba871c0be79f499e0ac618072ca7b8
Gerrit-Change-Number: 16987
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 01 Feb 2021 11:13:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16993 )

Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg 
manifests
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6865/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
Gerrit-Change-Number: 16993
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 01 Feb 2021 11:13:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests

2021-02-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16993 )

Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg 
manifests
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
Gerrit-Change-Number: 16993
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 01 Feb 2021 11:13:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests

2021-02-01 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16993 )

Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg 
manifests
..


Patch Set 2: Code-Review+2

(1 comment)

Carry +2

http://gerrit.cloudera.org:8080/#/c/16993/1/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

http://gerrit.cloudera.org:8080/#/c/16993/1/be/src/exec/hdfs-table-sink.cc@264
PS1, Line 264:
> optional: maybe it would be clearer to do two separate Substitute based on
Done



--
To view, visit http://gerrit.cloudera.org:8080/16993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
Gerrit-Change-Number: 16993
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 01 Feb 2021 11:13:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10460: Impala should write normalized paths in Iceberg manifests

2021-02-01 Thread Zoltan Borok-Nagy (Code Review)
Hello Gabor Kaszab, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16993

to look at the new patch set (#2).

Change subject: IMPALA-10460: Impala should write normalized paths in Iceberg 
manifests
..

IMPALA-10460: Impala should write normalized paths in Iceberg manifests

Currently Impala writes double slashes in the paths of datafiles
for non-partitioned Iceberg tables. Unnormalized paths can cause
problems later.

This patch removes the redundant slashes.
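
As a rough illustration of what normalizing means here, the sketch below
collapses repeated slashes in a path while leaving the '//' after a URI
scheme untouched. It is not the C++ change made in hdfs-table-sink.cc; the
function name and the example paths are hypothetical.

```python
import re

# Optional URI scheme prefix such as "hdfs://" or "s3a://".
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*://)?(.*)$", re.DOTALL)


def normalize_path(path):
    """Collapse runs of '/' into one, keeping the scheme separator intact."""
    scheme, rest = _SCHEME_RE.match(path).groups()
    return (scheme or "") + re.sub(r"/{2,}", "/", rest)
```

For instance, "hdfs://nn:8020/warehouse/tbl/data//file.parq" becomes
"hdfs://nn:8020/warehouse/tbl/data/file.parq".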

Testing:
 * Tested manually by inspecting the manifest files of the
   Iceberg tables. Used both non-partitioned and partitioned tables.

Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
---
M be/src/exec/hdfs-table-sink.cc
1 file changed, 8 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/16993/2
--
To view, visit http://gerrit.cloudera.org:8080/16993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5ecac78102ed35710dd70a18edc71f6e891e748
Gerrit-Change-Number: 16993
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins