[Impala-ASF-CR] IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18772 )

Change subject: IMPALA-11401,IMPALA-10794: Add logs and thread names for 
catalogd RPCs
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Gerrit-Change-Number: 18772
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Sat, 06 Aug 2022 03:21:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/18772 )

Change subject: IMPALA-11401,IMPALA-10794: Add logs and thread names for 
catalogd RPCs
..

IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

We've seen catalogd throws OutOfMemoryError when serializing large
responses (i.e. size > 2GB). However, the related table names are
missing in the logs. Admins would like to get the table names and
blacklist those tables until they are optimized (e.g. reducing
partitions).

To improve the supportability, this patch adds logs in the Catalogd RPC
code paths to log some details of the request, also add thread
annotations to improve readability of jstacks.

Tests:
 - Add unit tests for short descriptions of requests.
 - Manually add codes to throw OutOfMemoryError and verify the logs
   shown as expected.
 - Run test_concurrent_ddls.py and metadata tests. Capture jstacks and
   verify the thread annotations are shown.
 - Run CORE tests

Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Reviewed-on: http://gerrit.cloudera.org:8080/18772
Reviewed-by: Csaba Ringhofer 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/ColumnName.java
M fe/src/main/java/org/apache/impala/analysis/FunctionName.java
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableName.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/main/java/org/apache/impala/util/CatalogOpUtil.java
A fe/src/test/java/org/apache/impala/util/CatalogOpUtilTest.java
9 files changed, 513 insertions(+), 36 deletions(-)

Approvals:
  Csaba Ringhofer: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/18772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Gerrit-Change-Number: 18772
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 


[Impala-ASF-CR] IMPALA-11458: Update zlib, zstd, snappy

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18816 )

Change subject: IMPALA-11458: Update zlib, zstd, snappy
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If96fd5e8c49e581478b249bb8894001457742023
Gerrit-Change-Number: 18816
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Sat, 06 Aug 2022 02:54:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6684: Fix untracked memory in KRPC

2022-08-05 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18798 )

Change subject: IMPALA-6684: Fix untracked memory in KRPC
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18798/4/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/18798/4/be/src/runtime/krpc-data-stream-sender.cc@753
PS4, Line 753:   // on some channel
> I think it will make more sense to do this in Close()
It works.


http://gerrit.cloudera.org:8080/#/c/18798/5/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/18798/5/be/src/runtime/row-batch.h@422
PS5, Line 422: us Serialize(OutboundRowBatch* output_batch, LockingFreePool* 
perm_free_pool,
 :   RuntimeProfile::SummaryStatsCounter* 
tuple_data_stats_counter,
We can return counters as integer, then update the profile counter in the 
caller. This will make unit-test code row-batch-serialize-test.cc easier.
Same comment for line #531.


http://gerrit.cloudera.org:8080/#/c/18798/5/be/src/runtime/row-batch.h@577
PS5, Line 577:   /// row batch). If the distinct_tuples argument is non-null, 
full deduplication is
nit: extra line



--
To view, visit http://gerrit.cloudera.org:8080/18798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 3
Gerrit-Owner: Omid Shahidi 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Omid Shahidi 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 05 Aug 2022 23:51:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18772 )

Change subject: IMPALA-11401,IMPALA-10794: Add logs and thread names for 
catalogd RPCs
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8407/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/18772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Gerrit-Change-Number: 18772
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 05 Aug 2022 22:29:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

2022-08-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18772 )

Change subject: IMPALA-11401,IMPALA-10794: Add logs and thread names for 
catalogd RPCs
..


Patch Set 6:

(1 comment)

Thanks for your review!

http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java@227
PS5, Line 227: Strin
> Sure, I am ok with the follow up Jira, it could be also a nice ramp-up task
Filed IMPALA-11478



--
To view, visit http://gerrit.cloudera.org:8080/18772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Gerrit-Change-Number: 18772
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 05 Aug 2022 22:28:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] PROTOTYPE: IMPALA-11207: Use hadoop-cloud-storage for Cloud dependencies

2022-08-05 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18817 )

Change subject: PROTOTYPE: IMPALA-11207: Use hadoop-cloud-storage for Cloud 
dependencies
..


Patch Set 1: Code-Review+1

This seems reasonable. Some of this code is only used at runtime and I'm not 
sure how good the test coverage is for it. However the 
fe/target/build-classpath.txt diff between this and just reverting the 
google-oauth-client change (since this version of hadoop-cloud-storage still 
uses 0.31) is

34d33
< 
/home/michael/.m2/repository/com/google/cloud/bigdataoss/gcs-connector/2.1.2.7.2.16.0-77/gcs-connector-2.1.2.7.2.16.0-77.jar
59d57
< 
/home/michael/.m2/repository/com/jayway/jsonpath/json-path/2.4.0/json-path-2.4.0.jar
144,145d141
< 
/home/michael/.m2/repository/org/apache/directory/server/apacheds-i18n/2.0.0.AM25/apacheds-i18n-2.0.0.AM25.jar
< 
/home/michael/.m2/repository/org/apache/directory/server/apacheds-jdbm/2.0.0-M5/apacheds-jdbm-2.0.0-M5.jar
150a147
> /home/michael/.m2/repository/org/apache/hadoop/hadoop-cloud-storage/3.1.1.7.2.16.0-77/hadoop-cloud-storage-3.1.1.7.2.16.0-77.jar
226a224,227
> /home/michael/.m2/repository/org/apache/ranger/ranger-raz-hook-abfs/2.1.0.7.2.16.0-77/ranger-raz-hook-abfs-2.1.0.7.2.16.0-77.jar
> /home/michael/.m2/repository/org/apache/ranger/ranger-raz-hook-s3/2.1.0.7.2.16.0-77/ranger-raz-hook-s3-2.1.0.7.2.16.0-77.jar
> /home/michael/.m2/repository/org/apache/ranger/ranger-raz-intg/2.1.0.7.2.16.0-77/ranger-raz-intg-2.1.0.7.2.16.0-77.jar
> /home/michael/.m2/repository/org/apache/ranger/ranger-raz-s3-lib/2.1.0.7.2.16.0-77/ranger-raz-s3-lib-2.1.0.7.2.16.0-77.jar
238,244d238
< /home/michael/.m2/repository/org/codehaus/groovy/groovy/3.0.7/groovy-3.0.7.jar
< 
/home/michael/.m2/repository/org/codehaus/groovy/groovy-console/3.0.7/groovy-console-3.0.7.jar
< 
/home/michael/.m2/repository/org/codehaus/groovy/groovy-groovysh/3.0.7/groovy-groovysh-3.0.7.jar
< 
/home/michael/.m2/repository/org/codehaus/groovy/groovy-json/3.0.7/groovy-json-3.0.7.jar
< 
/home/michael/.m2/repository/org/codehaus/groovy/groovy-swing/3.0.7/groovy-swing-3.0.7.jar
< 
/home/michael/.m2/repository/org/codehaus/groovy/groovy-templates/3.0.7/groovy-templates-3.0.7.jar
< 
/home/michael/.m2/repository/org/codehaus/groovy/groovy-xml/3.0.7/groovy-xml-3.0.7.jar
276d269
< /home/michael/.m2/repository/org/fusesource/jansi/jansi/1.18/jansi-1.18.jar

which doesn't seem too concerning.


--
To view, visit http://gerrit.cloudera.org:8080/18817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a1631289f990513823c2b17eb9241cc1b5a7ffd
Gerrit-Change-Number: 18817
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 05 Aug 2022 22:18:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11458: Update zlib, zstd, snappy

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18816 )

Change subject: IMPALA-11458: Update zlib, zstd, snappy
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11108/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If96fd5e8c49e581478b249bb8894001457742023
Gerrit-Change-Number: 18816
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 05 Aug 2022 22:10:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11458: Update zlib, zstd, snappy

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18816 )

Change subject: IMPALA-11458: Update zlib, zstd, snappy
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8406/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If96fd5e8c49e581478b249bb8894001457742023
Gerrit-Change-Number: 18816
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 05 Aug 2022 22:02:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11458: Update zlib, zstd, snappy

2022-08-05 Thread Michael Smith (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18816

to look at the new patch set (#2).

Change subject: IMPALA-11458: Update zlib, zstd, snappy
..

IMPALA-11458: Update zlib, zstd, snappy

Updates zlib to 1.2.12, zstd to 1.5.2, and snappy to 1.1.9. Requires a
new toolchain build.

The snappy update made some parquet files in the
stats-extrapolation.test slightly larger, leading to slightly different
estimates. I've checked that the data stored is unchanged with
`select * from alltypes` on tables written before and after this change.

Change-Id: If96fd5e8c49e581478b249bb8894001457742023
---
M bin/impala-config.sh
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
2 files changed, 7 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/18816/2
--
To view, visit http://gerrit.cloudera.org:8080/18816
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If96fd5e8c49e581478b249bb8894001457742023
Gerrit-Change-Number: 18816
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] PROTOTYPE: IMPALA-11207: Use hadoop-cloud-storage for Cloud dependencies

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18817 )

Change subject: PROTOTYPE: IMPALA-11207: Use hadoop-cloud-storage for Cloud 
dependencies
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a1631289f990513823c2b17eb9241cc1b5a7ffd
Gerrit-Change-Number: 18817
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 05 Aug 2022 20:59:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6684: Fix untracked memory in KRPC

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18798 )

Change subject: IMPALA-6684: Fix untracked memory in KRPC
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11107/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 5
Gerrit-Owner: Omid Shahidi 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Omid Shahidi 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 05 Aug 2022 20:20:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6684: Fix untracked memory in KRPC

2022-08-05 Thread Omid Shahidi (Code Review)
Omid Shahidi has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/18798 )

Change subject: IMPALA-6684: Fix untracked memory in KRPC
..

IMPALA-6684: Fix untracked memory in KRPC

During serialization of an row batch header, a tuple_data_ is created
which will hold the compressed tuple data for an outbound row batch.
We would like this tuple data to be trackable as it is responsible for
a significant portion of untrackable memory from the krpc data stream
sender. By using free pool, we are able to allocate tuple data and
compression scratch and account for it in the memory tracker of the
KrpcDataStreamSender. This solution creates a RAII class responsible
for memory allocation and changes the existing code to use a char buffer
pointed by a char* tuple_data_ instead of the previously used
std::string tuple_data_. The thrift implementation is left unchanged and
the protobuf implementation is seperated.

Testing:
Passed core tests and ran a single node benchmark which shows no
regression

Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
---
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
4 files changed, 296 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/18798/5
--
To view, visit http://gerrit.cloudera.org:8080/18798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 5
Gerrit-Owner: Omid Shahidi 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Omid Shahidi 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-6684: Fix untracked memory in KRPC

2022-08-05 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18798 )

Change subject: IMPALA-6684: Fix untracked memory in KRPC
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18798/4/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/18798/4/be/src/runtime/row-batch.cc@400
PS4, Line 400: Benchmarks (be/src/benchmarks/row-batch-serialize-benchmark.cc) 
and tests
 : /// (be/src/runtime/row-batch-serialize-test.cc) for 
serialization use TRowBatch and
 : /// Thrift so we need to keep the old implementation so they 
don't fail.
Change benchmark and unit-test code for your new RowBatch::Serialize()



--
To view, visit http://gerrit.cloudera.org:8080/18798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 4
Gerrit-Owner: Omid Shahidi 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Omid Shahidi 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 05 Aug 2022 17:56:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] PROTOTYPE: IMPALA-11207: Use hadoop-cloud-storage for Cloud dependencies

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18817 )

Change subject: PROTOTYPE: IMPALA-11207: Use hadoop-cloud-storage for Cloud 
dependencies
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8405/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a1631289f990513823c2b17eb9241cc1b5a7ffd
Gerrit-Change-Number: 18817
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 05 Aug 2022 16:15:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11408: Fill missing partition columns when INSERT INTO iceberg tbl (col list)

2022-08-05 Thread Anonymous Coward (Code Review)
lipeng...@sensorsdata.cn has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18790 )

Change subject: IMPALA-11408: Fill missing partition columns when INSERT INTO 
iceberg_tbl (col_list)
..


Patch Set 3:

(1 comment)

awesome, thanks for cr!

http://gerrit.cloudera.org:8080/#/c/18790/2/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/18790/2/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@907
PS2, Line 907:   // Add partition key expressions in the order of the 
Iceberg partition fields.
> We can also make things work by only removing this precondition check becau
Done



--
To view, visit http://gerrit.cloudera.org:8080/18790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40c733755d65e5c81a12ffe09b6d16ed5d115368
Gerrit-Change-Number: 18790
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 05 Aug 2022 16:08:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11408: Fill missing partition columns when INSERT INTO iceberg tbl (col list)

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18790 )

Change subject: IMPALA-11408: Fill missing partition columns when INSERT INTO 
iceberg_tbl (col_list)
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11106/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40c733755d65e5c81a12ffe09b6d16ed5d115368
Gerrit-Change-Number: 18790
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 05 Aug 2022 16:06:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11408: Fill missing partition columns when INSERT INTO iceberg tbl (col list)

2022-08-05 Thread Anonymous Coward (Code Review)
lipeng...@sensorsdata.cn has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/18790 )

Change subject: IMPALA-11408: Fill missing partition columns when INSERT INTO 
iceberg_tbl (col_list)
..

IMPALA-11408: Fill missing partition columns when INSERT INTO iceberg_tbl 
(col_list)

In the case of INSERT INTO iceberg_tbl (col_a, col_b, ...), if the
partition columns of Iceberg table are not in the columns permutation,
in order for data to be written to the default
partition '__HIVE_DEFAULT_PARTITION__' we will fill the missing
partition columns with NullLiteral.

Testing:
 - add e2e tests

Change-Id: I40c733755d65e5c81a12ffe09b6d16ed5d115368
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
3 files changed, 165 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/18790/3
--
To view, visit http://gerrit.cloudera.org:8080/18790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40c733755d65e5c81a12ffe09b6d16ed5d115368
Gerrit-Change-Number: 18790
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11436: Change search bind authentication parameters

2022-08-05 Thread Tamas Mate (Code Review)
Tamas Mate has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/18819


Change subject: IMPALA-11436: Change search bind authentication parameters
..

IMPALA-11436: Change search bind authentication parameters

Impala's search bind authentication intends to mimic Spring's behaviour.
However, the login username and user dn paremeters were swapped for
group searches compared to Spring. This change intends to allign these
parameters.

For user search, Spring uses {0} to replace the login username.
Meanwhile, during group search {0} is used to replace the login user dn
and {1} is used to replace the login username.

Testing:
 - Ran LdapSearchBindImpalaShellTest frontend tests

Change-Id: I9808566a348f7c6200b0571fbc05e67f720f2075
---
M be/src/util/ldap-search-bind.cc
M docs/topics/impala_ldap.xml
M 
fe/src/test/java/org/apache/impala/customcluster/LdapSearchBindImpalaShellTest.java
3 files changed, 21 insertions(+), 20 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/18819/1
--
To view, visit http://gerrit.cloudera.org:8080/18819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9808566a348f7c6200b0571fbc05e67f720f2075
Gerrit-Change-Number: 18819
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate 


[Impala-ASF-CR] IMPALA-9482: Support for BINARY columns

2022-08-05 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16066 )

Change subject: IMPALA-9482: Support for BINARY columns
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/runtime/descriptors.h@256
PS14, Line 256: return col_descs_[slot_desc->col_path().back()];
> Hmm, I realized that I have to think nested binaries through more carefully
Checked how Hive handles this, it seems to utf8 encode the nested binary with 
replace:
select named_struct("b", unhex("00112233445566778899AABBCCDDEEFF"));
result: {"b":"3DUfw}

select array(unhex("00112233445566778899AABBCCDDEEFF"));
result: ["3DUfw]

A weird behavior around nested binary is that it is not quoted (unlike string):
select named_struct("s", "a", "b", cast("a" as binary));
{"s":"a","b":a}

I have also checked what happens when we read these back from Impala:
- if we return the whole struct, the non-valid utf8 binary is replaced with an 
empty string
- if we return the whole array, it is returned escaped:
"\000\021\"3DUfw"]
- if the member/item is returned directly, we print it the same way as Hive
- if we hex() the result back, it returns it correctly, so Hive does not write 
it in a lossy way

I am thinking about not allowing nested binary for now, and enable them once 
the struct/array printing functions are unified + I have verified whether not 
quoting them is intentional in Hive.



--
To view, visit http://gerrit.cloudera.org:8080/16066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
Gerrit-Change-Number: 16066
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 05 Aug 2022 11:22:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9482: Support for BINARY columns

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16066 )

Change subject: IMPALA-9482: Support for BINARY columns
..


Patch Set 18:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11105/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
Gerrit-Change-Number: 16066
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 05 Aug 2022 11:04:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9482: Support for BINARY columns

2022-08-05 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16066 )

Change subject: IMPALA-9482: Support for BINARY columns
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/service/hs2-util.cc
File be/src/service/hs2-util.cc:

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/service/hs2-util.cc@872
PS14, Line 872:   AuxColumnType aux_type(columnType);
> Done
Reverted this, as while gcc compiled it, clang had issue I could not solve. 
This is strange to me, as case TYPE_DECIMAL: also has some local variables, but 
I did not want to dive too deeply.



--
To view, visit http://gerrit.cloudera.org:8080/16066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
Gerrit-Change-Number: 16066
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 05 Aug 2022 10:46:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9482: Support for BINARY columns

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16066 )

Change subject: IMPALA-9482: Support for BINARY columns
..


Patch Set 18:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16066/18/tests/hs2/test_fetch.py
File tests/hs2/test_fetch.py:

http://gerrit.cloudera.org:8080/#/c/16066/18/tests/hs2/test_fetch.py@69
PS18, Line 69: "
flake8: E126 continuation line over-indented for hanging indent


http://gerrit.cloudera.org:8080/#/c/16066/18/tests/hs2/test_fetch.py@89
PS18, Line 89: "
flake8: E126 continuation line over-indented for hanging indent


http://gerrit.cloudera.org:8080/#/c/16066/18/tests/hs2/test_fetch.py@100
PS18, Line 100: "
flake8: E126 continuation line over-indented for hanging indent


http://gerrit.cloudera.org:8080/#/c/16066/18/tests/hs2/test_fetch.py@113
PS18, Line 113: "
flake8: E126 continuation line over-indented for hanging indent


http://gerrit.cloudera.org:8080/#/c/16066/18/tests/hs2/test_fetch.py@123
PS18, Line 123: "
flake8: E126 continuation line over-indented for hanging indent


http://gerrit.cloudera.org:8080/#/c/16066/18/tests/query_test/test_udfs.py
File tests/query_test/test_udfs.py:

http://gerrit.cloudera.org:8080/#/c/16066/18/tests/query_test/test_udfs.py@116
PS18, Line 116: y
flake8: E501 line too long (92 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/16066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
Gerrit-Change-Number: 16066
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 05 Aug 2022 10:44:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9482: Support for BINARY columns

2022-08-05 Thread Csaba Ringhofer (Code Review)
Hello Quanlong Huang, Gabor Kaszab, Zoltan Borok-Nagy, Attila Jeges, Tim 
Armstrong, Steve Carlin, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16066

to look at the new patch set (#18).

Change subject: IMPALA-9482: Support for BINARY columns
..

IMPALA-9482: Support for BINARY columns

This patch adds support for BINARY columns for all table formats with
the exception of Kudu.

In Hive the main difference between STRING and BINARY is that STRING is
assumed to be UTF8 encoded, while BINARY can be any byte array.
Some other differences in Hive:
- BINARY can be only cast from/to STRING
- Only a small subset of built-in STRING functions support BINARY.
- In several file formats (e.g. text) BINARY is base64 encoded.
- No NDV is calculated during COMPUTE STATISTICS.

As Impala doesn't treat STRINGs as UTF8, BINARY and STRING become nearly
identical, especially from the backend's perspective. For this reason,
BINARY is implemented a bit differently compared to other types:
while the frontend treats STRING and BINARY as two separate types, most
of the backend uses PrimitiveType::TYPE_STRING for BINARY too, e.g.
in SlotDesc. Only the following parts of backend need to differentiate
between STRING and BINARY:
- table scanners
- table writers
- HS2/Beeswax service
These parts have access to column metadata, which allows to add special
handling for BINARY.

Only a very few builtins are allowed for BINARY at the moment:
- length
- min/max/count
- coalesce and similar "selector" functions
Other STRING functions can be only used by casting to STRING first.
Adding support for more of these functions is very easy, as simply
the BINARY type has to be "connected" to the already existing STRING
function's signature. Functions where the result depends on utf8_mode
need to ensure that with BINARY it always works as if utf8_mode=0 (for
example length() is mapped to bytes() as length count utf8 chars if
utf8_mode=1).

All kinds of UDFs (native, Hive legacy, Hive generic) support BINARY,
though in case of legacy Hive UDFs it is only supported if the argument
and return types are set explicitely to ensure backward compatibility.
See IMPALA-11340 for details.

The original plan was to behave as close to Hive as possible, but I
realized that Hive has more relaxed casting rules than Impala, which
led to STRING<->BINARY casts being necessary in more cases in Impala.
This was needed to disallow passing a BINARY to functions that expect
a STRING argument. An example for the difference is that in
INSERT ... VALUES () string literals need to be explicitly cast to
BINARY, while this is not needed in Hive.

Testing:
- Added functional.binary_tbl for all file formats (except Kudu)
  to test scanning.
- Removed functional.unsupported_types and related tests, as now
  Impala supports all (non-complex) types that Hive does.
- Added FE/EE tests mainly based on the ones added to the DATE type

Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
---
M be/src/exec/file-metadata-utils.cc
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hbase-table-writer.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-table-writer.cc
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M be/src/exec/text-converter.cc
M be/src/exec/text-converter.h
M be/src/exec/text-converter.inline.h
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/types.cc
M be/src/runtime/types.h
M be/src/service/hs2-util.cc
M be/src/service/hs2-util.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/query-result-set.cc
M be/src/testutil/test-udfs.cc
M be/src/util/coding-util.cc
M be/src/util/coding-util.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/PrimitiveType.java
M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java
M fe/src/main/java/org/apache/impala/catalog/ScalarType.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/sr

[Impala-ASF-CR] IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

2022-08-05 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18772 )

Change subject: IMPALA-11401,IMPALA-10794: Add logs and thread names for 
catalogd RPCs
..


Patch Set 6: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java@227
PS5, Line 227: Strin
> I can try to add more. But some RPCs are easy to get their target. E.g. get
Sure, I am ok with the follow up Jira, it could be also a nice ramp-up task for 
someone new to Impala.

If someone touches this area, it could also make sense to create a class 
hierarchy like CatalogJniOp with the basic common skeleton of this functions


http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java@261
PS5, Line 261:   JniUtil.logResponse(res.length, start, params, "exe
> We calculate the duration in JniUtil.logResponse() and add warnings for slo
yeah, I think that this can be often useful, e.g. to get a picture of 
catalog/HMS speed with a simple grep on the log



--
To view, visit http://gerrit.cloudera.org:8080/18772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Gerrit-Change-Number: 18772
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 05 Aug 2022 10:33:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10453: Support file pruning via runtime filters on Iceberg

2022-08-05 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18531 )

Change subject: IMPALA-10453: Support file pruning via runtime filters on 
Iceberg
..


Patch Set 15:

Thank you for the reviews everyone!


--
To view, visit http://gerrit.cloudera.org:8080/18531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7762e1238bdf236b85d2728881a402a2bb41f36a
Gerrit-Change-Number: 18531
Gerrit-PatchSet: 15
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 05 Aug 2022 10:36:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

2022-08-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18772 )

Change subject: IMPALA-11401,IMPALA-10794: Add logs and thread names for 
catalogd RPCs
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/11104/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Gerrit-Change-Number: 18772
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 05 Aug 2022 09:04:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

2022-08-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18772 )

Change subject: IMPALA-11401,IMPALA-10794: Add logs and thread names for 
catalogd RPCs
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2633
PS5, Line 2633:
> We have already created a ThreadNameAnnotator in JniCatalog.ExecDdl, right?
Oops, this is part of a previous patch. I should have reverted them.


http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java@227
PS5, Line 227: Strin
> Is there a reason for creating a ThreadAnnotator in some cases, but not doi
I can try to add more. But some RPCs are easy to get their target. E.g. 
getCatalogDelta() is only called when statestore polls catalog updates. Some 
are dead codes, e.g. getDbs, getTableNames, getFunctions. We can remove them in 
a follow-up JIRA.


http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/service/JniCatalog.java@261
PS5, Line 261:   JniUtil.logResponse(res.length, start, params, "exe
> optional: logging the duration could be useful (or we do this somewhere els
We calculate the duration in JniUtil.logResponse() and add warnings for slow 
response (>60s) there. We can add duration here to ease performance comparison.


http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/util/CatalogOpUtil.java
File fe/src/main/java/org/apache/impala/util/CatalogOpUtil.java:

http://gerrit.cloudera.org:8080/#/c/18772/5/fe/src/main/java/org/apache/impala/util/CatalogOpUtil.java@133
PS5, Line 133: if (req.isSetDb_name()) {
 :   if (req.is_refresh) cmd += "FUNCTIONS IN ";
 :   cmd += "DATABASE " + req.getDb_name();
 : } else if (req.isSetTable_name()) {
> Is this the right order to check these? So if we are refreshing a table, wo
This is consistent with the checks in execResetMetadata(). The name is 
misleading that "Table_name" here is not just the table name but a TTableName 
which consists of the db and table string:
https://github.com/apache/impala/blob/29a26a536c004e4d16fdb56e1bb5a41605ae4e30/common/thrift/CatalogObjects.thrift#L171-L177
https://github.com/apache/impala/blob/29a26a536c004e4d16fdb56e1bb5a41605ae4e30/common/thrift/CatalogService.thrift#L278-L280

If the target is a database, the db_name is set:
https://github.com/apache/impala/blob/29a26a536c004e4d16fdb56e1bb5a41605ae4e30/common/thrift/CatalogService.thrift#L286-L287



--
To view, visit http://gerrit.cloudera.org:8080/18772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Gerrit-Change-Number: 18772
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 05 Aug 2022 08:44:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

2022-08-05 Thread Quanlong Huang (Code Review)
Hello Tamas Mate, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18772

to look at the new patch set (#6).

Change subject: IMPALA-11401,IMPALA-10794: Add logs and thread names for 
catalogd RPCs
..

IMPALA-11401,IMPALA-10794: Add logs and thread names for catalogd RPCs

We've seen catalogd throws OutOfMemoryError when serializing large
responses (i.e. size > 2GB). However, the related table names are
missing in the logs. Admins would like to get the table names and
blacklist those tables until they are optimized (e.g. reducing
partitions).

To improve the supportability, this patch adds logs in the Catalogd RPC
code paths to log some details of the request, also add thread
annotations to improve readability of jstacks.

Tests:
 - Add unit tests for short descriptions of requests.
 - Manually add codes to throw OutOfMemoryError and verify the logs
   shown as expected.
 - Run test_concurrent_ddls.py and metadata tests. Capture jstacks and
   verify the thread annotations are shown.
 - Run CORE tests

Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
---
M fe/src/main/java/org/apache/impala/analysis/ColumnName.java
M fe/src/main/java/org/apache/impala/analysis/FunctionName.java
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableName.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/main/java/org/apache/impala/util/CatalogOpUtil.java
A fe/src/test/java/org/apache/impala/util/CatalogOpUtilTest.java
9 files changed, 513 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/18772/6
--
To view, visit http://gerrit.cloudera.org:8080/18772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iac7f2eda8b95643a3d3c3bef64ea71b67b20595a
Gerrit-Change-Number: 18772
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 


[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

2022-08-05 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18802/5/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

http://gerrit.cloudera.org:8080/#/c/18802/5/be/src/exec/partial-sort-node.cc@98
PS5, Line 98:  
children_[0]->row_descriptor_->tuple_descriptors()[0]->byte_size()
Looked at this after seeing that the tests failed in Kudu inserts (which uses 
partial sort)

This doesn't look right to me, as the child's tuple size can be different than 
the materialized sort tuple size. The child can also have multiple tuples.

We should use row_descriptor_'s tuple size similarly to SortNode.



--
To view, visit http://gerrit.cloudera.org:8080/18802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 5
Gerrit-Owner: Noemi Pap-Takacs 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Comment-Date: Fri, 05 Aug 2022 07:13:44 +
Gerrit-HasComments: Yes