[Impala-ASF-CR] IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17864 )

Change subject: IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9492/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51fea9ff4de767bfda2a4ebcd3744d63440adf14
Gerrit-Change-Number: 17864
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Thu, 23 Sep 2021 04:41:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util

2021-09-22 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/17864 )

Change subject: IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util
..

IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util

We are going to use Kudu's EasyCurl wrapper to download file in Impala.
There are some enhancements for Kudu's EasyCurl in Kudu upstream since
the last sync of kudu/util with Kudu repo. This patch upgraded source
code for Kudu's EasyCurl wrapper (curl_util.cc and curl_util.h in
kudu/util) from the Kudu upstream (commit hash: 37bc427a60, b417479266,
a33b0951d9, 9050941663, e371d80f3e).

Note that curl_util.cc/curl_util.h have dependency on libcurl, which
is not integrated in Impala. curl_util.cc is not built with current
kudu/util/CMakeLists.txt. We will integrate curl_util.cc/curl_util.h
when we integrate libcurl in following patch.

Testing:
 - There is no impact for Impala tests.

Change-Id: I51fea9ff4de767bfda2a4ebcd3744d63440adf14
---
M be/src/kudu/util/curl_util-test.cc
M be/src/kudu/util/curl_util.cc
M be/src/kudu/util/curl_util.h
3 files changed, 155 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17864/2
--
To view, visit http://gerrit.cloudera.org:8080/17864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I51fea9ff4de767bfda2a4ebcd3744d63440adf14
Gerrit-Change-Number: 17864
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7483/


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 23 Sep 2021 03:53:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9857: Batching of consecutive partition events

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17848 )

Change subject: IMPALA-9857: Batching of consecutive partition events
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7482/


--
To view, visit http://gerrit.cloudera.org:8080/17848
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Gerrit-Change-Number: 17848
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 23 Sep 2021 03:07:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-09-22 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 1:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/17858/1/common/thrift/BackendGflags.thrift
File common/thrift/BackendGflags.thrift:

http://gerrit.cloudera.org:8080/#/c/17858/1/common/thrift/BackendGflags.thrift@219
PS1, Line 219:   97: required bool incremental_refresh_acid
nit: rename it to hms_event_incremental_refresh or 
hms_event_incremental_refresh_transactional_table

Also where are we setting the default value of this flag?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2582
PS1, Line 2582:   List tPartSpec, long writeId, boolean 
loadFileMetadata,
I feel it would be cleaner if we don't add writeId and eventId as parameters 
here and instead call this method only if these checks pass in the caller. 
Otherwise the caller of this method would need to know writeId and eventId 
before calling this method which does not seem right. We can add 
loadFileMetadata here though.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/TableWriteId.java
File fe/src/main/java/org/apache/impala/catalog/TableWriteId.java:

http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/TableWriteId.java@29
PS1, Line 29: this.tableName = TableName.parse(tableName);
We are probably expecting tableName to be fullyQualified but not enforcing that 
check here. Instead of TableName, we can store String DbName, String tableName 
separately. .


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@20
PS1, Line 20: import com.amazonaws.services.servicequotas.model.MetricInfo;
nit: remove it if not being used?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@184
PS1, Line 184: switch (metastoreEventType) {
this seems incomplete. Are we missing anything here? We didn't create objs for 
ALLOC_WRITE_ID_EVENT, COMMIT_TXN and ABORT_TXN anywhere in this method.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1458
PS1, Line 1458:   public static class AcidAddPartitionEvent extends 
AddPartitionEvent {
1. Where is AcidAddPartitionEvent getting initialized?
2. Instead of creating a new event i.e AcidAddPartitionEvent, we can modify the 
process() logic in AddPartitionEvent itself based on the incremental refresh 
flag


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1717
PS1, Line 1717:   if (tbl_.getCreateEventId() > eventId_) {
IMO, there is no need to acquire read lock on table to read createEventId since 
it is set only once when creating table in db. Instead, we can make 
createEventId member variable in table as volatile to make sure we are always 
reading most recent value.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1735
PS1, Line 1735:   if (msTbl_ == null) {
can we move the check tbl_.getCreateEventId() > eventId_ here ? As msTbl_ is 
not being used anywhere else,  better to avoid it.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1738
PS1, Line 1738:   List writeIds = txnToWriteIdList_.stream()
txnToWriteIdList_ does not seem to be set anywhere in the event. Where are we 
reading its values from?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1786
PS1, Line 1786: throw new CatalogException("Failed to get write event 
infos", e);
nit: Add txnId in the message?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1789
PS1, Line 1789: commitTxnMessage_.addWriteEventInfo(writeEventInfoList);
Didn't understand the purpose of this.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1793
PS1, Line 1793:   throw new CatalogException("Commit event processing 
error", e);
nit: add more info like txnId etc.



[Impala-ASF-CR] IMPALA-10876: Support to download JWKS from given URL

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17802 )

Change subject: IMPALA-10876: Support to download JWKS from given URL
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9490/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6ac8cf0010c13db30219776d1d275709bf211df
Gerrit-Change-Number: 17802
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 23 Sep 2021 00:45:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17864 )

Change subject: IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9491/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51fea9ff4de767bfda2a4ebcd3744d63440adf14
Gerrit-Change-Number: 17864
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Thu, 23 Sep 2021 00:45:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util

2021-09-22 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17864


Change subject: IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util
..

IMPALA-10928: Upgrade source of Kudu's EasyCurl in kudu/util

We are going to use Kudu's EasyCurl wrapper to download file in Impala.
There are some enhancements for Kudu's EasyCurl in Kudu upstream since
the last sync of kudu/util with Kudu repo. This patch upgraded source
code for Kudu's EasyCurl wrapper (curl_util.cc and curl_util.h in
kudu/util) from the top of Kudu upstream (commit hash: 41ebabf2eb).

Note that curl_util.cc/curl_util.h have dependency on libcurl, which
is not integrated in Impala. curl_util.cc is not built with current
kudu/util/CMakeLists.txt. We will integrate curl_util.cc/curl_util.h
when we integrate libcurl in following patch.

Testing:
 - There is no impact for Impala tests.

Change-Id: I51fea9ff4de767bfda2a4ebcd3744d63440adf14
---
M be/src/kudu/util/curl_util-test.cc
M be/src/kudu/util/curl_util.cc
M be/src/kudu/util/curl_util.h
3 files changed, 155 insertions(+), 52 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17864/1
--
To view, visit http://gerrit.cloudera.org:8080/17864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I51fea9ff4de767bfda2a4ebcd3744d63440adf14
Gerrit-Change-Number: 17864
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-10876: Support to download JWKS from given URL

2021-09-22 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17802 )

Change subject: IMPALA-10876: Support to download JWKS from given URL
..

IMPALA-10876: Support to download JWKS from given URL

This patch added functionality to download JWKS from a given URL and
support key rotation by periodically checking the JWKS URL for updates.

We use Kudu's EasyCurl wrapper to download file from the given URL.
curl was added to native-toolchain. This patch modified makefiles
and bootstrap_toolchain.py to integrate libcurl and libkudu_curl_util.

Added end-end JWT authentication test cases with JWKS specified as
HTTP/HTTPS URL.

Testing:
 - Passed core run, including new test cases.

Change-Id: Ic6ac8cf0010c13db30219776d1d275709bf211df
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/kudu/util/CMakeLists.txt
M be/src/rpc/authentication.cc
M be/src/service/impala-server.cc
M be/src/util/jwt-util-internal.h
M be/src/util/jwt-util-test.cc
M be/src/util/jwt-util.cc
M be/src/util/jwt-util.h
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M bin/rat_exclude_files.txt
A cmake_modules/FindCurl.cmake
M fe/src/test/java/org/apache/impala/customcluster/CustomClusterRunner.java
M fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
M fe/src/test/java/org/apache/impala/customcluster/JwtWebserverTest.java
A testdata/jwt/jwks_es256.json
17 files changed, 570 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/17802/5
--
To view, visit http://gerrit.cloudera.org:8080/17802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic6ac8cf0010c13db30219776d1d275709bf211df
Gerrit-Change-Number: 17802
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-10876: Support to download JWKS from given URL

2021-09-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17802 )

Change subject: IMPALA-10876: Support to download JWKS from given URL
..


Patch Set 4:

(1 comment)

Thanks for the changes, I'm working my way through. I had one initial comment.

http://gerrit.cloudera.org:8080/#/c/17802/4/be/src/kudu/util/curl_util.h
File be/src/kudu/util/curl_util.h:

http://gerrit.cloudera.org:8080/#/c/17802/4/be/src/kudu/util/curl_util.h@17
PS4, Line 17:
: #pragma once
: 
: #include 
: #include 
: #include 
: #include 
:
: #include "kudu/gutil/macros.h"
: #include "kudu/util/monotime.h"
: #include "kudu/util/status.h"
:
: typedef void CURL;
:
: namespace kudu {
:
: class faststring;
:
: enum class CurlAuthType {
:   NONE,
:   BASIC,
:   DIGEST,
:   SPNEGO,
: };
:
: // Simple wrapper around curl's "easy" interface, allowing the 
user to
: // fetch web pages into memory using a blocking API.
: //
: // This is not thread-safe.
: class EasyCurl {
:  public:
:   EasyCurl();
:   ~EasyCurl();
:
:   // Fetch the given URL into the provided buffer.
:   // Any existing data in the buffer is replaced.
:   // The optional param 'headers' holds additional headers.
:   // e.g. {"Accept-Encoding: gzip"}
:   Status FetchURL(const std::string& url,
:   faststring* dst,
:   const std::vector& headers = {});
:
:   // Issue an HTTP POST to the given URL with the given data.
:   // Returns results in 'dst' as above.
:   // The optional param 'headers' holds additional headers.
:   // e.g. {"Accept-Encoding: gzip"}
:   Status PostToURL(const std::string& url,
:const std::string& post_data,
:faststring* dst,
:const std::vector& headers = {});
:
:   // Set whether to verify the server's SSL certificate in the 
case of an HTTPS
:   // connection.
:   void set_verify_peer(bool verify)
Can we split out the change to be/src/kudu into its own commit? That commit can 
list out which Kudu JIRAs are being incorporated.

It will be a bit of logistic annoyance, but that will keep it clear what we are 
doing to be/src/kudu.



--
To view, visit http://gerrit.cloudera.org:8080/17802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic6ac8cf0010c13db30219776d1d275709bf211df
Gerrit-Change-Number: 17802
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 22 Sep 2021 22:22:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7483/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 21:33:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9857: Batching of consecutive partition events

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17848 )

Change subject: IMPALA-9857: Batching of consecutive partition events
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7482/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17848
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Gerrit-Change-Number: 17848
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Sep 2021 20:50:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2581: LIMIT can be propagated down into some aggregations

2021-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17821 )

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
..

IMPALA-2581: LIMIT can be propagated down into some aggregations

This patch contains 2 parts:
1. When both conditions below are true, push down limit to
pre-aggregation
 a) aggregation node has no aggregate function
 b) aggregation node has no predicate
2. finish aggregation when number of unique keys of hash table has
exceeded the limit.

Sample queries:
SELECT DISTINCT f FROM t LIMIT n
Can pass the LIMIT all the way down to the pre-aggregation, which
leads to a nearly unbounded speedup on these queries in large tables
when n is low.

Testing:
Add test targeted-perf/queries/aggregation.test
Pass core test

Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Reviewed-on: http://gerrit.cloudera.org:8080/17821
Reviewed-by: Qifan Chen 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/aggregation-node-base.cc
M be/src/exec/aggregation-node-base.h
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregator.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/non-grouping-aggregator.h
M be/src/exec/streaming-aggregation-node.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/setoperation-rewrite.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
M testdata/workloads/targeted-perf/queries/aggregation.test
19 files changed, 147 insertions(+), 29 deletions(-)

Approvals:
  Qifan Chen: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Gerrit-Change-Number: 17821
Gerrit-PatchSet: 18
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: liuyao 


[Impala-ASF-CR] IMPALA-10922: Fix wrong sarg on ORC Date column in release build

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17861 )

Change subject: IMPALA-10922: Fix wrong sarg on ORC Date column in release build
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieda467c822cf5c6f0edc0ddfe4da61c46b04e51f
Gerrit-Change-Number: 17861
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 22 Sep 2021 20:36:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10922: Fix wrong sarg on ORC Date column in release build

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17861 )

Change subject: IMPALA-10922: Fix wrong sarg on ORC Date column in release build
..

IMPALA-10922: Fix wrong sarg on ORC Date column in release build

One line of code is wrapped with DCHECK in generating ORC Date type
literal for predicate pushdown. Release build will ignore all codes
wrapped in DCHECK so that line of logic is missing. This patch fixes it
by moving the code outside of DCHECK.

Tests:
 - Reproduced the issue and verified the fix in release build.

Change-Id: Ieda467c822cf5c6f0edc0ddfe4da61c46b04e51f
Reviewed-on: http://gerrit.cloudera.org:8080/17861
Reviewed-by: Qifan Chen 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-orc-scanner.cc
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Qifan Chen: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ieda467c822cf5c6f0edc0ddfe4da61c46b04e51f
Gerrit-Change-Number: 17861
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9489/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 20:27:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-09-22 Thread Sourabh Goyal (Code Review)
Hello kis...@cloudera.com, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#4).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when 
performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
29 files changed, 3,403 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/4
--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/9488/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 19:19:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@166
PS3, Line 166: public synchronized static MetastoreEventFactory 
getInstance(CatalogOpExecutor opExecutor) {
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@227
PS3, Line 227: List 
getFilteredEvents(List events, Metrics metrics)
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@295
PS3, Line 295: public static synchronized MetastoreEventFactory 
getInstance(CatalogOpExecutor opExecutor) {
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@245
PS3, Line 245:   metastoreEventFactory_.get(events.get(0), 
metastoreEventsMetrics_).processIfEnabled();
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@509
PS3, Line 509: org.apache.impala.catalog.Table tbl = 
getTableAndAcquireWriteLock(partition.getDbName(),
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@367
PS3, Line 367: HdfsTable currentTbl = (HdfsTable) 
catalog_.getTable(TEST_DB_NAME, tblNameLowerCase);
line too long (97 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@116
PS3, Line 116: 
cs.setEventFactoryForSyncToLatestEvent(EventFactoryForSyncToLatestEvent.getInstance(opExecutor));
line too long (101 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 19:11:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-09-22 Thread Sourabh Goyal (Code Review)
Hello kis...@cloudera.com, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#3).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when 
performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
29 files changed, 3,396 insertions(+), 272 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/3
--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10927 TestFetchAndSpooling.test rows sent counters is flaky in core-s3 based test

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17862 )

Change subject: IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is 
flaky in core-s3 based test
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9487/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Gerrit-Change-Number: 17862
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 18:43:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

2021-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is 
attached
..


Patch Set 4: Code-Review+1

Looks great!


--
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 22 Sep 2021 18:38:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10927 TestFetchAndSpooling.test rows sent counters is flaky in core-s3 based test

2021-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17862 )

Change subject: IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is 
flaky in core-s3 based test
..

IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 
based test

This fix removes the flakiness for test_rows_sent_counters test by
disabling it in S3 testing environment.

Testing:
1. Unit test in a non-S3 environment.

Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
---
M tests/query_test/test_fetch.py
1 file changed, 2 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/17862/4
--
To view, visit http://gerrit.cloudera.org:8080/17862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Gerrit-Change-Number: 17862
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9857: Batching of consecutive partition events

2021-09-22 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17848 )

Change subject: IMPALA-9857: Batching of consecutive partition events
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17848/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17848/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1641
PS2, Line 1641:   String serviceIdOfEvent = 
MetastoreEvents.getStringProperty(parametersFromEvent,
> I spent some time one this yesterday. It definitely looks like a couple of
Makes sense. We can take it up later. Please do file a jira to track this.



--
To view, visit http://gerrit.cloudera.org:8080/17848
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Gerrit-Change-Number: 17848
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Sep 2021 18:20:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is 
attached
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9486/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 22 Sep 2021 18:10:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9857: Batching of consecutive partition events

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17848 )

Change subject: IMPALA-9857: Batching of consecutive partition events
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9485/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17848
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Gerrit-Change-Number: 17848
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Sep 2021 18:02:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

2021-09-22 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is 
attached
..


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG@12
PS3, Line 12: However, BufferedTupleStream::GetNextInternal() will not
: attach the page buffer of a fully read page if it is a read-write 
page.
:
> The attachment that is NOT invoked is at https://github.com/apache/impala/b
I describe the scenario that lead to the assertion failure in patch set 4.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@573
PS3, Line 573: StreamStateTest
> cool class!
Thanks!


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@575
PS3, Line 575: defer
> nit. missing s.
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1450
PS3, Line 1450:
> nit: this can be simplified to be
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1455
PS3, Line 1455:
> Can we assert on this assertion here:
Added the assertion, but rephrasing it a bit.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1484
PS3, Line 1484:   num_pinned_pages = 1;
  :   ASSERT_TRUE(stream.pages_.back().is_pinned());
  : }
> num_pinned_pages = 0 can be moved out of the IF.
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1490
PS3, Line 1490:   }
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@2220
PS3, Line 2220: nAfterFullStream
> Maybe add a comment in the code to explain why these two combos are not tes
Those combos are not tested because, as far as I know, read-only stream can not 
return to writing state after it gets into reading state.
I add an explanation message in the first DCHECK at 
StreamStateTest::TestUnpinAfterFullStreamRead().


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@2213
PS3, Line 2213: }
  :
  : TEST_F(StreamStateTest, 
UnpinFullyExhaustedReadPageOnReadWriteStreamNoAttachRefill) {
  :   TestUnpinAfterFullStreamRead(true, false, true);
  : }
  :
  : TEST_F(StreamStateTest, 
UnpinFullyExhaustedReadPageOnReadWriteStreamNoAttachNoRefill) {
  :   TestUnpinAfterFullStreamRead(true, false, false);
  : }
  :
> nit: I'd suggest spliting these into 6 tests so it'd be easier to locate th
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc@724
PS3, Line 724:   && (num_pages_ <= 2 || 
!read_it_.read_page_->attached_to_output_batch)) {
> nit: we can merge these two if-clauses into one.
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc@730
PS3, Line 730: //reservation if the stream ended up with only 1 
read/write page after
> Could you update the comments to mention the new case?
Done



--
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:57:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

2021-09-22 Thread Riza Suminto (Code Review)
Hello Quanlong Huang, Qifan Chen, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17853

to look at the new patch set (#4).

Change subject: IMPALA-10714: Defer advancing read page until the buffer is 
attached
..

IMPALA-10714: Defer advancing read page until the buffer is attached

If a BufferedTupleStream is a read-write stream and set up with
attach_on_read = true, BufferedTupleStream::NextReadPage() expects that
the page buffer is attached to the output row batch before advancing the
read iterator. However, BufferedTupleStream::GetNextInternal() will not
attach the page buffer of a fully read page if it is a read-write page.

Consider the following scenario:
1. Only 1 page left in stream. This is a read-write page.
2. GetNext() has fully read the page, but does NOT attach the buffer to
   output row batch because it is a read-write page.
3. Stream writer insert more rows, but the read-write page can not fit
   any more rows. Therefore, new pages are created.
4. Stream writer call UnpinStream().
5. UnpinStream() call NextReadPage(), which in turn will fail the
   assertion "read_iter->read_page_->attached_to_output_batch".

BufferedTupleStream::UnpinStream() need to defer advancing the read page
if this situation happens.

This patch adds BE test StreamStateTest.UnpinFullyExhaustedReadPage that
simulates the corner case. This patch also moves BE test
DeferAdvancingReadPage and ShortDebugString into class StreamStateTest
to reduce friend class declaration in buffered-tuple-stream.h

Testing:
- Run and pass BE test StreamStateTest.UnpinFullyExhaustedReadPage.

Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
---
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
3 files changed, 244 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/17853/4
--
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-09-22 Thread Anonymous Coward (Code Review)
kis...@cloudera.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 2:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@223
PS2, Line 223:   public static final long LOCK_RETRY_TIMEOUT_MS = 720;
720 seems to be too large. How did you pick this value ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2142
PS2, Line 2142:   LOG.trace("table {} exits in cache, last synced id {}", 
tbl.getFullName(),
Why do you have many .trace() instead of .debug() ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3654
PS2, Line 3654: tableLoadingMgr_.start();
Since its no longer final, can you check the code to make sure, we are not 
keeping its reference somewhere else ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@51
PS2, Line 51:   private static final Logger LOG = 
LoggerFactory.getLogger(TableLoader.class);
What is the difference between LoggerFactory vs Logger ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@164
PS2, Line 164: // write lock is not required since it is full table 
reload
Can there be multiple threads trying to do full table reload at the same time ? 
Would a write lock be useful in that case ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@159
PS2, Line 159:   private CatalogServiceCatalog catalog_;
Why are you removing "final" for CatalogServiceCatalog and CatalogOpExecutor ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@266
PS2, Line 266:   // TODO: Should we skip creating a new metastore events factory
I think it all depends on whether we can toggle this flag, without restarting 
CatalogD or not.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@835
PS2, Line 835: protected boolean shouldSkipWhenSyncingToLatestEventId() 
throws CatalogException {
Please write the Javadoc for this method.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@966
PS2, Line 966: protected boolean shouldSkipWhenSyncingToLatestEventId() {
May be you can add default interface method which returns false in the abstract 
class ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1163
PS2, Line 1163:   String oldDbName = tableBefore_.getDbName();
Please use the variables directly from tableBefore and tableAfter.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1504
PS2, Line 1504: @Override
You can avoid most of these by having default implementation, which returns 
false.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@357
PS2, Line 357:1. Should we stop after processing drop_table
Please remove these TODOs or update them as per what is needed in future, 
before you commit the changes. Looks like some of them are discussions and not 
strictly TODOs.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@475
PS2, Line 475:   // TODO: should we ignore case?
I think it should become a utility method. Otherwise, everywhere we have to 
keep thinking whether it should be ignore case or not !



[Impala-ASF-CR] IMPALA-9857: Batching of consecutive partition events

2021-09-22 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17848 )

Change subject: IMPALA-9857: Batching of consecutive partition events
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17848/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17848/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1641
PS2, Line 1641:   return versionNumberFromEvent_ == versionNumberOfEvent && 
serviceIdFromEvent_.equals(
> Okay. Let me look into this and see how much additional changes will need t
I spent some time one this yesterday. It definitely looks like a couple of days 
worth of effort. Given that the self-event logic is going to be simplified soon 
with https://gerrit.cloudera.org/#/c/17756/, I am inclined to take this up once 
we have this change merged so that it would become much easier to implement 
this suggestion. Thoughts?



--
To view, visit http://gerrit.cloudera.org:8080/17848
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Gerrit-Change-Number: 17848
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:41:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9857: Batching of consecutive partition events

2021-09-22 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/17848 )

Change subject: IMPALA-9857: Batching of consecutive partition events
..

IMPALA-9857: Batching of consecutive partition events

This patch improves the performance of events processor
by batching together consecutive ALTER_PARTITION or
INSERT events. Currently, without this patch, if
the events stream consists of a lot of consecutive
ALTER_PARTITION events which cannot be skipped,
events processor will refresh partition from each
event one by one. Similarly, in case of INSERT events
in a partition events processor refresh one partition
at a time.

By batching together such consecutive ALTER_PARTITION or
INSERT events, events processor needs to take lock on the table
only once per batch and can refresh all the partitions from
the events using multiple threads. For transactional (acid)
tables, this provides even significant performance gain
since currently we refresh the whole table in case of
ALTER_PARTITION or INSERT partition events. By batching them
together, events processor will refresh the table once per
batch.

The batch of eligible ALTER_PARTITION and INSERT events will
be processed as ALTER_PARTITIONS and INSERT_PARTITIONS event
respectively.

Performance tests:
In order to simulate bunch of ALTER_PARTITION and INSERT
events, a simple test was performed by running the following
query from hive:
insert into store_sales_copy partition(ss_sold_date_sk)
select * from store_sales;

This query generates 1824 ALTER_PARTITION and 1824 INSERT
events and time taken to process all the events generated
was measured before and after the patch for external and
ACID table.

Table Type  Before  After
==
External table  75 sec  25 sec
ACID tables 313 sec 47 sec

Additionally, the patch also fixes a minor bug in
evaluateSelfEvent() method which should return false when
serviceId does not match.

Testing Done:
1. Added new tests which cover the batching logic of events.
2. Exhaustive tests.

Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_events_custom_configs.py
9 files changed, 924 insertions(+), 185 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/17848/7
--
To view, visit http://gerrit.cloudera.org:8080/17848
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Gerrit-Change-Number: 17848
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10927 TestFetchAndSpooling.test rows sent counters is flaky in core-s3 based test

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17862 )

Change subject: IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is 
flaky in core-s3 based test
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9483/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Gerrit-Change-Number: 17862
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:33:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10927 TestFetchAndSpooling.test rows sent counters is flaky in core-s3 based test

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17862 )

Change subject: IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is 
flaky in core-s3 based test
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9484/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Gerrit-Change-Number: 17862
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:33:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2581: LIMIT can be propagated down into some aggregations

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17821 )

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
..


Patch Set 17: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Gerrit-Change-Number: 17821
Gerrit-PatchSet: 17
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: liuyao 
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:17:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10927 TestFetchAndSpooling.test rows sent counters is flaky in core-s3 based test

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17862 )

Change subject: IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is 
flaky in core-s3 based test
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17862/3/tests/query_test/test_fetch.py
File tests/query_test/test_fetch.py:

http://gerrit.cloudera.org:8080/#/c/17862/3/tests/query_test/test_fetch.py@24
PS3, Line 24: from tests.util.filesystem_utils import IS_S3
flake8: F401 'tests.util.filesystem_utils.IS_S3' imported but unused



--
To view, visit http://gerrit.cloudera.org:8080/17862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Gerrit-Change-Number: 17862
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:12:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10927 TestFetchAndSpooling.test rows sent counters is flaky in core-s3 based test

2021-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/17862 )

Change subject: IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is 
flaky in core-s3 based test
..

IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 
based test

This fix removes the flakiness for test_rows_sent_counters test by
disabling it in S3 testing environment.

Testing:
1. Unit test in a non-S3 environment.

Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
---
M tests/query_test/test_fetch.py
1 file changed, 3 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/17862/3
--
To view, visit http://gerrit.cloudera.org:8080/17862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Gerrit-Change-Number: 17862
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10927 TestFetchAndSpooling.test rows sent counters is flaky in core-s3 based test

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17862 )

Change subject: IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is 
flaky in core-s3 based test
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17862/2/tests/query_test/test_fetch.py
File tests/query_test/test_fetch.py:

http://gerrit.cloudera.org:8080/#/c/17862/2/tests/query_test/test_fetch.py@24
PS2, Line 24: from tests.util.filesystem_utils import IS_S3
flake8: F401 'tests.util.filesystem_utils.IS_S3' imported but unused



--
To view, visit http://gerrit.cloudera.org:8080/17862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Gerrit-Change-Number: 17862
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:11:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10927 TestFetchAndSpooling.test rows sent counters is flaky in core-s3 based test

2021-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17862


Change subject: IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is 
flaky in core-s3 based test
..

IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 
based test

This fix removes the flakiness for test_rows_sent_counters test by disabling
it in S3 testing environment.

Testing:
1. Unit test in a non-S3 environment.

Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
---
M tests/query_test/test_fetch.py
1 file changed, 3 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/17862/2
--
To view, visit http://gerrit.cloudera.org:8080/17862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Gerrit-Change-Number: 17862
Gerrit-PatchSet: 2
Gerrit-Owner: Qifan Chen 


[Impala-ASF-CR] IMPALA-10921 Add script to compare TPCDS runs.

2021-09-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17855 )

Change subject: IMPALA-10921 Add script to compare TPCDS runs.
..


Patch Set 1:

(5 comments)

Thanks for adding using useful tools to the repository!

http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py
File bin/diagnostics/experimental/tpcds_run_comparator.py:

http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@57
PS1, Line 57: SECTIONS = [
nit: could you please add some comments for SECTIONS? It's a bit misleading to 
use the next section's first line to mark the end of the current section.


http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@93
PS1, Line 93: byte *= 1024**3
Maybe raise an error for unknown 'unit'.


http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@132
PS1, Line 132: if len(parts) == 7:
Could you add some comments here?


http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@180
PS1, Line 180: <
maybe it would be interesting to see if there was an increase in some cases.


http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@231
PS1, Line 231: "{path} is not a valid path
needs format?



--
To view, visit http://gerrit.cloudera.org:8080/17855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6
Gerrit-Change-Number: 17855
Gerrit-PatchSet: 1
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Sep 2021 16:02:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

2021-09-22 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is 
attached
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG@12
PS3, Line 12: However, BufferedTupleStream::GetNextInternal() will not
: attach the page buffer of a completely read page if it is a 
read-write
: page.
> Isn't this done at https://github.com/apache/impala/blob/35b21083b1866b7056
The attachment that is NOT invoked is at 
https://github.com/apache/impala/blob/35b21083b1866b7056e3810ae5a8daf7bc77ddda/be/src/runtime/buffered-tuple-stream.cc#L888
Notice the condition 'if (!has_read_write_page())' right above it.



--
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 22 Sep 2021 15:42:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

2021-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is 
attached
..


Patch Set 3:

(7 comments)

Looks great! Thanks a lot for the refactoring which makes the new code 
enjoyable to read.

http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc@1388
PS2, Line 1388: eservation to alloc
> This is now addressed by:
Done


http://gerrit.cloudera.org:8080/#/c/17853/2/be/src/runtime/buffered-tuple-stream-test.cc@1397
PS2, Line 1397:
> I believe this is now addressed by TestUnpinAfterFullStreamRead with attach
Done


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@573
PS3, Line 573: StreamStateTest
cool class!


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@575
PS3, Line 575: defer
nit. missing s.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1455
PS3, Line 1455: removed from stream.pages_
Can we assert on this assertion here:

if (!read_write && attach_on_read && stream.pinned()) ASSERT_TRUE(stream.pages_ 
> 0);


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1484
PS3, Line 1484: num_pinned_pages = 0;
  :   } else {
  : num_pinned_pages = 0;
num_pinned_pages = 0 can be moved out of the IF.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@2220
PS3, Line 2220: read-only stream
Maybe add a comment in the code to explain why these two combos are not tested: 
 (false, false, true) and (false, true, true).



--
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 22 Sep 2021 14:22:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10922: Fix wrong sarg on ORC Date column in release build

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17861 )

Change subject: IMPALA-10922: Fix wrong sarg on ORC Date column in release build
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7481/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieda467c822cf5c6f0edc0ddfe4da61c46b04e51f
Gerrit-Change-Number: 17861
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 22 Sep 2021 14:26:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10922: Fix wrong sarg on ORC Date column in release build

2021-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17861 )

Change subject: IMPALA-10922: Fix wrong sarg on ORC Date column in release build
..


Patch Set 1: Code-Review+2

Good catch.


--
To view, visit http://gerrit.cloudera.org:8080/17861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieda467c822cf5c6f0edc0ddfe4da61c46b04e51f
Gerrit-Change-Number: 17861
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 22 Sep 2021 14:25:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9857: Batching of consecutive partition events

2021-09-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17848 )

Change subject: IMPALA-9857: Batching of consecutive partition events
..


Patch Set 6:

(11 comments)

The patch looks great. I mostly found nits.

http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2716
PS6, Line 2716: the
nit: then?


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2733
PS6, Line 2733: value for a given partition key is null
When we build the partition map we only put non-null values into it at L2709.


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2741
PS6, Line 2741: hmsPartsToHdfsParts.get(hmsPartition);
You could iterate through 'hmsPartsToHdfsParts.entrySet()' in the for loop, so 
this lookup wouldn't be needed.


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@284
PS6, Line 284: + "to {}",
nit: Could fit prev line, then L285 could fit this line. Just like at L298.


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@602
PS6, Line 602: ForBatching
nit: maybe just 'getPartition()'? Adding 'ForBatching' makes me think that we 
only batch events of the same partition.


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1792
PS6, Line 1792:
nit: indentation


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1836
PS6, Line 1836: //we
nit: missing space


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1836
PS6, Line 1836: a
nit: an


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1838
PS6, Line 1838: batchedEvents_.get(0)
nit: baseEvent_?


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4140
PS6, Line 4140: );
Shouldn't we still include 'e'?


http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17848/6/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2335
PS6, Line 2335: table name
database?



--
To view, visit http://gerrit.cloudera.org:8080/17848
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Gerrit-Change-Number: 17848
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Sep 2021 14:23:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10714: Defer advancing read page until the buffer is attached

2021-09-22 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17853 )

Change subject: IMPALA-10714: Defer advancing read page until the buffer is 
attached
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17853/3//COMMIT_MSG@12
PS3, Line 12: However, BufferedTupleStream::GetNextInternal() will not
: attach the page buffer of a completely read page if it is a 
read-write
: page.
Isn't this done at 
https://github.com/apache/impala/blob/35b21083b1866b7056e3810ae5a8daf7bc77ddda/be/src/runtime/buffered-tuple-stream.cc#L812
 ? Or do you mean the case when the writer unpins the stream before the reader 
goes to this line?


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc
File be/src/runtime/buffered-tuple-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@1450
PS3, Line 1450: bool attached = !read_write && attach_on_read ? true : 
false;
nit: this can be simplified to be

 bool attached = !read_write && attach_on_read;


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream-test.cc@2213
PS3, Line 2213:   TestUnpinAfterFullStreamRead(num_rows, buffer_size, 
max_num_pages, true, false, true);
  :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, 
max_num_pages, true, false, false);
  :
  :   // Test read-write stream with attach_on_read.
  :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, 
max_num_pages, true, true, true);
  :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, 
max_num_pages, true, true, false);
  :
  :   // Test read-only stream both with and without attach_on_read.
  :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, 
max_num_pages, false, false, false);
  :   TestUnpinAfterFullStreamRead(num_rows, buffer_size, 
max_num_pages, false, true, false);
nit: I'd suggest spliting these into 6 tests so it'd be easier to locate the 
failed case if any of them fail.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc@724
PS3, Line 724: if (num_pages_ <= 2 || 
!read_it_.read_page_->attached_to_output_batch) {
nit: we can merge these two if-clauses into one.


http://gerrit.cloudera.org:8080/#/c/17853/3/be/src/runtime/buffered-tuple-stream.cc@730
PS3, Line 730:   // (see IMPALA-10584).
Could you update the comments to mention the new case?



--
To view, visit http://gerrit.cloudera.org:8080/17853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I586ed72ba01cc3f28b0dcb1e202b3ca32a6c3b83
Gerrit-Change-Number: 17853
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 22 Sep 2021 14:06:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2581: LIMIT can be propagated down into some aggregations

2021-09-22 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17821 )

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
..


Patch Set 17: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17821/14/testdata/workloads/functional-query/queries/QueryTest/spilling.test
File testdata/workloads/functional-query/queries/QueryTest/spilling.test:

http://gerrit.cloudera.org:8080/#/c/17821/14/testdata/workloads/functional-query/queries/QueryTest/spilling.test@450
PS14, Line 450:
> In some cases, FastLimitCheckExceededRows is 0,
Yeah, the new code ignores the spilled partitions. Done.



--
To view, visit http://gerrit.cloudera.org:8080/17821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Gerrit-Change-Number: 17821
Gerrit-PatchSet: 17
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: liuyao 
Gerrit-Comment-Date: Wed, 22 Sep 2021 13:26:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2581: LIMIT can be propagated down into some aggregations

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17821 )

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
..


Patch Set 17:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7480/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Gerrit-Change-Number: 17821
Gerrit-PatchSet: 17
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: liuyao 
Gerrit-Comment-Date: Wed, 22 Sep 2021 11:01:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2581: LIMIT can be propagated down into some aggregations

2021-09-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17821 )

Change subject: IMPALA-2581: LIMIT can be propagated down into some aggregations
..


Patch Set 17: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7479/


--
To view, visit http://gerrit.cloudera.org:8080/17821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Gerrit-Change-Number: 17821
Gerrit-PatchSet: 17
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: liuyao 
Gerrit-Comment-Date: Wed, 22 Sep 2021 10:21:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10922: Fix wrong sarg on ORC Date column in release build

2021-09-22 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17861 )

Change subject: IMPALA-10922: Fix wrong sarg on ORC Date column in release build
..


Patch Set 1:

Verified the RELEASE build: https://jenkins.impala.io/job/pre-review-test/1139/


--
To view, visit http://gerrit.cloudera.org:8080/17861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieda467c822cf5c6f0edc0ddfe4da61c46b04e51f
Gerrit-Change-Number: 17861
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 22 Sep 2021 08:46:23 +
Gerrit-HasComments: No