[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9661/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 10
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 26 Oct 2021 06:55:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7563/


--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 26 Oct 2021 06:38:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-10-25 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..

IMPALA-10923: Fine grained table refreshing at partition level events
for transactional tables

To enable fine-grained table refreshing, there are three main changes
in this commit.
1. Maintain validWriteIdList in Catalogd for transactional tables. We
  will keep track of write id changes for partitioned tables by
  AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
2. Conduct partition-level refreshing for transactional tables on
  addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
3. Introduce a config
  hms_event_incremental_refresh_transactional_table, which can switch
  on/off the fine-grained table refreshing.
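
As a rough illustration of item 1, the write id bookkeeping could look
something like the sketch below. This is not the code in this patch: the
class name TableWriteIds and its methods are hypothetical, and the real
change keeps a validWriteIdList in Catalogd (Java). It only shows how
alloc, commit, and abort events would move a write id between states.

```python
class TableWriteIds:
    """Hypothetical per-table tracker of write id states (illustration only)."""

    def __init__(self):
        self.open_ids = set()       # allocated, not yet committed or aborted
        self.committed_ids = set()
        self.aborted_ids = set()

    def on_alloc(self, write_id):
        # An alloc-write-id event opens a new write id for the table.
        self.open_ids.add(write_id)

    def on_commit(self, write_id):
        # A commit event moves an open write id to the committed set.
        if write_id not in self.open_ids:
            return False
        self.open_ids.discard(write_id)
        self.committed_ids.add(write_id)
        return True

    def on_abort(self, write_id):
        # An abort event moves an open write id to the aborted set.
        if write_id not in self.open_ids:
            return False
        self.open_ids.discard(write_id)
        self.aborted_ids.add(write_id)
        return True

    def high_watermark(self):
        # Highest write id seen so far, or 0 if none was allocated.
        seen = self.open_ids | self.committed_ids | self.aborted_ids
        return max(seen) if seen else 0
```

With state like this per table, an event handler can tell which write
ids are valid for a partition without reloading the whole table.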

Performance Tests:
A simple test ran an insert into one partition of a partitioned ACID
table (50,000 partitions). Below is the time taken to refresh this
table in response to the event.

Storage  Before   After
==========================
S3       50 secs  50 msecs
local    3 secs   3 msecs

Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableWriteId.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java
M fe/src/main/java/org/apache/impala/hive/common/MutableValidWriteIdList.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A fe/src/test/java/org/apache/impala/catalog/CatalogTableWriteIdTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
17 files changed, 956 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17858/10
--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 10
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7565/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 10
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 26 Oct 2021 06:34:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7564/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 5
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 26 Oct 2021 02:19:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 5
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 26 Oct 2021 02:19:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 4: Code-Review+2

Thanks for the patch!


--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 4
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 26 Oct 2021 02:18:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9660/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 4
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 26 Oct 2021 02:12:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add TODO comment for future enhancement.

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17967 )

Change subject: Add TODO comment for future enhancement.
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9659/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iada0f494baf680c3b33ee122552f0d49608feb67
Gerrit-Change-Number: 17967
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 26 Oct 2021 01:51:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 4:

(6 comments)

Thanks for review!

http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG@10
PS3, Line 10:
> nit: we use 72 chars width lines in commit messages
Done


http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG@11
PS3, Line 11:  r
> nit: are
Done


http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java:

http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@26
PS3, Line 26: simplifi
> nit: simplifies
Done


http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@42
PS3, Line 42: lengths are the
> nit: lengths are the same
Done


http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@43
PS3, Line 43: precisions and scales are the
> nit: precisions and scales are the same
Done


http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@50
PS3, Line 50:
> nit: need one more space
Done



--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 4
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 26 Oct 2021 01:50:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread wangsheng (Code Review)
Hello Quanlong Huang, Xianqing He, Zoltan Borok-Nagy, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17933

to look at the new patch set (#4).

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..

IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some 
situations

This patch adds a new expr rewrite rule that simplifies a cast expr
when the cast target data type is the same as the inner expr data type.
The unnecessary cast expr is removed if any rule matches. This kind of
rewrite improves query performance when casting a non-partition column,
especially when scanning lots of data. Besides, a cast expr in a where
clause cannot be pushed down to the Kudu server; if the unnecessary
cast expr is removed, Impala pushes the predicate down to the Kudu
server, which saves a lot of time, IO, and memory.
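
The idea of the rewrite can be sketched on a toy expression tree. The
sketch below is an assumption-laden illustration, not the
SimplifyCastExprRule code: Column, Cast, expr_type, and simplify_cast
are hypothetical names, and types are plain strings rather than Impala
types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Toy column reference with a declared type (hypothetical)."""
    name: str
    type: str


@dataclass(frozen=True)
class Cast:
    """Toy cast node wrapping a child expression (hypothetical)."""
    child: object
    target_type: str


def expr_type(expr):
    # A cast yields its target type; a column yields its declared type.
    return expr.target_type if isinstance(expr, Cast) else expr.type


def simplify_cast(expr):
    """Drop a cast whose target type equals the type of its child."""
    if not isinstance(expr, Cast):
        return expr
    child = simplify_cast(expr.child)
    if expr_type(child) == expr.target_type:
        return child  # the cast is a no-op, so return the child directly
    return Cast(child, expr.target_type)
```

For example, simplify_cast(Cast(Column("id", "INT"), "INT")) returns the
bare column, while a cast from INT to BIGINT is kept as is.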

Testing:
- Added unit test cases in `ExprRewriteRulesTest`

Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
3 files changed, 139 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/17933/4
--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 4
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 22: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7562/


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 26 Oct 2021 01:40:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.

2021-10-25 Thread Anonymous Coward (Code Review)
weic...@apache.org has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17963 )

Change subject: IMPALA-10212. Support ofs scheme.
..


Patch Set 2:

good idea. patch updated.


--
To view, visit http://gerrit.cloudera.org:8080/17963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba
Gerrit-Change-Number: 17963
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 26 Oct 2021 01:30:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add TODO comment for future enhancement.

2021-10-25 Thread Anonymous Coward (Code Review)
weic...@apache.org has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17967 )


Change subject: Add TODO comment for future enhancement.
..

Add TODO comment for future enhancement.

Change-Id: Iada0f494baf680c3b33ee122552f0d49608feb67
---
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
1 file changed, 3 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/17967/1
--
To view, visit http://gerrit.cloudera.org:8080/17967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iada0f494baf680c3b33ee122552f0d49608feb67
Gerrit-Change-Number: 17967
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7563/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 26 Oct 2021 00:27:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7561/


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 9
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 26 Oct 2021 00:27:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 3: Code-Review+1

(1 comment)

Thanks, Sheng, for the changes! I'll bump to +2 after the comments are resolved.

http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java:

http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@50
PS3, Line 50:
nit: need one more space



--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 3
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 26 Oct 2021 00:10:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

2021-10-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
..


Patch Set 6:

(3 comments)

Thanks for adding the tests and query option.

This is looking good, I only have a couple small comments.

http://gerrit.cloudera.org:8080/#/c/17955/6/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17955/6/be/src/service/client-request-state.cc@769
PS6, Line 769:   DebugActionNoFail(
             :       exec_request_->query_options, "CRS_DELAY_BEFORE_LOAD_DATA");
Nit: My only thought here is that I do like it when these statements are right 
next to the statement that we are simulating the delay about (in this case 
frontend_->LoadData()).


http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py@111
PS6, Line 111: class TestAsyncLoadData(TestLoadData):
One thing about subclassing TestLoadData is that TestAsyncLoadData will get its 
own copy of test_load() from TestLoadData. When those copies execute in 
parallel, things can go wrong.

One way out is to create a TestLoadDataBase that contains the pieces you need 
to share, and then subclass for both TestLoadData and TestAsyncLoadData.
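
A minimal sketch of that layout, assuming the shared pieces are plain
helper methods. The helper staging_path and the test bodies are
placeholders, not the contents of test_load.py. The point is that the
base class defines no test_* methods, so neither subclass inherits a
copy of the other's tests.

```python
class TestLoadDataBase:
    """Shared helpers only; deliberately defines no test_* methods."""

    def staging_path(self):
        # Hypothetical helper: a per-class staging location.
        return "/tmp/" + type(self).__name__.lower()


class TestLoadData(TestLoadDataBase):
    def test_load(self):
        # Placeholder body standing in for the existing load test.
        assert self.staging_path().endswith("testloaddata")


class TestAsyncLoadData(TestLoadDataBase):
    def test_async_load(self):
        # Placeholder body standing in for the new async test.
        assert self.staging_path().endswith("testasyncloaddata")
```

With this structure, test_load() is collected once, under TestLoadData
only.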


http://gerrit.cloudera.org:8080/#/c/17955/6/tests/metadata/test_load.py@122
PS6, Line 122:   @pytest.mark.execute_serially   # To avoid file copy failure: dst file does not exist
Nice to have: When possible, we want to structure tests to allow them to 
execute in parallel. I ran test_async_load locally in its 6 variations, and it 
took about 6 minutes. I think a decent chunk of that was setup/teardown and not 
the test itself.

In this case, it would involve replacing STAGING_PATH with something under the 
unique_database directory (and populating it with some files, etc). 
Unfortunately, unique_database doesn't really work with 
setup_method/teardown_method, so it would need some rework of populating the 
directory.



--
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 25 Oct 2021 21:51:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9658/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 25 Oct 2021 20:10:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions

2021-10-25 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17960 )

Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions
..


Patch Set 3:

(2 comments)

Looks great!

http://gerrit.cloudera.org:8080/#/c/17960/3/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17960/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@678
PS3, Line 678: if (!IsDataInDataFile(idx)) continue;
nit: This call can probably wait until after minmax_filter is obtained at line
680, at which point we can directly call

minmax_filter->IsDataInDataFile(GetScanNodeId()).


http://gerrit.cloudera.org:8080/#/c/17960/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/17960/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@238
PS3, Line 238: isDataInDataFile
> nit: this was a bit ambiguous for me and had to read the comment of the isD
+1. Sounds like a good idea.



--
To view, visit http://gerrit.cloudera.org:8080/17960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
Gerrit-Change-Number: 17960
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:58:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9657/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:59:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

2021-10-25 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
..


Patch Set 6:

Fix format error in test_load.py.


--
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:50:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

2021-10-25 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
..

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses an Impala client hang caused by the AWS network
load balancer timeout, which is fixed at 350s. When a long data loading
operation is executing and the timeout expires, AWS silently drops the
connection and the Impala client hangs.

The fix keeps the current TCLIService protocol between the client and
the Impala server and uses a separate thread to run the data loading
and metadata refresh operation. Because a wait thread that runs
asynchronously waits for this thread, executing the operation does not
block the Impala client. The Impala client can check the status of the
operation via repeated GetOperationStatus() calls.
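
The general pattern (run the long operation on its own thread and let
the caller poll for status) can be sketched as follows. This is a
generic illustration, not the backend code in this patch: the
AsyncOperation class, the state strings, and wait_for are hypothetical
stand-ins for the server-side thread and the client's
GetOperationStatus() loop.

```python
import threading
import time


class AsyncOperation:
    """Runs `work` on a background thread and exposes a pollable status."""

    def __init__(self, work):
        self._state = "RUNNING"
        self._lock = threading.Lock()
        self._thread = threading.Thread(
            target=self._run, args=(work,), daemon=True)
        self._thread.start()

    def _run(self, work):
        try:
            work()
            new_state = "FINISHED"
        except Exception:
            new_state = "ERROR"
        with self._lock:
            self._state = new_state

    def get_status(self):
        # Returns immediately, so a caller never blocks on the operation.
        with self._lock:
            return self._state


def wait_for(op, poll_interval=0.01, timeout=5.0):
    """Poll the operation with short calls until it leaves RUNNING."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = op.get_status()
        if status != "RUNNING":
            return status
        time.sleep(poll_interval)
    return op.get_status()
```

Each poll is a short request and response, so no single call stays open
for the duration of the operation.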

External behavior change:
  1. A new query option 'enable_async_load_data_execution', default to
 true, is added. It can be set to false to turn off the patch.

Testing:
  1. Added a new test in test_load.py to verify that the asynchronous
 execution in BE keeps the session live;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/metadata/test_load.py
7 files changed, 170 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/6
--
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

2021-10-25 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
..


Patch Set 5:

Added the logic to disable the feature, and a new state/timing test in 
test_load.py.


--
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:38:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

2021-10-25 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
..

IMPALA-10967 Load data should handle AWS NLB-type timeout

This patch addresses an Impala client hang caused by the AWS network
load balancer timeout, which is fixed at 350s. When a long data loading
operation is still executing when the timeout expires, AWS silently
drops the connection and the Impala client hangs.

The fix maintains the current TCLIService protocol between the client
and Impala server and utilizes a separate thread to run the data loading
and metadata refresh operation. Because an asynchronous wait thread
waits on this thread, the operation as a whole does not block the
Impala client. The Impala client can check the status of the operation
via repeated GetOperationStatus() calls.
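
The client-side polling described above could look roughly like the
following sketch. It is not code from the patch: get_status is a
hypothetical callable standing in for a GetOperationStatus() RPC, and
the state strings and timeout values are illustrative assumptions.

```python
import time
from typing import Callable

# Assumed names for terminal operation states; illustrative only.
TERMINAL_STATES = {"FINISHED_STATE", "ERROR_STATE", "CANCELED_STATE", "CLOSED_STATE"}


def wait_for_operation(
    get_status: Callable[[], str],
    poll_interval_s: float = 1.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll a status callable until the operation reaches a terminal state.

    Each poll is a short request/response, so no single call has to stay
    open for the whole duration of the load.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_status()  # hypothetical stand-in for GetOperationStatus()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError("operation did not finish within %ss" % timeout_s)
        sleep(poll_interval_s)
```

The poll interval only needs to be well below the 350s load balancer
timeout for the connection to keep seeing traffic.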

External behavior change:
  1. A new query option, 'enable_async_load_data_execution', is added.
     It defaults to true and can be set to false to turn off the patch.

Testing:
  1. Added a new test in test_load.py to verify that the asynchronous
 execution in BE keeps the session live;
  2. Ran core tests successfully.

Change-Id: I8c2437e9894510204303ec07710cad60102c8821
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M tests/metadata/test_load.py
7 files changed, 170 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17955/5
--
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] IMPALA-10967 Load data should handle AWS NLB-type timeout

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17955 )

Change subject: IMPALA-10967 Load data should handle AWS NLB-type timeout
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py
File tests/metadata/test_load.py:

http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py@20
PS5, Line 20: import sys
flake8: F401 'sys' imported but unused


http://gerrit.cloudera.org:8080/#/c/17955/5/tests/metadata/test_load.py@110
PS5, Line 110: @SkipIfLocal.hdfs_client
flake8: E302 expected 2 blank lines, found 1
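
For reference, a layout that would satisfy both flake8 checks above
might look like the sketch below. The decorator and test names are
hypothetical stand-ins, not the contents of test_load.py: every import
is used (F401), and two blank lines precede each top-level definition,
including decorated ones (E302).

```python
import functools  # used below, so F401 does not fire


def skip_if_local(func):
    """Hypothetical stand-in for a marker such as SkipIfLocal.hdfs_client."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@skip_if_local
def test_load_example():
    # Two blank lines separate this decorated function from the one above.
    return "ok"
```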



--
To view, visit http://gerrit.cloudera.org:8080/17955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c2437e9894510204303ec07710cad60102c8821
Gerrit-Change-Number: 17955
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:38:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 22:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7562/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:37:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:37:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9656/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:35:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9655/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:33:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5877
PS3, Line 5877:   updatedThriftTable = catalog_.reloadTable(tbl, 
req, resultType, cmdString, -1);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402
PS3, Line 2402:   batchEvents = eventFactory.createBatchEvents(mockEvents, 
eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17964/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS3, Line 85: private static boolean flagEnableCatalogCache 
,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:14:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7561/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 9
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:14:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 21: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:14:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Sourabh Goyal (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17964

to look at the new patch set (#3).

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..

[DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS
event id when performing DDL operations via catalog HMS endpoints

Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,404 insertions(+), 290 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17964/3
--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 22:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@63
PS22, Line 63: // import 
org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent;
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2412
PS22, Line 2412:   batchEvents = eventFactory.createBatchEvents(mockEvents, 
eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS22, Line 85: private static boolean flagEnableCatalogCache 
,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@28
PS22, Line 28: //import 
org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent;
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:11:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Sourabh Goyal (Code Review)
Hello Vihang Karajgaonkar, kis...@cloudera.com, Yu-Wen Lai, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#22).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when
performing DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,398 insertions(+), 290 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/22
--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9654/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 9
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 17:49:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.

2021-10-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17963 )

Change subject: IMPALA-10212. Support ofs scheme.
..


Patch Set 2: Code-Review+1

(1 comment)

This looks good to me. Thanks for putting this together. I had one minor nit.

http://gerrit.cloudera.org:8080/#/c/17963/2/fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
File fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java:

http://gerrit.cloudera.org:8080/#/c/17963/2/fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java@121
PS2, Line 121: // 
testIsSupportStorageIds(mockLocation(FileSystemUtil.SCHEME_O3FS), true);
Nit: There are a few of these commented out O3FS tests that we would like to 
enable later. Can you add an OFS equivalent for each one (still commented out)?



--
To view, visit http://gerrit.cloudera.org:8080/17963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba
Gerrit-Change-Number: 17963
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 25 Oct 2021 17:37:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-10-25 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..

IMPALA-10923: Fine grained table refreshing at partition level events
for transactional tables

To enable fine-grained table refreshing, there are three main changes
in this commit.
1. Maintain validWriteIdList in Catalogd for transactional tables. We
  will keep track of write id changes for partitioned tables by
  AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
2. Conduct partition level refreshing for transactional tables on
  addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
3. Introduce a config
  hms_event_incremental_refresh_transactional_table, which can switch
  on/off the fine-grained table refreshing.
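
As a rough illustration of the first change, the bookkeeping driven by
the three transaction events could be modeled as in the sketch below.
This is a simplified, hypothetical model and not the patch's
MutableValidWriteIdList; the class and method names are assumptions.

```python
class WriteIdTracker:
    """Toy model of a per-table write id list driven by HMS events."""

    def __init__(self):
        self.high_watermark = 0  # highest write id allocated so far
        self.open_ids = set()    # allocated but not yet committed or aborted
        self.aborted_ids = set()

    def on_alloc_write_id(self, write_id):
        # An allocation event: the write id exists but is not readable yet.
        self.open_ids.add(write_id)
        self.high_watermark = max(self.high_watermark, write_id)

    def on_commit_txn(self, write_id):
        # A commit event: the write id becomes visible to readers.
        self.open_ids.discard(write_id)

    def on_abort_txn(self, write_id):
        # An abort event: the write id must never become visible.
        self.open_ids.discard(write_id)
        self.aborted_ids.add(write_id)

    def is_valid(self, write_id):
        # Readable only if allocated, committed, and not aborted.
        return (
            write_id <= self.high_watermark
            and write_id not in self.open_ids
            and write_id not in self.aborted_ids
        )
```

With such a list kept current in Catalogd, a partition-level event can
be checked against it without reloading the whole table.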

Performance Tests:
A simple test was performed by running an insert into one partition of
partitioned ACID tables (50,000 partitions). Below are the times taken
to refresh the table via the event.

Storage  Before   After
==========================
S3       50 secs  50 msecs
local    3 secs   3 msecs

Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableWriteId.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java
M fe/src/main/java/org/apache/impala/hive/common/MutableValidWriteIdList.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A fe/src/test/java/org/apache/impala/catalog/CatalogTableWriteIdTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
17 files changed, 938 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17858/9
--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 9
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions

2021-10-25 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17960 )

Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions
..


Patch Set 3:

(2 comments)

Hi Zoltan,
Added a readability nit and a test comment, apart from these LGTM.

http://gerrit.cloudera.org:8080/#/c/17960/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/17960/3/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@238
PS3, Line 238: isDataInDataFile
nit: this was a bit ambiguous for me, and I had to read the comment of the 
isDataInDataFile method to understand it. What do you think about using 
something like isPartitionColumnValuesInDataFile, isPartitionValuesInDataFile, 
or isPartColValInDataFile?


http://gerrit.cloudera.org:8080/#/c/17960/3/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test:

http://gerrit.cloudera.org:8080/#/c/17960/3/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@429
PS3, Line 429: select * from functional_parquet.iceberg_partitioned i1,
Missing
 > SET RUNTIME_FILTER_WAIT_TIME_MS=$RUNTIME_FILTER_WAIT_TIME_MS;



--
To view, visit http://gerrit.cloudera.org:8080/17960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
Gerrit-Change-Number: 17960
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 16:02:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17960 )

Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9653/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
Gerrit-Change-Number: 17960
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 15:59:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions

2021-10-25 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17960 )

Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions
..


Patch Set 1:

(1 comment)

Thanks for the comments.

http://gerrit.cloudera.org:8080/#/c/17960/1/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17960/1/be/src/exec/parquet/hdfs-parquet-scanner.cc@1323
PS1, Line 1323: if (scan_node_->hdfs_table()->IsIcebergTable()) return false;
> I got it. Thanks for the explanation.
Added is_data_in_file to TRuntimeFilterTargetDesc.



--
To view, visit http://gerrit.cloudera.org:8080/17960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
Gerrit-Change-Number: 17960
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 15:39:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10777: Enable min/max filtering for Iceberg partitions

2021-10-25 Thread Zoltan Borok-Nagy (Code Review)
Hello Tamas Mate, Qifan Chen, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17960

to look at the new patch set (#3).

Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions
..

IMPALA-10777: Enable min/max filtering for Iceberg partitions

This patch enables min/max filters for Iceberg columns that
participate in table partitioning. The min/max filters are
evaluated at the Parquet row group level. This means that filtering
is still slower than dynamic partition pruning (which doesn't even
need to open the files), but much faster than no pruning at all.
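
As an illustration of row-group-level min/max filtering in general, here is a minimal sketch. It is not Impala's implementation: the names RowGroupStats, overlaps and prune_row_groups are hypothetical, and the column is assumed to hold integers. A row group is kept only if its [min, max] range for the column overlaps the range carried by the filter.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RowGroupStats:
    """Hypothetical per-row-group column statistics (not an Impala type)."""
    min_value: int
    max_value: int


def overlaps(stats: RowGroupStats, filter_min: int, filter_max: int) -> bool:
    # Two closed ranges overlap when each one starts before the other ends.
    return stats.max_value >= filter_min and stats.min_value <= filter_max


def prune_row_groups(row_groups: List[RowGroupStats],
                     filter_min: int, filter_max: int) -> List[int]:
    # Return the indexes of row groups that may contain matching rows.
    return [i for i, rg in enumerate(row_groups)
            if overlaps(rg, filter_min, filter_max)]


row_groups = [RowGroupStats(1, 100), RowGroupStats(101, 200),
              RowGroupStats(201, 300)]
kept = prune_row_groups(row_groups, 150, 220)
print(kept)  # [1, 2]
```

The file still has to be opened to read these statistics, which is consistent with the note above that this is slower than dynamic partition pruning.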

Performance

I used the following query to measure perf on a scale 10 TPC-DS
dataset:

 select i_item_id, sum(ss_ext_sales_price) total_sales
 from
   store_sales,
   date_dim,
   customer_address,
   item
 where i_item_id in (select
     i_item_id
   from item
   where i_color in ('orchid','chiffon','lace'))
   and ss_item_sk = i_item_sk
   and ss_sold_date_sk = d_date_sk
   and d_year = 2000
   and d_moy = 1
   and ss_addr_sk = ca_address_sk
   and ca_gmt_offset = -8

The above query took the following times to execute:

Regular Parquet table: 1.16s
Iceberg table without min/max filters: 4.39s
Iceberg table with min/max filters: 1.77s
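
For reference, the ratios implied by the three timings above can be worked out as follows. This is only arithmetic on the reported numbers; the variable names are made up for the example.

```python
# Timings reported above, in seconds.
regular_parquet_s = 1.16
iceberg_no_filter_s = 4.39
iceberg_with_filter_s = 1.77

# Speedup from enabling min/max filters on the Iceberg table.
speedup = iceberg_no_filter_s / iceberg_with_filter_s
# Remaining slowdown relative to the regular Parquet table.
remaining_gap = iceberg_with_filter_s / regular_parquet_s

print(f"{speedup:.2f}x faster, {remaining_gap:.2f}x of regular Parquet")
# 2.48x faster, 1.53x of regular Parquet
```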

Testing:
 * added e2e test
 * planner test could not be added because Iceberg tables behave
   differently during planner tests (due to some hacks that need
   refactoring)

Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/runtime/runtime-filter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FeTable.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
8 files changed, 80 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17960/3
--
To view, visit http://gerrit.cloudera.org:8080/17960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
Gerrit-Change-Number: 17960
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 3: Code-Review+1

(5 comments)

Just found some nits/grammatical errors, otherwise LGTM!

http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG@10
PS3, Line 10: e
nit: we use 72 chars width lines in commit messages


http://gerrit.cloudera.org:8080/#/c/17933/3//COMMIT_MSG@11
PS3, Line 11: is
nit: are


http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java:

http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@26
PS3, Line 26: simplify
nit: simplifies


http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@42
PS3, Line 42: length are same.
nit: lengths are the same


http://gerrit.cloudera.org:8080/#/c/17933/3/fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java@43
PS3, Line 43: precision and scale are same.
nit: precisions and scales are the same



--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 3
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 25 Oct 2021 14:01:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9652/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 13:41:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Sourabh Goyal (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17964

to look at the new patch set (#2).

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..

[DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event 
id when performing
DDL operations via catalog HMS endpoints

Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,403 insertions(+), 289 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17964/2
--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7560/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 13:20:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5877
PS2, Line 5877:   updatedThriftTable = catalog_.reloadTable(tbl, 
req, resultType, cmdString, -1);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402
PS2, Line 2402:   batchEvents = eventFactory.createBatchEvents(mockEvents, 
eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17964/2/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS2, Line 85: private static boolean flagEnableCatalogCache 
,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 13:20:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9651/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 3
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 25 Oct 2021 13:15:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..


Patch Set 3:

(9 comments)

Thanks for the careful review, Quanlong!

http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@7
PS2, Line 7: SimplifyCastExprRule
> Can we rename it to 'SimplifyCastExprRule'? We already have some similar na
Done


http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@9
PS2, Line 9: add
> nit: adds
Done


http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@11
PS2, Line 11: rules i
> nit: is the same
Done


http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@12
PS2, Line 12: ing a n
> nit: is the same
Done


http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@14
PS2, Line 14: move unnecessar
> nit: if any rules is matched
Done


http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@15
PS2, Line 15:  and
> nit: casting
Done


http://gerrit.cloudera.org:8080/#/c/17933/2//COMMIT_MSG@16
PS2, Line 16: time and IO/memmory.
> nit: , especially when scanning lots of data.
Done


http://gerrit.cloudera.org:8080/#/c/17933/2/fe/src/main/java/org/apache/impala/rewrite/CastExprSimplifyRule.java
File fe/src/main/java/org/apache/impala/rewrite/CastExprSimplifyRule.java:

http://gerrit.cloudera.org:8080/#/c/17933/2/fe/src/main/java/org/apache/impala/rewrite/CastExprSimplifyRule.java@56
PS2, Line 56:
> Can we remove this check? So the rule can apply to more scenarios, e.g. CAS
Done


http://gerrit.cloudera.org:8080/#/c/17933/2/fe/src/main/java/org/apache/impala/rewrite/CastExprSimplifyRule.java@56
PS2, Line 56:
:
:
:
:
:
:
:
:
:
> I think we can merge these two branches into one. We just check the first c
Done



--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 3
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:54:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some situations

2021-10-25 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/17933 )

Change subject: IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast 
expr in some situations
..

IMPALA-10836: Add 'SimplifyCastExprRule' rule to rewrite cast expr in some 
situations

This patch adds a new expr rewrite rule that simplifies cast exprs whose
target data type is the same as the inner expr's data type. The
unnecessary cast expr is removed if any rule matches. This rewrite
improves query performance when casting a non-partition column,
especially when scanning lots of data. In addition, a cast expr in a
where clause cannot be pushed down to the Kudu server; once the
unnecessary cast expr is removed, Impala pushes the predicate down to
Kudu, which saves a lot of time and IO/memory.
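
The idea of dropping a cast whose target type equals the inner expr's type can be sketched roughly as below. This is a simplified illustration, not the Java code in SimplifyCastExprRule.java: the Column and Cast classes and the simplify_cast function are hypothetical, and types are modelled as plain strings, so parameters such as a DECIMAL precision and scale only match when the whole string is equal.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Column:
    """Hypothetical leaf expr: a column reference with a known type."""
    name: str
    type: str


@dataclass(frozen=True)
class Cast:
    """Hypothetical cast expr: CAST(child AS target_type)."""
    child: "Expr"
    target_type: str

    @property
    def type(self) -> str:
        # The result type of a cast is its target type.
        return self.target_type


Expr = Union[Column, Cast]


def simplify_cast(expr: Expr) -> Expr:
    # Simplify bottom-up so nested redundant casts collapse as well.
    if isinstance(expr, Cast):
        child = simplify_cast(expr.child)
        if child.type == expr.target_type:
            return child  # the cast is a no-op, drop it
        return Cast(child, expr.target_type)
    return expr


col = Column("id", "INT")
print(simplify_cast(Cast(col, "INT")))     # Column(name='id', type='INT')
print(simplify_cast(Cast(col, "BIGINT")))  # cast is kept
```

With the cast gone, the predicate refers to the bare column, which is the form the commit message says can be pushed down to Kudu.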

Testing:
- Added unit test cases in `ExprRewriteRulesTest`

Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/SimplifyCastExprRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
3 files changed, 139 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/17933/3
--
To view, visit http://gerrit.cloudera.org:8080/17933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id8fac7100060d4e139a8b24d4795c6f279c55954
Gerrit-Change-Number: 17933
Gerrit-PatchSet: 3
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9650/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:24:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 21:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9649/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:22:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17964


Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..

[DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event 
id when performing
DDL operations via catalog HMS endpoints

Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,362 insertions(+), 282 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17964/1
--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Sourabh Goyal (Code Review)
Hello Vihang Karajgaonkar, kis...@cloudera.com, Yu-Wen Lai, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#21).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when 
performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,397 insertions(+), 289 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/21
--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 21:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7559/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:03:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17964 )

Change subject: [DO NOT MERGE] IMPALA-10926: Sync db/table in catalog cache to 
latest HMS event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17964/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17964/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402
PS1, Line 2402:   batchEvents = eventFactory.createBatchEvents(mockEvents, 
eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17964/1/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17964/1/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS1, Line 85: private static boolean flagEnableCatalogCache 
,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17964
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869268c4c23366ed0719b153252338af9738a5f6
Gerrit-Change-Number: 17964
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:03:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS 
event id when performing DDL operations via catalog HMS endpoints
..


Patch Set 21:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@63
PS21, Line 63: // import 
org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent;
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2412
PS21, Line 2412:   batchEvents = eventFactory.createBatchEvents(mockEvents, 
eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File 
fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS21, Line 85: private static boolean flagEnableCatalogCache 
,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@28
PS21, Line 28: //import 
org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent;
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:00:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9648/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 12
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 11:20:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Amogh Margoor (Code Review)
Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..


Patch Set 12:

Fix Jenkins indent comments.


--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 12
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 10:58:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..

IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet 
table.

Currently, the entire row is materialized before filtering during a scan.
Instead of paying the cost of materializing upfront, for columnar
formats we can avoid doing it for rows that are filtered out.
Only the columns required for filtering need to be materialized
before filtering. For the rest of the columns, materialization can be
delayed and done only for rows that survive.
This patch implements this technique for the Parquet format only.

A new configuration 'parquet_materialization_threshold' is introduced,
which is the minimum number of consecutive filtered-out rows required
to skip materialization. If set to less than 0, late materialization
is disabled.
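
As a rough illustration of the threshold described above, the sketch below
groups surviving rows into contiguous ranges and starts a new range only when
the run of filtered-out rows between two survivors reaches the threshold. The
names MicroBatch and GetMicroBatches, the std::vector<bool> input, and the
exact split rule are assumptions made for this example; this is not the code
from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a contiguous range of rows to materialize.
struct MicroBatch {
  int start;   // index of the first row in the range
  int end;     // index of the last row in the range (inclusive)
  int length;  // end - start + 1
};

// Groups surviving rows (selected[i] == true) into ranges. A new range is
// started only when at least 'skip_length' consecutive rows were filtered out
// since the previous surviving row; shorter gaps are absorbed into the range.
// Precondition (assumed for this sketch): at least one row is selected.
std::vector<MicroBatch> GetMicroBatches(const std::vector<bool>& selected,
                                        int skip_length) {
  std::vector<MicroBatch> batches;
  int start = -1;  // first surviving row of the range being built
  int last = -1;   // most recent surviving row seen so far
  for (int i = 0; i < static_cast<int>(selected.size()); ++i) {
    if (!selected[i]) continue;
    if (start == -1) {
      start = i;
    } else if (i - last - 1 >= skip_length) {
      // The gap is long enough to be worth skipping: close the current range.
      batches.push_back({start, last, last - start + 1});
      start = i;
    }
    last = i;
  }
  assert(start != -1 && "precondition: at least one surviving row");
  batches.push_back({start, last, last - start + 1});
  return batches;
}
```

With a threshold of 5, survivors at rows 0, 1, 2, 5, 15 and 16 would form two
ranges in this sketch: rows 0 to 5 (the two filtered rows in between are
materialized anyway) and rows 15 to 16.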

Performance:
Performance was measured on a single-daemon, single-threaded impalad
using the TPCH scale 42 lineitem table with 252 million rows,
unsorted data. Up to 2.5x improvement was seen for non-page-indexed
queries and up to 4x improvement with page index. Queries for page
index borrowed from blog:
https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/
More details:
https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing

Testing:
 1. Ran existing tests
 2. Added UT for 'ScratchTupleBatch::GetMicroBatch'

Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
---
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-columnar-scanner-ir.cc
M be/src/exec/hdfs-columnar-scanner.cc
M be/src/exec/hdfs-columnar-scanner.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-collection-column-reader.cc
M be/src/exec/parquet/parquet-collection-column-reader.h
M be/src/exec/parquet/parquet-column-chunk-reader.cc
M be/src/exec/parquet/parquet-column-chunk-reader.h
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-column-readers.h
A be/src/exec/scratch-tuple-batch-test.cc
M be/src/exec/scratch-tuple-batch.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/tuple-row-compare.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
20 files changed, 935 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/12
--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 12
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9647/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 11
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 10:51:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Amogh Margoor (Code Review)
Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17860/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17860/10//COMMIT_MSG@19
PS10, Line 19: than
> nit: than
Done



--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 11
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 10:46:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Amogh Margoor (Code Review)
Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..


Patch Set 11:

> (10 comments)
 >
 > Looks great!
 >
 > On testing, I wonder if we can add a counter on # of rows (or
 > amount of data) not surviving the materialization. This will be
 > useful to safe guard the feature and demonstrate its usefulness.

Thanks, Qifan, for the review. The counter suggestion is a good one, and
something I considered earlier. The issue is that we don't skip decoding rows;
we skip decoding values, and one row may consist of hundreds of values, some of
which are read while others are skipped. We cannot accurately track the number
of values skipped in the current scheme without incurring a significant
performance penalty. For instance, we sometimes skip pages without
decompressing them: if a skipped page has a page index with candidate rows, we
would need to decompress the page to get an accurate count of the values
skipped due to late materialization. Even when a directly skipped page is not
compressed, figuring out the number of values for the corresponding candidate
range can be time consuming. Hence the timed counters, which are already
present, are more appropriate here.


--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 11
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 10:45:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Amogh Margoor (Code Review)
Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..


Patch Set 11:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc@2223
PS10, Line 2223: c.
> Could you please explain where do we filter out the rows in the merged micr
We don't need to re-filter after step 3. I will explain it in a comment.


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc
File be/src/exec/scratch-tuple-batch-test.cc:

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc@67
PS10, Line 67: scratch_batch->num_tuples = BATCH_
> I wonder if we can add two more tests for the following situations.
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h
File be/src/exec/scratch-tuple-batch.h:

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@29
PS10, Line 29: ScratchMicroBatch
> May need a cstr to properly init these fields.
Using aggregate initializers instead of a constructor that accepts arguments,
since we need a default constructor too. We also want to avoid extra function
calls on the hot path (GetMicroBatches).
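
For readers unfamiliar with the trade-off, a minimal sketch of the idea
follows. The struct name and its fields are hypothetical (the thread does not
show the real ScratchMicroBatch members); the point is only that a plain
aggregate keeps its implicit default constructor and can still be filled in
with brace initialization, without a user-written constructor.

```cpp
#include <type_traits>

// Hypothetical plain aggregate: no user-declared constructors.
struct MicroBatchSketch {
  int start;
  int end;
  int length;
};

// The implicit default constructor stays available and trivial, so arrays of
// this type can be declared without running any per-element code.
static_assert(std::is_trivially_default_constructible<MicroBatchSketch>::value,
              "aggregate keeps a trivial default constructor");

// Brace (aggregate) initialization fills the fields in declaration order.
inline MicroBatchSketch MakeBatch(int start, int end) {
  return MicroBatchSketch{start, end, end - start + 1};
}
```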


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@171
PS10, Line 171:   /// bits set are used to create micro batches. Micro batches 
that differ by less than
> nit (or micro batches).
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@176
PS10, Line 176: present
> nit.
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@178
PS10, Line 178: batch
> nit. batch_idx may be a better name in this method.
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@203
PS10, Line 203: << "should be true";
  : /// Add the last micro batch which was b
> nit. An alternative is the following, which is more robust.
We can avoid that extra branch and condition, and also the extra condition on
the client side to handle 0 being returned, since it would be dead code anyway
and is documented as a precondition for the method. The DCHECK enforces that
precondition and ensures this dead code doesn't get activated in the future.


http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h@50
PS10, Line 50: PARQUET_LATE_MATERIALIZATION_THRE
> nit: PARQUET_LATE_MATERIALIZATION_THRESHOLD?
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift@701
PS10, Line 701:   ENABLE_ASYNC_DDL_EXECUTION = 136
> nit. -1 to turn off the feature.
Done


http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift@554
PS10, Line 554:   137: optional bool enable_async_ddl_execution = true;
> nit. -1 to turn off the feature.
Done



--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 11
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 10:30:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..

IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet 
table.

Currently, the entire row is materialized before filtering during a scan.
Instead of paying the cost of materializing upfront, for columnar
formats we can avoid doing it for rows that are filtered out.
Only the columns required for filtering need to be materialized
before filtering. For the rest of the columns, materialization can be
delayed and done only for rows that survive.
This patch implements this technique for the Parquet format only.

A new configuration 'parquet_materialization_threshold' is introduced,
which is the minimum number of consecutive filtered-out rows required
to skip materialization. If set to less than 0, late materialization
is disabled.

Performance:
Performance was measured on a single-daemon, single-threaded impalad
using the TPCH scale 42 lineitem table with 252 million rows,
unsorted data. Up to 2.5x improvement was seen for non-page-indexed
queries and up to 4x improvement with page index. Queries for page
index borrowed from blog:
https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/
More details:
https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing

Testing:
 1. Ran existing tests
 2. Added UT for 'ScratchTupleBatch::GetMicroBatch'

Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
---
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-columnar-scanner-ir.cc
M be/src/exec/hdfs-columnar-scanner.cc
M be/src/exec/hdfs-columnar-scanner.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-collection-column-reader.cc
M be/src/exec/parquet/parquet-collection-column-reader.h
M be/src/exec/parquet/parquet-column-chunk-reader.cc
M be/src/exec/parquet/parquet-column-chunk-reader.h
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-column-readers.h
A be/src/exec/scratch-tuple-batch-test.cc
M be/src/exec/scratch-tuple-batch.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/tuple-row-compare.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
20 files changed, 933 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/11
--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 11
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table.

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/parquet/hdfs-parquet-scanner.cc@2291
PS11, Line 2291: int num_micro_batches = 
scratch_batch_->GetMicroBatches(late_materialization_threshold_,
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/scratch-tuple-batch-test.cc
File be/src/exec/scratch-tuple-batch-test.cc:

http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/scratch-tuple-batch-test.cc@84
PS11, Line 84:   EXPECT_EQ(scratch_batch->GetMicroBatches(10 /*Skip 
Length*/, micro_batches), 1024/n);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17860/11/be/src/exec/scratch-tuple-batch-test.cc@116
PS11, Line 116: EXPECT_EQ(scratch_batch->GetMicroBatches(10 /*Skip 
Length*/, micro_batches), 1024/(n * 2));
line too long (95 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 11
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 25 Oct 2021 10:30:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17963 )

Change subject: IMPALA-10212. Support ofs scheme.
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9646/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba
Gerrit-Change-Number: 17963
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Oct 2021 09:12:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 8:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7558/


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 8
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 25 Oct 2021 09:06:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.

2021-10-25 Thread Anonymous Coward (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17963

to look at the new patch set (#2).

Change subject: IMPALA-10212. Support ofs scheme.
..

IMPALA-10212. Support ofs scheme.

OFS is the new file system implementation for Ozone.
The biggest difference compared to o3fs is that ofs supports operations across
all volumes and buckets and provides a full view of all the volume/buckets.

It uses the same transport as o3fs and therefore it shares the thread pool with 
o3fs.
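
To make the "same transport" point concrete, here is a minimal sketch of how a
path check might treat both schemes alike. HasScheme and IsOzonePathSketch are
hypothetical helpers written for this illustration; they are not the functions
changed in be/src/util/hdfs-util.cc, and the example paths are made up.

```cpp
#include <string>

// Hypothetical helper: true if 'path' begins with "<scheme>://".
inline bool HasScheme(const std::string& path, const std::string& scheme) {
  const std::string prefix = scheme + "://";
  return path.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical check that treats o3fs and ofs paths the same way, so that
// callers could route both to the same resources (e.g. one thread pool).
inline bool IsOzonePathSketch(const std::string& path) {
  return HasScheme(path, "o3fs") || HasScheme(path, "ofs");
}
```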

How it was tested:
The patch was tested manually on a CDPD cluster, loaded TPC-DS data, ran 
TPC-DS, ran 'load data inpath' command.

Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba
---
M be/src/util/hdfs-util.cc
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
3 files changed, 16 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/17963/2
--
To view, visit http://gerrit.cloudera.org:8080/17963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba
Gerrit-Change-Number: 17963
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10212. Support ofs scheme.

2021-10-25 Thread Anonymous Coward (Code Review)
weic...@apache.org has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17963 )


Change subject: IMPALA-10212. Support ofs scheme.
..

IMPALA-10212. Support ofs scheme.

OFS is the new file system implementation for Ozone.
The biggest difference compared to o3fs is that ofs supports operations across
all volumes and buckets and provides a full view of all the volume/buckets.

It uses the same transport as o3fs and therefore it shares the thread pool with 
o3fs.

Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba
---
M be/src/util/hdfs-util.cc
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
3 files changed, 16 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/17963/1
--
To view, visit http://gerrit.cloudera.org:8080/17963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I69908f65c97f40ff01b25d6d6db53c37a9e978ba
Gerrit-Change-Number: 17963
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward