[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17891 )

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7584/


-- 
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 02 Nov 2021 05:23:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9706/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 12
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 02 Nov 2021 05:19:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-01 Thread Yu-Wen Lai (Code Review)
Yu-Wen Lai has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..

IMPALA-10923: Fine grained table refreshing at partition level events
for transactional tables

To enable fine-grained table refreshing, there are three main changes
in this commit.
1. Maintain validWriteIdList in Catalogd for transactional tables. We
  will keep track of write id changes for partitioned tables by
  AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
2. Conduct partition level refreshing for transactional tables'
  addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
3. Introduce a config
  hms_event_incremental_refresh_transactional_table, which can switch
  on/off the fine-grained table refreshing.
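
As a rough illustration of change 1, the bookkeeping can be sketched as
below. This is a toy Python model, not the patch's Java code: the class
WriteIdTracker and its method names are hypothetical, and it only mirrors the
idea that allocate, commit, and abort events move write ids between states.

```python
from dataclasses import dataclass, field


@dataclass
class WriteIdTracker:
    """Toy per-table model of write id states (hypothetical, for illustration)."""
    open_ids: set = field(default_factory=set)
    aborted_ids: set = field(default_factory=set)
    high_watermark: int = 0

    def on_alloc(self, write_id: int) -> None:
        # An allocation event opens a write id; readers must not see it yet.
        self.open_ids.add(write_id)
        self.high_watermark = max(self.high_watermark, write_id)

    def on_commit(self, write_id: int) -> None:
        # A commit event closes the write id and makes it valid.
        self.open_ids.discard(write_id)

    def on_abort(self, write_id: int) -> None:
        # An abort event closes the write id and marks it invalid for good.
        self.open_ids.discard(write_id)
        self.aborted_ids.add(write_id)

    def is_valid(self, write_id: int) -> bool:
        # Valid means: already allocated, no longer open, and never aborted.
        return (write_id <= self.high_watermark
                and write_id not in self.open_ids
                and write_id not in self.aborted_ids)
```

With state like this kept per table, an event processor can decide which
partitions need a reload without reloading the whole table.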

Performance Tests:
A simple test was performed by running an insert into one partition of
a partitioned ACID table (50,000 partitions). Below is the time taken
to refresh this table by the event.

Storage  Before   After
==========================
S3       50 secs  50 msecs
local    3 secs   3 msecs

Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableWriteId.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java
M fe/src/main/java/org/apache/impala/hive/common/MutableValidWriteIdList.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A fe/src/test/java/org/apache/impala/catalog/CatalogTableWriteIdTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/hive/common/MutableValidReaderWriteIdListTest.java
17 files changed, 1,002 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17858/12
--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 12
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7585/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 12
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 02 Nov 2021 04:59:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 26: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7582/


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 26
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Tue, 02 Nov 2021 03:50:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17771 )

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9705/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 02 Nov 2021 03:43:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10934: Enable table definition over a single file

2021-11-01 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17878 )

Change subject: IMPALA-10934: Enable table definition over a single file
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17878/2/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17878/2/be/src/runtime/io/disk-io-mgr.cc@142
PS2, Line 142: // The maximum number of SFS I/O threads.
 : DEFINE_int32(num_sfs_io_threads, 16, "Number of SFS I/O threads");
> Agree that turning off file handle caching for the SFS case should not hurt
My understanding is that the file handle cache will be disabled for SFS unless 
we explicitly try to enable it. That's probably ok. The path to enabling the 
file handle cache would be to understand the distinction between SFS+S3 vs 
SFS+HDFS vs whatnot and map them to the right thread pools. That probably isn't 
that hard if we want to go that way, and it could be done in the backend.
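
One way to picture the mapping described above is a lookup from the
filesystem underneath SFS to an I/O thread pool. The sketch below is purely
illustrative Python: the sfs+<scheme>:// URI form, the pool names, and
pick_thread_pool are assumptions made for the example, not taken from the
patch.

```python
from urllib.parse import urlsplit

# Hypothetical pool names, keyed by the scheme that follows the "sfs+" prefix.
POOL_BY_UNDERLYING_SCHEME = {
    "s3a": "s3_io_threads",
    "hdfs": "remote_hdfs_io_threads",
}
# Fallback when the underlying filesystem is not recognized.
DEFAULT_SFS_POOL = "sfs_io_threads"


def pick_thread_pool(uri: str) -> str:
    """Return the pool for an SFS URI, falling back to the generic SFS pool."""
    scheme = urlsplit(uri).scheme  # e.g. "sfs+s3a"
    if not scheme.startswith("sfs+"):
        raise ValueError(f"not an SFS URI: {uri}")
    underlying = scheme[len("sfs+"):]
    return POOL_BY_UNDERLYING_SCHEME.get(underlying, DEFAULT_SFS_POOL)
```

Keeping the fallback pool means unknown combinations behave as they do today,
with a dedicated SFS pool and no file handle cache.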



--
To view, visit http://gerrit.cloudera.org:8080/17878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32be936243aa4c8320f5d06d2b7fbf98822f82e7
Gerrit-Change-Number: 17878
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 02 Nov 2021 03:31:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17771 )

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17771/7/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/17771/7/bin/bootstrap_toolchain.py@469
PS7, Line 469: )
flake8: E501 line too long (91 > 90 characters)
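
For readers unfamiliar with the check, E501 flags any line longer than the
configured limit (90 characters here, per the message). A minimal Python
sketch of the same test follows; find_long_lines is a hypothetical helper
written for illustration and is not part of flake8.

```python
from typing import List, Tuple


def find_long_lines(text: str, limit: int = 90) -> List[Tuple[int, int]]:
    """Return (line_number, length) for every line longer than limit."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```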



--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 02 Nov 2021 03:22:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-01 Thread Anonymous Coward (Code Review)
Hello Quanlong Huang, Aman Sinha, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17771

to look at the new patch set (#7).

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..

WiP: IMPALA-10798 : Prototype for JSON reader

This prototype allows users to create a table stored as jsonfile and
query it.
Steps to test:
- create a json table with schema specified using eligible datatypes
(int8/16/32/64/float/double/string/varchar/char/timestamp/boolean)
- add your json file (with eligible datatypes and the same column names as
 the schema specified in the create command) to an hdfs location
- add this 'location' to your table
- run a select statement

Fix:
- arrow library is included wherever required
- json format is added to scan node base class.
- json scanner files are added, that implement methods to read the
 json file from the specified file location

Change-Id: If79364a421d862d0d837f9be694911e388d4d629
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-json-scanner.cc
A be/src/exec/hdfs-json-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
A cmake_modules/FindArrow.cmake
9 files changed, 612 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/17771/7
--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17771 )

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9704/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 6
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 02 Nov 2021 02:48:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-01 Thread Anonymous Coward (Code Review)
Hello Quanlong Huang, Aman Sinha, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17771

to look at the new patch set (#6).

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..

WiP: IMPALA-10798 : Prototype for JSON reader

This prototype allows users to create a table stored as jsonfile and
query it.
Steps to test:
- create a json table with schema specified using eligible datatypes
(int8/16/32/64/float/double/string/varchar/char/timestamp)
- add your json file (with eligible datatypes and the same column names as
 the schema specified in the create command) to an hdfs location
- add this 'location' to your table
- run a select statement

Fix:
- arrow library is included wherever required
- json format is added to scan node base class.
- json scanner files are added, that implement methods to read the
 json file from the specified file location

Change-Id: If79364a421d862d0d837f9be694911e388d4d629
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-json-scanner.cc
A be/src/exec/hdfs-json-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
A cmake_modules/FindArrow.cmake
9 files changed, 607 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/17771/6
--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 6
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] WiP: IMPALA-10798 : Prototype for JSON reader

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17771 )

Change subject: WiP: IMPALA-10798 : Prototype for JSON reader
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17771/6/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/17771/6/bin/bootstrap_toolchain.py@469
PS6, Line 469: )
flake8: E501 line too long (91 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/17771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If79364a421d862d0d837f9be694911e388d4d629
Gerrit-Change-Number: 17771
Gerrit-PatchSet: 6
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 02 Nov 2021 02:26:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files

2021-11-01 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
..


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h
File be/src/runtime/io/disk-file.h:

http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@195
PS1, Line 195:
> Can we use MemBlockState state_ here?
Because the naming in DiskFile is "file_status_", maybe just keep them the 
same. Otherwise it may be good to change all of them, including the interface 
names, but that would be some work.


http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/request-context.cc
File be/src/runtime/io/request-context.cc:

http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/request-context.cc@201
PS3, Line 201: unstarted_remote_file_op_ranges_;
> Maybe named to unstarted_remote_file_op_ranges_?
Done


http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/request-ranges.h
File be/src/runtime/io/request-ranges.h:

http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/request-ranges.h@118
PS3, Line 118: WRITE,
> May need to explain what it is.
Done


http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/request-ranges.h@702
PS3, Line 702:
> nit. upload the file to a remote location.
Done


http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/request-ranges.h@708
PS3, Line 708:
> nit. the fetch file operation from a remote site.
Done


http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/scan-range.cc
File be/src/runtime/io/scan-range.cc:

http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/scan-range.cc@171
PS3, Line 171:  the range
 : // is supposed to be read in one round.
> Suggest to remove as there is no guarantee.
Changed to "supposed".


http://gerrit.cloudera.org:8080/#/c/17979/3/be/src/runtime/io/scan-range.cc@178
PS3, Line 178: read_status
> need to check the status.
Done


http://gerrit.cloudera.org:8080/#/c/17979/2/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17979/2/be/src/runtime/tmp-file-mgr.cc@257
PS2, Line 257: Status setup_read_buffer_status = SetUpReadBufferParams();
 : if (!setup_read_buffer_status.ok()) {
> If handling the rare case is simple task, I feel we should do so.
Changed. If the file size is smaller than the max block size, set the block 
size as file size. Otherwise block size is the max block size, which is 16MB.
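
The rule described in this reply can be written down in a few lines. This is
a minimal Python sketch of the stated behavior only; read_block_size is a
hypothetical name, not the function in the patch.

```python
MAX_BLOCK_SIZE = 16 * 1024 * 1024  # 16MB, as stated in the reply above


def read_block_size(file_size: int, max_block_size: int = MAX_BLOCK_SIZE) -> int:
    """Use the file size for small files, otherwise cap at the max block size."""
    return min(file_size, max_block_size)
```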


http://gerrit.cloudera.org:8080/#/c/17979/2/be/src/runtime/tmp-file-mgr.cc@1039
PS2, Line 1039:   read_buffer_block->NotifyAllWaits();
> In practice, the read buffer memory is always full during the big queries (
Ran a simple test today (15x tpcds, q67, c5d.4xlarge 16u32g, 1G read buffer).

1. Disabled is set: (Time: 135s) (Data Read: 13.8GB)
2. Disabled is not set: (Time: 150s) (Data Read: 17.7GB)

As expected, if disabled is not set, performance is worse because more data 
is read (more duplicated reads). It could be a little different for other 
queries, but if the read buffer is not available (full) most of the time, 
which is quite likely when spilling a large amount of data, disabling batching 
reads for the file when failing to reserve space could be a better solution.

I think the next optimization is to make the read buffer more available, maybe 
using a better eviction policy.



--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 02 Nov 2021 01:05:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9703/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 02 Nov 2021 01:03:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10791 Add batching reading for remote temporary files

2021-11-01 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
..

IMPALA-10791 Add batching reading for remote temporary files

The patch adds a feature to batch reads from a remote temporary
file in order to improve the reading performance for the spilled
remote data.

Originally, the design was to use the local disk file as the buffer
for batching reads from the remote file. But in practice, it
didn't help to improve the performance. Therefore, the design
was changed to use memory as the read buffer.

Currently, each TmpFileRemote has two DiskFiles: one for the
remote file and one for the local buffer. The patch adds MemBlocks
to the local buffer file. Each local buffer file is divided evenly
into several MemBlocks, but to guarantee that a page is never
cut into two parts in different blocks, the block sizes
can differ slightly from each other in practice. The default
block size is the minimum of 1/4 of the default file size and
MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_MB, which is 16MB.
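
The layout described above can be sketched as follows. This is an
illustrative Python model, not the patch's C++ code: both function names are
hypothetical, and the policy of closing a block once it reaches the nominal
size (so a straddling page stays in the block where it starts) is an
assumption about how "never cut a page" is achieved.

```python
from typing import List

MAX_REMOTE_READ_MEM_BLOCK_BYTES = 16 * 1024 * 1024  # 16MB, from the text above


def default_block_size(default_file_size: int) -> int:
    """Minimum of 1/4 of the default file size and the 16MB threshold."""
    return min(default_file_size // 4, MAX_REMOTE_READ_MEM_BLOCK_BYTES)


def assign_pages_to_blocks(page_sizes: List[int], block_size: int) -> List[List[int]]:
    """Group consecutive page indexes into blocks without splitting any page."""
    blocks: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for index, size in enumerate(page_sizes):
        current.append(index)
        current_bytes += size
        # Close the block once it reaches the nominal size, so a page that
        # crosses the nominal boundary makes this block slightly larger.
        if current_bytes >= block_size:
            blocks.append(current)
            current, current_bytes = [], 0
    if current:
        blocks.append(current)
    return blocks
```

This is why actual block sizes differ slightly: each block ends on a page
boundary rather than at an exact byte offset.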

When pinning a page, the system checks whether there is enough
memory for the block that holds the page. If not, we read
the page directly and disable this block, because it may
be good to avoid duplicated reads from the remote fs for the same
content. If the system decides to fetch a block, the block is
kept in memory until all of the pages in the block are read
or the query ends.
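
The decision made at pin time can be sketched like this. It is a simplified
Python model under stated assumptions: ReadBuffer, MemBlock, and
plan_page_read are hypothetical names, and releasing a block after all of its
pages are read is left out.

```python
class ReadBuffer:
    """Tracks how many bytes of the shared read buffer are reserved."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.reserved = 0

    def try_reserve(self, nbytes: int) -> bool:
        # Reservation fails when the buffer cannot hold the whole block.
        if self.reserved + nbytes > self.capacity:
            return False
        self.reserved += nbytes
        return True


class MemBlock:
    def __init__(self, size: int) -> None:
        self.size = size
        self.loaded = False    # block content is held in memory
        self.disabled = False  # block is excluded from batch reading


def plan_page_read(block: MemBlock, buffer: ReadBuffer) -> str:
    """Decide how a page in block is read when it is pinned."""
    if block.loaded:
        return "from_block"
    if block.disabled:
        return "direct"
    if buffer.try_reserve(block.size):
        block.loaded = True
        return "fetch_block"
    # Not enough memory: read only this page and stop batching for the block,
    # so the same remote content is not fetched a second time later.
    block.disabled = True
    return "direct"
```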

One challenge of using memory for the buffer is that the system
is usually short of memory when it needs to spill data. So we
limit the memory for the read buffer to 5% of the total. Because
the impala process currently reserves 20% of memory as unused
memory by default, using 5% for an emergency case like spilling
is reasonable.

Two start options have been added for the new feature.

1. remote_batching_read. Default is false. If set true, the batching
read is enabled.
2. remote_read_memory_buffer_size. Default is 1G. The maximum memory
that can be used by the read buffer. The value is also restricted by
the total system memory: it cannot exceed 5% of the total memory.
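
The cap described in option 2 amounts to taking the smaller of two numbers.
A minimal Python sketch, assuming the limit is computed from total memory in
bytes; effective_read_buffer_bytes is a hypothetical name, not a function in
the patch.

```python
def effective_read_buffer_bytes(configured_bytes: int, total_memory_bytes: int,
                                max_percent: int = 5) -> int:
    """Cap the configured read buffer size at max_percent of total memory."""
    # Integer arithmetic avoids floating-point rounding on large byte counts.
    cap = total_memory_bytes * max_percent // 100
    return min(configured_bytes, cap)
```

For example, with the default 1G setting the configured value applies on a
32GB machine, while on a 16GB machine the 5% cap is the smaller number.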

The patch also increases the MAX_REMOTE_TMPFILE_SIZE_THRESHOLD_MB
from 256 to 512.

Tests:
Ran core and exhaustive tests.
Added and ran TmpFileMgrTest::TestBatchingReadFromRemote.
Added e2e test test_scratch_dirs_batch_reading.

Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/disk-file.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M common/thrift/metrics.json
M tests/custom_cluster/test_scratch_disk.py
12 files changed, 1,110 insertions(+), 151 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/17979/4
--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17891 )

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7584/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 01 Nov 2021 22:56:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-01 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17891 )

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Patch Set 3:

unrelated flaky tests failed in last GVO, starting another one.


--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 01 Nov 2021 22:56:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10943: Add test to verify support for multiple resource and executor pools

2021-11-01 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has removed a vote on this change.

Change subject: IMPALA-10943: Add test to verify support for multiple resource 
and executor pools
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/17891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: If76d386d8de5730da937674ddd9a69aa1aa1355e
Gerrit-Change-Number: 17891
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-01 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17980 )

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..


Patch Set 5:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/17980/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17980/3//COMMIT_MSG@17
PS3, Line 17: adds method
> nit reimplements
Done


http://gerrit.cloudera.org:8080/#/c/17980/3//COMMIT_MSG@22
PS3, Line 22: format. The chosen DateTimeFormatContext then is passed to
> nit is passed
Done


http://gerrit.cloudera.org:8080/#/c/17980/3//COMMIT_MSG@32
PS3, Line 32:
> nit. duplicated in before/after. Probably should be mentioned the para at l
Done


http://gerrit.cloudera.org:8080/#/c/17980/3//COMMIT_MSG@33
PS3, Line 33:
> nit. this column can be removed?
Done


http://gerrit.cloudera.org:8080/#/c/17980/3//COMMIT_MSG@38
PS3, Line 38:   2.31
> nit. not aligned with the rest of the values in this column.
Done


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/datetime-iso-sql-format-tokenizer.h
File be/src/runtime/datetime-iso-sql-format-tokenizer.h:

http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/datetime-iso-sql-format-tokenizer.h@111
PS3, Line 111: Iterates throug
> nit. fmt_out_max_len_?
Removed.
This is now written directly to DateTimeFormatContext.fmt_out_len.


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/datetime-simple-date-format-parser.cc@401
PS3, Line 401: }
 :   }
 :   return nullptr;
 : }
 :
> inline?
Done


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-parse-util.h@79
PS3, Line 79:   /// max_length -- the maximum length of characters that 'dst' can hold. Only used for
 :   ///   assertion in debug build.
> I need to update this comment in next patch set, since we're enforcing the 
Done


http://gerrit.cloudera.org:8080/#/c/17980/2/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/17980/2/be/src/runtime/timestamp-parse-util.cc@305
PS2, Line 305: CATOR: {
> optional:
Done. Benchmarked it with expression "cast(now() as string format 'Y .S')".

Compared patch set 3 vs 4, the (10%ile, 50%ile, 90%ile) increased from (19.9, 
20.1, 20.3) to (61.1, 61.3, 61.6). 3X increase.


http://gerrit.cloudera.org:8080/#/c/17980/2/be/src/runtime/timestamp-parse-util.cc@351
PS2, Line 351:   DCHECK(!d.is_special());
> After changing AppendToBuffer() now we can't this dcheck, even if we want t
Done


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@83
PS3, Line 83: st.clear();
> UNLIKELY?
Done


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@213
PS3, Line 213:   }
> Can we update any remaining callers to the new variant and eliminate this f
Unfortunately, there are still a couple of call sites for this function, most 
notably the output stream operator of TimestampValue.

In patch set 4, I changed the signature so that the caller supplies a string 
output argument. I also added a comment in the header file advising callers to 
reuse the output string.


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@222
PS3, Line 222:
> So the space is bounded by the row batch size * max_length? Approx how much
Yes, I think batch size * max_length is a reasonable approximation. There is 
also 8-byte alignment and power-of-two rounding in the mem-pool code that I do 
not yet fully understand.


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@224
PS3, Line 224:   int64_t t_in_nano_sec = t.total_nanoseconds();
> Should use C++ cast
Done


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@225
PS3, Line 225:
> unlikely?
Done


http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@230
PS3, Line 230: int64_t days = total_in_nano_sec / NANOS_PER_DAY;
 : int64_t nano_secs_remaining = total_in_nano_sec % 
NANOS_PER_DAY;
 : return TimestampValue(date_ + 
boost::gregorian::date_duration(days),
 : boost::posix_time::time_duration(0, 0, 0, 
nano_secs_remaining));
 :
> nit This method probably can be inlined.
Done. Moved to timestamp-value.inline.h as well.
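
For readers following the quoted snippet at line 230, here is a minimal Python sketch of the same split of a nanosecond total into whole days and a remainder. The function name split_nanos is hypothetical, and the sketch only mirrors the arithmetic in the quote, not the rest of TimestampValue. C++ integer division truncates toward zero while Python's // floors, so the truncation is emulated explicitly.

```python
NANOS_PER_DAY = 24 * 60 * 60 * 1_000_000_000


def split_nanos(total_in_nano_sec):
    """Split a nanosecond total into (days, nano_secs_remaining).

    Emulates C++ semantics: the quotient truncates toward zero and the
    remainder carries the sign of the dividend.
    """
    days = abs(total_in_nano_sec) // NANOS_PER_DAY
    if total_in_nano_sec < 0:
        days = -days
    nano_secs_remaining = total_in_nano_sec - days * NANOS_PER_DAY
    return days, nano_secs_remaining


# One day plus five nanoseconds splits into (1, 5).
example = split_nanos(NANOS_PER_DAY + 5)
```

Keeping the function this small is what makes it a plausible candidate for inlining, as the reviewer suggested.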



--
To view, visit http://gerrit.cloudera.org:8080/17980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: 

[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.

2021-11-01 Thread Amogh Margoor (Code Review)
Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materialization of columns for filtered out 
rows in Parquet table.
..


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17860/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17860/12//COMMIT_MSG@24
PS12, Line 24: TPCH scale 42
> I think it would be good to execute the whole benchmark with bin/single_nod
Hi Zoltan,
Sorry for the delay with the benchmark. I ran the entire TPCH benchmark at scale 42. 
This is the summary of the report (Delta is the change).

Report Generated on 2021-10-28
Run Description: "78ce235db6d5b720f3e3319ff571a2da054a2602 vs 
c46d765dccd5739c848d8c1c82043e72394b8397"

Cluster Name: UNKNOWN
Lab Run Info: UNKNOWN
Impala Version:  impalad version 4.1.0-SNAPSHOT RELEASE (2021-10-28)
Baseline Impala Version: impalad version 4.1.0-SNAPSHOT RELEASE (2021-10-27)

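+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 12.83   | -1.54%     | 8.26       | -1.48%         |
+----------+-----------------------+---------+------------+------------+----------------+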

Very slight improvement overall and major improvements in these 2 queries:

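(I) Improvement: TPCH(42) TPCH-Q6 [parquet / none / none] (1.85s -> 1.72s [-7.30%])
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+-------+-----------+
| Operator     | % of Query | Avg   | Base Avg | Delta(Avg) | StdDev(%) | Max   | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+-------+-----------+
| 00:SCAN HDFS | 94.83%     | 1.50s | 1.62s    | -7.75%     | 2.07%     | 1.56s | 1.73s    | -9.58%     | 1      | 1     | 4.79M | 29.96M    |
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+-------+-----------+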

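(I) Improvement: TPCH(42) TPCH-Q19 [parquet / none / none] (4.73s -> 4.18s [-11.72%])
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+
| Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows  | Est #Rows |
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+
| 01:SCAN HDFS | 22.68%     | 729.91ms | 736.69ms | -0.92%     | 1.61%     | 751.55ms | 747.34ms | +0.56%     | 1      | 1     | 20.33K | 1.50M     |
| 00:SCAN HDFS | 74.84%     | 2.41s    | 2.97s    | -18.98%    | 0.67%     | 2.44s    | 3.00s    | -18.70%    | 1      | 1     | 13.07K | 29.96M    |
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+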

No regressions were reported, just these 2 improvements and a couple 
of queries with high variability in runtime (not related to our change).



--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 19
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 01 Nov 2021 17:51:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..


Patch Set 26:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7582/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 26
Gerrit-Owner: Sourabh Goyal 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 01 Nov 2021 17:50:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

2021-11-01 Thread Sourabh Goyal (Code Review)
Sourabh Goyal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17858/11/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/17858/11/be/src/catalog/catalog-server.cc@117
PS11, Line 117: "catalog server will refresh transactional tables 
incrementally for partition level "
nit: instead of catalog server, we should say event processor


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4353
PS11, Line 4353:   // set write id as committed before reload the 
partitions so that we can get
How is this helping in getting up-to-date file metadata? Also, what happens if 
hdfsTable.reloadPartitionsFromEvent() throws an exception? What would happen to 
the newly committed writeIds in the table? Would these writeIds still give 
correct info about partition metadata?



--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 11
Gerrit-Owner: Yu-Wen Lai 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Mon, 01 Nov 2021 17:49:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17980 )

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9702/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
Gerrit-Change-Number: 17980
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 01 Nov 2021 17:14:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-01 Thread Kurt Deschler (Code Review)
Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17980 )

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/17980/3/be/src/runtime/timestamp-value.cc@222
PS3, Line 222:   StringVal result(ctx, max_length);
> The allocation comes from exps_results_pool_.
So the space is bounded by the row batch size * max_length? Approx how much??



--
To view, visit http://gerrit.cloudera.org:8080/17980
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
Gerrit-Change-Number: 17980
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 01 Nov 2021 17:07:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-01 Thread Riza Suminto (Code Review)
Hello Qifan Chen, Kurt Deschler, Csaba Ringhofer, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17980

to look at the new patch set (#5).

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..

IMPALA-10984: Improve TimestampValue to String casting

TimestampValue::ToString was implemented by concatenating
boost::gregorian::to_iso_extended_string and
boost::posix_time::to_simple_string using stringstream. This involves
multiple string allocations and copies, and might hit a lock within
tcmalloc::CentralFreeList. FROM_UNIXTIME and CAST expressions that
touch this function can be inefficient if the expression is
evaluated for millions of rows.

This patch adds method TimestampValue::ToStringVal and reimplements
TimestampValue::ToString by supplying a default DateTimeFormatContext if
no pattern was specified. "-MM-dd HH:mm:ss" will be picked as the
default format if the time_ component does not have fractional seconds.
Otherwise, "-MM-dd HH:mm:ss.S" will be picked as the default
format. The chosen DateTimeFormatContext is then passed to
TimestampParser::Format along with date_ and time_ to be formatted into
the string representation. The int-to-string conversion in
TimestampParser::Format now uses FastInt32ToBufferLeft.
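
The default-format choice described above can be illustrated with a small Python sketch. This is not the Impala implementation: the function to_string and its parameters are hypothetical, and the year-first layout with nine fractional digits is an assumption suggested by the baseline literal '2012-01-01 09:10:11.123456789' quoted later in this message, since the patterns in the archived text appear truncated.

```python
def to_string(year, month, day, hour, minute, second, nanos=0):
    """Format a timestamp in one pass, picking the layout by fractional part.

    Assumed layouts (not confirmed by the commit message text):
      no fractional seconds -> 'YYYY-MM-DD HH:MM:SS'
      fractional seconds    -> 'YYYY-MM-DD HH:MM:SS.NNNNNNNNN'
    """
    base = "{:04d}-{:02d}-{:02d} {:02d}:{:02d}:{:02d}".format(
        year, month, day, hour, minute, second)
    if nanos == 0:
        return base
    # Zero-pad the nanosecond field to nine digits.
    return "{}.{:09d}".format(base, nanos)


with_fraction = to_string(2012, 1, 1, 9, 10, 11, 123456789)
without_fraction = to_string(2012, 1, 1, 9, 10, 11)
```

The point of the sketch is only the branch on the fractional part and the single formatting pass, in contrast to concatenating separately produced date and time strings.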

We ran a set of expression benchmarks in a machine with Intel(R)
Core(TM) i7-4790 CPU @ 3.60GHz. This patch gives > 10X performance
improvement for CAST timestamp to string and FROM_UNIXTIME without a
date-time pattern. Following are the detailed results before and after
the patch.

Before the patch:
FromUnixCodegen:
Function                              10%ile  50%ile  90%ile     10%ile     50%ile     90%ile
                                                             (relative) (relative) (relative)
---------------------------------------------------------------------------------------------
literal                                 36.7      37    37.3         1X         1X         1X
cast(now() as string)                   2.31    2.31    2.33    0.0628X    0.0623X    0.0626X
cast(now() as string format 'Y .S')     16.9    17.5    17.5     0.459X     0.472X     0.471X
from_unixtime(0,'-MM-dd HH:mm:ss')       6.3     6.3    6.37     0.171X      0.17X     0.171X
from_unixtime(0,'-MM-dd')               11.8    11.8      12      0.32X      0.32X     0.322X
from_unixtime(0)                        2.36     2.4     2.4    0.0644X    0.0648X    0.0644X

After the patch:
FromUnixCodegen:
Function                              10%ile  50%ile  90%ile     10%ile     50%ile     90%ile
                                                             (relative) (relative) (relative)
---------------------------------------------------------------------------------------------
literal                                 37.7    38.1    38.4         1X         1X         1X
cast(now() as string)                   29.9    30.1    30.2     0.794X      0.79X     0.787X
cast(now() as string format 'Y .S')     61.1    61.3    61.6      1.62X      1.61X      1.61X
from_unixtime(0,'-MM-dd HH:mm:ss')      33.6    33.8    34.2     0.892X     0.887X     0.892X
from_unixtime(0,'-MM-dd')               50.5    50.6    50.9      1.34X      1.33X      1.33X
from_unixtime(0)                          34    34.2    34.5     0.902X     0.896X     0.898X

The literal expression used as the baseline in this benchmark is
"cast('2012-01-01 09:10:11.123456789' as timestamp)".

This patch also updates numbers in expr-benchmark for
BenchmarkTimestampFunctions and tidies up expr-benchmark a bit to clear
its MemPool between benchmark iterations so that it does not run out
of memory.
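
As a quick cross-check of the "> 10X" claim, the following Python sketch divides the 50%ile values reported in the tables of this message for the two expressions that use no date-time pattern. It assumes the benchmark numbers are throughput-style figures where larger is faster, which is what the claimed improvement implies; the helper name speedups is hypothetical.

```python
# 50%ile values copied from the "Before the patch" and "After the patch" tables.
before = {
    "cast(now() as string)": 2.31,
    "from_unixtime(0)": 2.4,
}
after = {
    "cast(now() as string)": 30.1,
    "from_unixtime(0)": 34.2,
}


def speedups(before, after):
    """Return the after/before ratio for every expression in 'before'."""
    return {expr: after[expr] / before[expr] for expr in before}


ratios = speedups(before, after)
for expr, ratio in ratios.items():
    print("{}: {:.1f}X".format(expr, ratio))
```

Both ratios come out at roughly 13X and 14X, consistent with the improvement stated in the commit message.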

Testing:
- Pass core tests.

Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/kudu-util-ir.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/literal.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.h
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/runtime/timestamp-value.inline.h
M be/src/service/client-request-state.cc
M be/src/util/min-max-filter.cc
22 files changed, 316 insertions(+), 213 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF 

[Impala-ASF-CR] IMPALA-10984: Improve TimestampValue to String casting

2021-11-01 Thread Riza Suminto (Code Review)
Hello Qifan Chen, Kurt Deschler, Csaba Ringhofer, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17980

to look at the new patch set (#4).

Change subject: IMPALA-10984: Improve TimestampValue to String casting
..

IMPALA-10984: Improve TimestampValue to String casting

TimestampValue::ToString was implemented by concatenating
boost::gregorian::to_iso_extended_string and
boost::posix_time::to_simple_string using stringstream. This involves
multiple string allocations and copies, and might hit a lock within
tcmalloc::CentralFreeList. FROM_UNIXTIME and CAST expressions that
touch this function can be inefficient if the expression is
evaluated for millions of rows.

This patch reimplements TimestampValue::ToString by supplying a default
DateTimeFormatContext if no pattern was specified. "-MM-dd HH:mm:ss"
will be picked as the default format if the time_ component does not
have fractional seconds. Otherwise, "-MM-dd HH:mm:ss.S" will
be picked as the default format. The chosen DateTimeFormatContext is then
passed to TimestampParser::Format along with date_ and time_ to be
formatted into the string representation. The int-to-string conversion
in TimestampParser::Format now uses FastInt32ToBufferLeft.

We ran a set of expression benchmarks in a machine with Intel(R)
Core(TM) i7-4790 CPU @ 3.60GHz. This patch gives > 10X performance
improvement for CAST timestamp to string and FROM_UNIXTIME without a
date-time pattern. Following are the detailed results before and after
the patch.

Before the patch:
FromUnixCodegen:
Function                              10%ile  50%ile  90%ile     10%ile     50%ile     90%ile
                                                             (relative) (relative) (relative)
---------------------------------------------------------------------------------------------
literal                                 36.7      37    37.3         1X         1X         1X
cast(now() as string)                   2.31    2.31    2.33    0.0628X    0.0623X    0.0626X
cast(now() as string format 'Y .S')     16.9    17.5    17.5     0.459X     0.472X     0.471X
from_unixtime(0,'-MM-dd HH:mm:ss')       6.3     6.3    6.37     0.171X      0.17X     0.171X
from_unixtime(0,'-MM-dd')               11.8    11.8      12      0.32X      0.32X     0.322X
from_unixtime(0)                        2.36     2.4     2.4    0.0644X    0.0648X    0.0644X

After the patch:
FromUnixCodegen:
Function                              10%ile  50%ile  90%ile     10%ile     50%ile     90%ile
                                                             (relative) (relative) (relative)
---------------------------------------------------------------------------------------------
literal                                 37.7    38.1    38.4         1X         1X         1X
cast(now() as string)                   29.9    30.1    30.2     0.794X      0.79X     0.787X
cast(now() as string format 'Y .S')     61.1    61.3    61.6      1.62X      1.61X      1.61X
from_unixtime(0,'-MM-dd HH:mm:ss')      33.6    33.8    34.2     0.892X     0.887X     0.892X
from_unixtime(0,'-MM-dd')               50.5    50.6    50.9      1.34X      1.33X      1.33X
from_unixtime(0)                          34    34.2    34.5     0.902X     0.896X     0.898X

The literal expression used as the baseline in this benchmark is
"cast('2012-01-01 09:10:11.123456789' as timestamp)".

This patch also updates numbers in expr-benchmark for
BenchmarkTimestampFunctions and tidies up expr-benchmark a bit to clear
its MemPool between benchmark iterations so that it does not run out
of memory.

Testing:
- Pass core tests.

Change-Id: I4fcb4545d9c9a3fdb38c4db58bb4b1321a429d61
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/kudu-util-ir.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/literal.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.h
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/runtime/timestamp-value.inline.h
M be/src/service/client-request-state.cc
M be/src/util/min-max-filter.cc
22 files changed, 316 insertions(+), 213 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/17980/4
--
To view, visit 

[Impala-ASF-CR] IMPALA-10997: Refactor Java Hive UDF code.

2021-11-01 Thread Steve Carlin (Code Review)
Steve Carlin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17986 )

Change subject: IMPALA-10997: Refactor Java Hive UDF code.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17986/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17986/1//COMMIT_MSG@19
PS1, Line 19: HiveUdfExecutor: Abstract base class that contains code that is 
common to
: the legacy UDF.class and the GenericUDF.class when it is 
eventually created.
: HiveUdfExecutorLegacy: Implementation of the code that is 
UDF.class specific.
> nit: each line should have 72 or fewer characters if possible.
Done


http://gerrit.cloudera.org:8080/#/c/17986/1/fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java
File fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17986/1/fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java@105
PS1, Line 105:   classLoaderClosed_ = true;
> Why not use classLoader_ = null?
Done



--
To view, visit http://gerrit.cloudera.org:8080/17986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1b981aed3021aef08c87e7cdbf7c6af95906754
Gerrit-Change-Number: 17986
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Mon, 01 Nov 2021 13:39:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10994: Normalize the pip package name part of download URL.

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17987 )

Change subject: IMPALA-10994: Normalize the pip package name part of download 
URL.
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9701/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
Gerrit-Change-Number: 17987
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 01 Nov 2021 12:47:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10994: Normalize the pip package name part of download URL.

2021-11-01 Thread Anonymous Coward (Code Review)
yx91...@126.com has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17987 )

Change subject: IMPALA-10994: Normalize the pip package name part of download 
URL.
..

IMPALA-10994: Normalize the pip package name part of download URL.

According to PEP-0503, a pip repo server does not support access through
unnormalized URLs, but some package names in
'infra/python/deps/*requirements.txt' are unnormalized, e.g. 'Cython'.
pip_download.py concatenates $PYPI_MIRROR and the package name directly
to build the download URL, which may therefore be unnormalized.

Fix this by normalizing the package name in the download URL using the
method recommended in PEP-0503.
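
For reference, the normalization that PEP 503 recommends is a single regular-expression substitution followed by lowercasing. The sketch below shows it in isolation. The helper download_url and the mirror address are hypothetical illustrations of how a "simple" index URL could be built, and are not taken from pip_download.py.

```python
import re


def normalize(name):
    """Normalize a package name as recommended by PEP 503.

    Runs of '-', '_' and '.' collapse into a single '-', and the result
    is lowercased.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


def download_url(mirror, name):
    """Hypothetical helper: build a PEP 503 style project URL on a mirror."""
    return "{}/simple/{}/".format(mirror.rstrip("/"), normalize(name))


# 'Cython' is the unnormalized name mentioned in the commit message.
url = download_url("https://pypi.example.org", "Cython")
```

With this, a requirement written as 'Cython' resolves to the same index path as 'cython'.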

Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
---
M infra/python/deps/pip_download.py
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/17987/5
--
To view, visit http://gerrit.cloudera.org:8080/17987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
Gerrit-Change-Number: 17987
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] Impala-10994: Normalize pip package name

2021-11-01 Thread Fucun Chu (Code Review)
Fucun Chu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17987 )

Change subject: Impala-10994: Normalize pip package name
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17987/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17987/4//COMMIT_MSG@7
PS4, Line 7: Impala-10994
The ticket address must be uppercase, IMPALA-10994.


http://gerrit.cloudera.org:8080/#/c/17987/4//COMMIT_MSG@8
PS4, Line 8:
Please add a message that is just long enough to explain what the problem 
was and how it was fixed. Each line should have 72 or fewer characters if possible.
see: https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala



--
To view, visit http://gerrit.cloudera.org:8080/17987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
Gerrit-Change-Number: 17987
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 01 Nov 2021 11:13:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

2021-11-01 Thread Sourabh Goyal (Code Review)
Hello Vihang Karajgaonkar, kis...@cloudera.com, Yu-Wen Lai, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#26).

Change subject: IMPALA-10926: Improve catalogd consistency and self events 
detection
..

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place, whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and applies them asynchronously. These two streams of updates cause
consistency issues. For example, consider the following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1, say another Impala cluster
2. alter table t1 from source s2, say another Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table DDL operation would get reflected in the local
cache immediately. However, the event processor would later process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3, i.e. in the
same order as they appear in the HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on the same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd has synced.
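
The proposed bookkeeping can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the catalogd code: Table, process_event and run_local_ddl are hypothetical names, and a plain list of (event_id, change) tuples stands in for the HMS notification log.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    last_synced_event_id: int = -1
    applied: list = field(default_factory=list)
    lock: threading.Lock = field(default_factory=threading.Lock)


def process_event(table, event_id, change):
    """Events-processor path: skip any event the table has already synced to."""
    with table.lock:
        if event_id <= table.last_synced_event_id:
            return False
        table.applied.append(change)
        table.last_synced_event_id = event_id
        return True


def run_local_ddl(table, hms_log, change):
    """Local DDL path: lock, run the DDL in HMS, then sync to the latest event."""
    with table.lock:
        # Step 2: the DDL produces a new event at the end of the (mock) log.
        next_id = hms_log[-1][0] + 1 if hms_log else 1
        hms_log.append((next_id, change))
        # Step 3: apply every event newer than the last synced id, in log order.
        for event_id, logged in hms_log:
            if event_id > table.last_synced_event_id:
                table.applied.append(logged)
                table.last_synced_event_id = event_id


# Events 1 and 2 come from other sources before the local alter (event 3).
hms_log = [(1, "alter from s1"), (2, "alter from s2")]
t1 = Table("t1")
run_local_ddl(t1, hms_log, "alter from local")
late = process_event(t1, 1, "alter from s1")  # already synced, so skipped
```

In this model the local alter is applied after the two earlier alters, and the late replay of event 1 by the events processor is ignored, which is the ordering guarantee the commit message describes.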

This patch addresses the following:

- Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
this improvement. It is turned off by default.

- Sync db/table to latest event id for ddls from catalog HMS endpoints.
A subsequent patch would address the same for DDLs executed from Impala
shell

- Event processor skips processing an event if db/table is already
synced till that event id and sets that event id in db/table in case
the event is processed

- When EventProcessor detects a self event, it sets the last synced
event id in db/table before skipping an event

- Full table refresh sets the last event processed in table cache

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c474eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M 
fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M 

[Impala-ASF-CR] IMPALA-10997: Refactor Java Hive UDF code.

2021-11-01 Thread Fucun Chu (Code Review)
Fucun Chu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17986 )

Change subject: IMPALA-10997: Refactor Java Hive UDF code.
..


Patch Set 1:

(2 comments)

This looks good, I only had some minor comments.

http://gerrit.cloudera.org:8080/#/c/17986/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17986/1//COMMIT_MSG@19
PS1, Line 19: HiveUdfExecutor: Abstract base class that contains code that is 
common to
: the legacy UDF.class and the GenericUDF.class when it is 
eventually created.
: HiveUdfExecutorLegacy: Implementation of the code that is 
UDF.class specific.
nit: each line should have 72 or fewer characters if possible.


http://gerrit.cloudera.org:8080/#/c/17986/1/fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java
File fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17986/1/fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java@105
PS1, Line 105:   classLoaderClosed_ = true;
Why not use classLoader_ = null?



--
To view, visit http://gerrit.cloudera.org:8080/17986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1b981aed3021aef08c87e7cdbf7c6af95906754
Gerrit-Change-Number: 17986
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 01 Nov 2021 08:21:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Impala-10994: Normalize pip package name

2021-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17987 )

Change subject: Impala-10994: Normalize pip package name
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/9700/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
Gerrit-Change-Number: 17987
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 01 Nov 2021 06:24:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] Impala-10994: Normalize pip package name

2021-11-01 Thread Anonymous Coward (Code Review)
yx91...@126.com has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17987


Change subject: Impala-10994: Normalize pip package name
..

Impala-10994: Normalize pip package name

Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
---
M infra/python/deps/pip_download.py
1 file changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/17987/4
--
To view, visit http://gerrit.cloudera.org:8080/17987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
Gerrit-Change-Number: 17987
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward