[Impala-ASF-CR] IMPALA-12513: Reset metadata when the CatalogD become active

2023-10-25 Thread ttttttz (Code Review)
Hello Quanlong Huang, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20614

to look at the new patch set (#3).

Change subject: IMPALA-12513: Reset metadata when the CatalogD become active
..

IMPALA-12513: Reset metadata when the CatalogD become active

When switching active CatalogD, the loaded metadata in the candidate
CatalogD may not be the latest. We should reset the metadata to ensure
that there are no metadata inconsistency exceptions after failover.

Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
---
M be/src/catalog/catalog-server.cc
M tests/custom_cluster/test_catalogd_ha.py
2 files changed, 46 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/20614/3
--
To view, visit http://gerrit.cloudera.org:8080/20614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
Gerrit-Change-Number: 20614
Gerrit-PatchSet: 3
Gerrit-Owner: ttz <2433038...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: ttz <2433038...@qq.com>


[Impala-ASF-CR] IMPALA-12513: Reset metadata when the CatalogD become active

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20614 )

Change subject: IMPALA-12513: Reset metadata when the CatalogD become active
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20614/3/tests/custom_cluster/test_catalogd_ha.py
File tests/custom_cluster/test_catalogd_ha.py:

http://gerrit.cloudera.org:8080/#/c/20614/3/tests/custom_cluster/test_catalogd_ha.py@426
PS3, Line 426:
flake8: W292 no newline at end of file



--
To view, visit http://gerrit.cloudera.org:8080/20614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
Gerrit-Change-Number: 20614
Gerrit-PatchSet: 3
Gerrit-Owner: ttz <2433038...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: ttz <2433038...@qq.com>
Gerrit-Comment-Date: Wed, 25 Oct 2023 08:33:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12513: Reset metadata when the CatalogD become active

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20614 )

Change subject: IMPALA-12513: Reset metadata when the CatalogD become active
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14242/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
Gerrit-Change-Number: 20614
Gerrit-PatchSet: 3
Gerrit-Owner: ttz <2433038...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: ttz <2433038...@qq.com>
Gerrit-Comment-Date: Wed, 25 Oct 2023 09:03:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12513: Reset metadata when the CatalogD become active

2023-10-25 Thread ttttttz (Code Review)
ttz has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20614 )

Change subject: IMPALA-12513: Reset metadata when the CatalogD become active
..


Patch Set 3:

> Patch Set 2:
>
> (1 comment)

There are serveral causes why the metadata loaded in the standby catalogd 
legging behind, such as:
  1、The catalogd enables load metadata in the background.
  2、After multiple statestored restarts, the catalog's HA state changes as 
follows: active > stanbdy > active.
  3、Newly created user-defined functions will not be recognized after failover.


--
To view, visit http://gerrit.cloudera.org:8080/20614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
Gerrit-Change-Number: 20614
Gerrit-PatchSet: 3
Gerrit-Owner: ttz <2433038...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: ttz <2433038...@qq.com>
Gerrit-Comment-Date: Wed, 25 Oct 2023 09:04:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9118: Show catalog operation details in catalogd webUI

2023-10-25 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20428 )

Change subject: IMPALA-9118: Show catalog operation details in catalogd webUI
..


Patch Set 7: Code-Review+2

Thanks Quanlong, LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/20428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf3f0da2be2be79e546762a8083d4de338ff6aa
Gerrit-Change-Number: 20428
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 09:05:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12513: Reset metadata when the CatalogD become active

2023-10-25 Thread ttttttz (Code Review)
Hello Quanlong Huang, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20614

to look at the new patch set (#4).

Change subject: IMPALA-12513: Reset metadata when the CatalogD become active
..

IMPALA-12513: Reset metadata when the CatalogD become active

When switching active CatalogD, the loaded metadata in the candidate
CatalogD may not be the latest. We should reset the metadata to ensure
that there are no metadata inconsistency exceptions after failover.

Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
---
M be/src/catalog/catalog-server.cc
M tests/custom_cluster/test_catalogd_ha.py
2 files changed, 46 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/20614/4
--
To view, visit http://gerrit.cloudera.org:8080/20614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
Gerrit-Change-Number: 20614
Gerrit-PatchSet: 4
Gerrit-Owner: ttz <2433038...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: ttz <2433038...@qq.com>


[Impala-ASF-CR] IMPALA-12513: Reset metadata when the CatalogD become active

2023-10-25 Thread ttttttz (Code Review)
ttz has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20614 )

Change subject: IMPALA-12513: Reset metadata when the CatalogD become active
..


Patch Set 4:

(1 comment)

> Patch Set 2:
>
> (1 comment)

http://gerrit.cloudera.org:8080/#/c/20614/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20614/2//COMMIT_MSG@9
PS2, Line 9: the loaded metadata in the candidate
   : CatalogD may not be the latest.
> What are the causes for the loaded metadata in standby catalogd legging beh
There are serveral causes why the metadata loaded in the standby catalogd 
legging behind, such as:
  1?The catalogd enables load metadata in the background.
  2?After multiple statestored restarts, the catalog's HA state changes as 
follows: active > stanbdy > active.
  3?Newly created user-defined functions will not be recognized after failover.



--
To view, visit http://gerrit.cloudera.org:8080/20614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
Gerrit-Change-Number: 20614
Gerrit-PatchSet: 4
Gerrit-Owner: ttz <2433038...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: ttz <2433038...@qq.com>
Gerrit-Comment-Date: Wed, 25 Oct 2023 09:07:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12513: Reset metadata when the CatalogD become active

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20614 )

Change subject: IMPALA-12513: Reset metadata when the CatalogD become active
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14243/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b54f36f96e7901499105e51790d8e2300d7fea9
Gerrit-Change-Number: 20614
Gerrit-PatchSet: 4
Gerrit-Owner: ttz <2433038...@qq.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: ttz <2433038...@qq.com>
Gerrit-Comment-Date: Wed, 25 Oct 2023 09:37:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12398: Fix Ranger role not exists when altering db/table/view owner to a role

2023-10-25 Thread ji chen (Code Review)
ji chen has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/20508 )

Change subject: IMPALA-12398: Fix Ranger role not exists when altering 
db/table/view owner to a role
..

IMPALA-12398: Fix Ranger role not exists when altering db/table/view owner to a 
role

When Role '' is created with Ranger authorization enabled, if
'ALTER TABLE  SET OWNER ROLE ' statement is
executed to assign role as the owner of the table, it will throw
AnalysisException:Role '' does not exist.

Before this patch, given the ALTER DATABASE/TABLE/VIEW SET OWNER ROLE statement,
Impala always checked the existence of the given role in its 
AuthorizationPolicy.
However, when the support for role-related statements with Ranger was added
in IMPALA-10211, we only added the roles in RangerImpalaPlugin instead of
AuthorizationPolicy. Therefore, the statement above  would fail even though an
authorized user tries to set the owner to an existing role in 
RangerImpalaPlugin.

This patch will directly use ranger impala plugin to check the existence of the 
role,
instead of using AuthorizationPolicy object.

Tests:
 - Pass unit tests. test method testAlterView in AuthorizationStmtTest is 
updated
   accordingly.
 - Pass e2e tests. test method _test_ownership in test_ranger.py is updated to 
cover the
   new implementation.
 - Pass core tests with ranger enabled.

Change-Id: I2b029bdb90111dbd0eab5189360cc81090225cda
---
M fe/src/main/java/org/apache/impala/analysis/AlterDbSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableOrViewSetOwnerStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
A fe/src/main/java/org/apache/impala/authorization/RoleChecker.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
A fe/src/main/java/org/apache/impala/authorization/ranger/RangerRoleChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerUtil.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
A 
fe/src/test/java/org/apache/impala/authorization/CatalogServiceTestCatalogWithRanger.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/authorization/test_ranger.py
14 files changed, 255 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/20508/4
--
To view, visit http://gerrit.cloudera.org:8080/20508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2b029bdb90111dbd0eab5189360cc81090225cda
Gerrit-Change-Number: 20508
Gerrit-PatchSet: 4
Gerrit-Owner: ji chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: ji chen 


[Impala-ASF-CR] IMPALA-12398: Fix Ranger role not exists when altering db/table/view owner to a role

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20508 )

Change subject: IMPALA-12398: Fix Ranger role not exists when altering 
db/table/view owner to a role
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9841/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/20508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b029bdb90111dbd0eab5189360cc81090225cda
Gerrit-Change-Number: 20508
Gerrit-PatchSet: 4
Gerrit-Owner: ji chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: ji chen 
Gerrit-Comment-Date: Wed, 25 Oct 2023 10:31:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11996: Scanner change for Iceberg metadata querying

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20010 )

Change subject: IMPALA-11996: Scanner change for Iceberg metadata querying
..


Patch Set 15: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20010
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e943cecd77f5ef7af7cd07e2b596f2c5b4331e7
Gerrit-Change-Number: 20010
Gerrit-PatchSet: 15
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 10:32:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12398: Fix Ranger role not exists when altering db/table/view owner to a role

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20508 )

Change subject: IMPALA-12398: Fix Ranger role not exists when altering 
db/table/view owner to a role
..


Patch Set 4:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/14244/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/20508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b029bdb90111dbd0eab5189360cc81090225cda
Gerrit-Change-Number: 20508
Gerrit-PatchSet: 4
Gerrit-Owner: ji chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: ji chen 
Gerrit-Comment-Date: Wed, 25 Oct 2023 10:57:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12509: Optimize the backend startup and planner time of large Iceberg table query

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20601 )

Change subject: IMPALA-12509: Optimize the backend startup and planner time of 
large Iceberg table query
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46a68f21903c16db3b02681feddb61ddce57879a
Gerrit-Change-Number: 20601
Gerrit-PatchSet: 5
Gerrit-Owner: Fu Lili 
Gerrit-Reviewer: Fu Lili 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 11:04:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12493: Allow retry after failed cancel query

2023-10-25 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20584 )

Change subject: IMPALA-12493: Allow retry after failed cancel_query
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20584/3/be/src/runtime/query-driver.h
File be/src/runtime/query-driver.h:

http://gerrit.cloudera.org:8080/#/c/20584/3/be/src/runtime/query-driver.h@267
PS3, Line 267: True if a thread has called Finalize()
nit: we need to update the comment.


http://gerrit.cloudera.org:8080/#/c/20584/3/be/src/runtime/query-driver.cc
File be/src/runtime/query-driver.cc:

http://gerrit.cloudera.org:8080/#/c/20584/3/be/src/runtime/query-driver.cc@442
PS3, Line 442: Finalize
It's not that obvious that the error is always recoverable. After checking the 
underlying codes, I see only the Cancel() call could return non-ok status. 
There are other stuffs in Finalize(), e.g. aborting transactions. We could 
unintentionally add new codes and return non-recoverable errors in Finalize().

Maybe it's more clean to check inflight at the begining of this method.



--
To view, visit http://gerrit.cloudera.org:8080/20584
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e2c8872e91a7ff078cedb13e0771bcfaae6ee24
Gerrit-Change-Number: 20584
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 25 Oct 2023 11:23:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11996: Scanner change for Iceberg metadata querying

2023-10-25 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20010 )

Change subject: IMPALA-11996: Scanner change for Iceberg metadata querying
..


Patch Set 15: Code-Review+2

Great job, LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/20010
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e943cecd77f5ef7af7cd07e2b596f2c5b4331e7
Gerrit-Change-Number: 20010
Gerrit-PatchSet: 15
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 12:31:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Surya Hebbar (Code Review)
Surya Hebbar has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..

IMPALA-12415: Implement tests for graphical query timeline in webUI

To help with testing the query timeline web page, the tests have been
integrated using JEST testing framework based on nodejs.

'run_js_tests.sh' runs JEST test suits, after it fetches nodejs binaries
and related nodejs packages, if they are not present locally.

NodeJS binaries are stored within the ${IMPALA_TOOLCHAIN} directory.

'jest-environment-jsdom' supports rendering DOM elements and
testing them without the requirement of a browser.

For implementing unit and integration tests, the script
has been divided into multiple properly functioning modules,
that are imported as ES6 modules.

Unit tests have been written for the following query profile parsing
functions.

- mapTimeseriesCounters() - Test whether the method correctly searches
and maps the profile's counter's indexes based on counter_name even
in reverse order

- accumulateTimeseriesValues() - Test whether the method correctly
accumlates values after parsing values from 'data' in
'time_series_counters' while updating 'max_samples'

- generateTimesamples() - Test whether time sample values generated
based on 'max_samples' are correct, with different 'max_samples'

- clearTimeseriesValues() - Test whether Timeseries arrays are being
properly truncated in the correct range

- initializeUtilizationMetrics() - Test whether aggregate arrays and
time sample arrays are correctly allocated based on counters and
'max_samples'

- getSvg*() - Test whether all getSvg methods are correctly setting
attributes and returning expected attributes in elements

Unit tests produce JUnitXML for integration with jenkins jobs,
these are stored in ${IMPALA_JS_TEST_LOGS_DIR}.

This patch also fixes an edge case in 'mapTimeseriesCounters()',
when any counter within 'time_series_counter' is missing within profile.

Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
---
M bin/impala-config.sh
M bin/run-all-tests.sh
A tests/run-js-tests.sh
M www/scripts/query_timeline/chart_commons.js
M www/scripts/query_timeline/fragment_diagram.js
A www/scripts/query_timeline/global_dom.js
M www/scripts/query_timeline/host_utilization_diagram.js
A www/scripts/query_timeline/package.json
A www/scripts/tests/.gitignore
A www/scripts/tests/package.json
A www/scripts/tests/query_timeline/chart_commons.test.js
A www/scripts/tests/query_timeline/fragment_diagram.test.js
A www/scripts/tests/query_timeline/host_utilization_diagram.test.js
13 files changed, 746 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/20538/5
--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 5
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20538/5/tests/run-js-tests.sh
File tests/run-js-tests.sh:

http://gerrit.cloudera.org:8080/#/c/20538/5/tests/run-js-tests.sh@40
PS5, Line 40:   curl 
https://nodejs.org/dist/${NODEJS_VERSION}/node-${NODEJS_VERSION}-${NODEJS_DISTRO}.tar.xz
 -O
line too long (98 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 5
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 13:20:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12509: Optimize the backend startup and planner time of large Iceberg table query

2023-10-25 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20601 )

Change subject: IMPALA-12509: Optimize the backend startup and planner time of 
large Iceberg table query
..


Patch Set 5: Code-Review+1

(1 comment)

One small nit, otherwise the code LGTM!

http://gerrit.cloudera.org:8080/#/c/20601/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20601/5//COMMIT_MSG@7
PS5, Line 7: e
   : Iceberg table query
nit: we usually don't limit the title line to 72 chars, it can be longer so it 
can fit to a single line.



--
To view, visit http://gerrit.cloudera.org:8080/20601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46a68f21903c16db3b02681feddb61ddce57879a
Gerrit-Change-Number: 20601
Gerrit-PatchSet: 5
Gerrit-Owner: Fu Lili 
Gerrit-Reviewer: Fu Lili 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 13:22:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Surya Hebbar (Code Review)
Surya Hebbar has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..

IMPALA-12415: Implement tests for graphical query timeline in webUI

To help with testing the query timeline web page, the tests have been
integrated using JEST testing framework based on nodejs.

'run_js_tests.sh' runs JEST test suits, after it fetches nodejs binaries
and related nodejs packages, if they are not present locally.

NodeJS binaries are stored within the ${IMPALA_TOOLCHAIN} directory.

'jest-environment-jsdom' supports rendering DOM elements and
testing them without the requirement of a browser.

For implementing unit and integration tests, the script
has been divided into multiple properly functioning modules,
that are imported as ES6 modules.

Unit tests have been written for the following query profile parsing
functions.

- mapTimeseriesCounters() - Test whether the method correctly searches
and maps the profile's counter's indexes based on counter_name even
in reverse order

- accumulateTimeseriesValues() - Test whether the method correctly
accumlates values after parsing values from 'data' in
'time_series_counters' while updating 'max_samples'

- generateTimesamples() - Test whether time sample values generated
based on 'max_samples' are correct, with different 'max_samples'

- clearTimeseriesValues() - Test whether Timeseries arrays are being
properly truncated in the correct range

- initializeUtilizationMetrics() - Test whether aggregate arrays and
time sample arrays are correctly allocated based on counters and
'max_samples'

- getSvg*() - Test whether all getSvg methods are correctly setting
attributes and returning expected attributes in elements

Unit tests produce JUnitXML for integration with jenkins jobs,
these are stored in ${IMPALA_JS_TEST_LOGS_DIR}.

Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
---
M bin/impala-config.sh
M bin/run-all-tests.sh
A tests/run-js-tests.sh
M www/scripts/query_timeline/chart_commons.js
M www/scripts/query_timeline/fragment_diagram.js
A www/scripts/query_timeline/global_dom.js
M www/scripts/query_timeline/host_utilization_diagram.js
A www/scripts/query_timeline/package.json
A www/scripts/tests/.gitignore
A www/scripts/tests/package.json
A www/scripts/tests/query_timeline/chart_commons.test.js
A www/scripts/tests/query_timeline/fragment_diagram.test.js
A www/scripts/tests/query_timeline/host_utilization_diagram.test.js
13 files changed, 746 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/20538/6
--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 6
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20538/6/tests/run-js-tests.sh
File tests/run-js-tests.sh:

http://gerrit.cloudera.org:8080/#/c/20538/6/tests/run-js-tests.sh@40
PS6, Line 40:   curl 
https://nodejs.org/dist/${NODEJS_VERSION}/node-${NODEJS_VERSION}-${NODEJS_DISTRO}.tar.xz
 -O
line too long (98 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 6
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 13:23:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14246/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 6
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 13:53:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14245/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 5
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 13:54:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11996: Scanner change for Iceberg metadata querying

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20010 )

Change subject: IMPALA-11996: Scanner change for Iceberg metadata querying
..


Patch Set 16:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9842/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/20010
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e943cecd77f5ef7af7cd07e2b596f2c5b4331e7
Gerrit-Change-Number: 20010
Gerrit-PatchSet: 16
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 14:01:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11996: Scanner change for Iceberg metadata querying

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20010 )

Change subject: IMPALA-11996: Scanner change for Iceberg metadata querying
..


Patch Set 16: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20010
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e943cecd77f5ef7af7cd07e2b596f2c5b4331e7
Gerrit-Change-Number: 20010
Gerrit-PatchSet: 16
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 14:01:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

2023-10-25 Thread Gabor Kaszab (Code Review)
Hello Tamas Mate, Daniel Becker, Zoltan Borok-Nagy, Peter Rozsa, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20548

to look at the new patch set (#5).

Change subject: IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables
..

IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

For Iceberg tables, when joining the data files with the delete files,
both of the current distribution modes (broadcast, partitioned) are
wasteful. The idea is that when we read a row from a delete file it
contains the name of the data file that this particular delete row is
referring to so if we knew where that data file is scheduled we could
directly send that delete file row there.

This patch enhances the scheduler to collect the information about
which data file is scheduled on which host. Since, the scan node for
the data files are on the same host as the Iceberg join node, we can
send the delete files directly to that specific host.

Restriction:
Table sampling couldn't use this enhancement as it won't schedule all
the data files in a table. But a delete file could contain any data
file name, so when the KrpcDataStreamSender distributes the delete file
rows it might not find the corresponding data file in the filename to
hosts mapping and won't know where to send that particular row. Hence,
V2 read optimization is turned off for table sampling.

Functional testing:
 - Re-run full test suite to check for regressions.

Performance testing:
1) Local machine: SELECT COUNT(1) FROM TPCH10_parquet.lineitem
Around 15% of the rows are deleted.
As the table is unpartitioned I got a small number of delete files with
relatively large size.
Query runtime decreased by ~80%

2) Local machine: SELECT COUNT(1) FROM TPCDS10_store_sales
Around 15% of the rows are deleted.
Table is partitioned that results more delete files but smaller in
size.
Query runtime decreased by ~50%

3) 10-node cluster with data stored on S3.
SELECT COUNT(1) FROM a scaled store_sales table having ~8.6B rows and
~15% are deleted.
Here we had 2 scenarios:
  a) Table is written by Impala: The runtime decreased by ~80%. One
 delete file row is sent exactly to one host.
  b) Table is written by Hive: The runtime decreased by ~60%. Here
 apparently the data files are bigger and one data file might be
 spread to multiple scan ranges. As a result one delete file row
 might be sent to multiple hosts. The time difference between the
 a) run is the time spent on sending out more delete file rows.

Change-Id: I212afd7c9e94551a1c50a40ccb0e3c1f7ecdf3d2
---
M be/src/exec/data-sink.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M common/protobuf/admission_control_service.proto
M common/protobuf/control_service.proto
M common/thrift/Partitions.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/DataPartition.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
M testdata/workloads/functional-query/queries/QueryTest/set.test
M tests/query_test/test_iceberg.py
31 files changed, 517 insertions(+), 283 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/20548/5
--
To view, visit http://gerrit.cloudera.org:8080/20548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I212afd7c9e94551a1c50a40ccb0e3c1f7ecdf3d2
Gerrit-Change-Number: 20548
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20548 )

Change subject: IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20548/5/be/src/scheduling/scheduler.h
File be/src/scheduling/scheduler.h:

http://gerrit.cloudera.org:8080/#/c/20548/5/be/src/scheduling/scheduler.h@86
PS5, Line 86:   /// Map from filenames to hosts where those files are 
scheduled, grouped by scan node ID.
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/20548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I212afd7c9e94551a1c50a40ccb0e3c1f7ecdf3d2
Gerrit-Change-Number: 20548
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 14:39:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

2023-10-25 Thread Gabor Kaszab (Code Review)
Hello Tamas Mate, Daniel Becker, Zoltan Borok-Nagy, Peter Rozsa, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20548

to look at the new patch set (#6).

Change subject: IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables
..

IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

For Iceberg tables, when joining the data files with the delete files,
both of the current distribution modes (broadcast, partitioned) are
wasteful. The idea is that when we read a row from a delete file it
contains the name of the data file that this particular delete row is
referring to so if we knew where that data file is scheduled we could
directly send that delete file row there.

This patch enhances the scheduler to collect the information about
which data file is scheduled on which host. Since, the scan node for
the data files are on the same host as the Iceberg join node, we can
send the delete files directly to that specific host.

Restriction:
Table sampling couldn't use this enhancement as it won't schedule all
the data files in a table. But a delete file could contain any data
file name, so when the KrpcDataStreamSender distributes the delete file
rows it might not find the corresponding data file in the filename to
hosts mapping and won't know where to send that particular row. Hence,
V2 read optimization is turned off for table sampling.

Functional testing:
 - Re-run full test suite to check for regressions.

Performance testing:
1) Local machine: SELECT COUNT(1) FROM TPCH10_parquet.lineitem
Around 15% of the rows are deleted.
As the table is unpartitioned I got a small number of delete files with
relatively large size.
Query runtime decreased by ~80%

2) Local machine: SELECT COUNT(1) FROM TPCDS10_store_sales
Around 15% of the rows are deleted.
Table is partitioned that results more delete files but smaller in
size.
Query runtime decreased by ~50%

3) 10-node cluster with data stored on S3.
SELECT COUNT(1) FROM a scaled store_sales table having ~8.6B rows and
~15% are deleted.
Here we had 2 scenarios:
  a) Table is written by Impala: The runtime decreased by ~80%. One
 delete file row is sent exactly to one host.
  b) Table is written by Hive: The runtime decreased by ~60%. Here
 apparently the data files are bigger and one data file might be
 spread to multiple scan ranges. As a result one delete file row
 might be sent to multiple hosts. The time difference between the
 a) run is the time spent on sending out more delete file rows.

Change-Id: I212afd7c9e94551a1c50a40ccb0e3c1f7ecdf3d2
---
M be/src/exec/data-sink.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M common/protobuf/admission_control_service.proto
M common/protobuf/control_service.proto
M common/thrift/Partitions.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/DataPartition.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
M testdata/workloads/functional-query/queries/QueryTest/set.test
M tests/query_test/test_iceberg.py
31 files changed, 518 insertions(+), 283 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/20548/6
--
To view, visit http://gerrit.cloudera.org:8080/20548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I212afd7c9e94551a1c50a40ccb0e3c1f7ecdf3d2
Gerrit-Change-Number: 20548
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

2023-10-25 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20548 )

Change subject: IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables
..


Patch Set 6:

(43 comments)

Thanks for all the feedback, People!
I think I managed to address all of them except the following 2:
  - Make another perf measurement where I scale from 10 executor nodes to 20 or 
even more. Will update the commit message once done.
  - Ignore the fact when a delete file record is referring to a data file that 
is not part of the snapshot. Will make this change in a follow-up PS.

http://gerrit.cloudera.org:8080/#/c/20548/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20548/4//COMMIT_MSG@26
PS4, Line 26: Hence,
: V2 read optimization is turned off for table sampling.
> Can we just drop such delete records instead? Compaction might also remove
I also considered just silently dropping the delete rows when I don't find the 
associated data file. I decided be on the safe side and to fail the query 
instead because if we have a bug somewhere in the 'filename to hosts' mapping 
creation then we'd end up in a data correctness issue if we just ignored that a 
delete row that doesn't have corresponding entry in the map.
I don't really have insight into how compaction works TBH. Is it a valid 
scenario that let's assume we have a delete file containing rows for 2 
different data files, and when we do compaction where we get rid of one of the 
data files but leaving the delete file intact referring to both data files? If 
this scenario is in fact valid, then indeed we have to ignore the 
'non-matching' delete rows.
I don't get the other example where we mix MoR and CoW. In that case there 
won't be any delete files referring non-existing data files.

Update: I believe compaction can cause a situation where delete file rows 
referring to non-existing data files. So I guess the only option I have here is 
to ignore such delete rows. Fingers crossed then not to have any bugs in the 
filename to hosts mapping mechanism now that remain under the radar now :)
Will adjust this logic in a follow-up patch set.


http://gerrit.cloudera.org:8080/#/c/20548/4//COMMIT_MSG@45
PS4, Line 45: 10-node cluster wi
> These are not known upstream. Maybe you could just specify the characterist
Ooop, I usually don't confuse the open source and the downstream world. Thanks 
for pointing that out!

I'll do the measurement for more executors later on and will update the commit 
msg with the results.


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/coordinator-backend-state.cc@330
PS4, Line 330: CopyFilenameToHostsMappingToRequest(&request);
> Could extract this to a function.
Done


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/coordinator-backend-state.cc@331
PS4, Line 331:
> Could we use the concrete types instead of auto here and on L337? It would
Done


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/coordinator-backend-state.cc@332
PS4, Line 332: VLOG_FILE << "making rpc: ExecQueryFInstances"
> nit: could be range-based for loops for less verbosity
Done


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/coordinator-backend-state.cc@350
PS4, Line 350:   google::protobuf::Map* 
by_node_filename_to_hosts =
> For better readability, can we put the above code to a separate function?
Done


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/fragment-state.cc
File be/src/runtime/fragment-state.cc:

http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/fragment-state.cc@108
PS4, Line 108:   DCHECK(tuple_desc != nullptr);
> nit: could be nullptr
Done


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/fragment-state.cc@112
PS4, Line 112:
> Could write out the type to make it easier to understand.
well, it's this:
std::unordered_map, bool>>
I didn't want to write it out as I found it too long. If you insists it'd 
increase readability, I can do so.


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/fragment-state.cc@113
PS4, Line 113:   for (auto file_to_hosts : node_to_files_it->second) {
> nit: can we use range-based for loop?
Done


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/fragment-state.cc@116
PS4, Line 116: is_relative_path) {
> Could extract this to a variable named "path_is_relative" or something simi
Done


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/krpc-data-stream-sender.h@113
PS4, Line 113: M
> nit: please add comma before 'and'
it's not required to use a comma before an 'and' when you list things.


http://gerrit.cloudera.org:8080/#/c/20548/4

[Impala-ASF-CR] IMPALA-12398: Fix Ranger role not exists when altering db/table/view owner to a role

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20508 )

Change subject: IMPALA-12398: Fix Ranger role not exists when altering 
db/table/view owner to a role
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9841/


--
To view, visit http://gerrit.cloudera.org:8080/20508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b029bdb90111dbd0eab5189360cc81090225cda
Gerrit-Change-Number: 20508
Gerrit-PatchSet: 4
Gerrit-Owner: ji chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: ji chen 
Gerrit-Comment-Date: Wed, 25 Oct 2023 15:08:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20548 )

Change subject: IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/14247/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/20548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I212afd7c9e94551a1c50a40ccb0e3c1f7ecdf3d2
Gerrit-Change-Number: 20548
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 15:11:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20548 )

Change subject: IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables
..


Patch Set 6:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/14248/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/20548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I212afd7c9e94551a1c50a40ccb0e3c1f7ecdf3d2
Gerrit-Change-Number: 20548
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 15:35:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3825: Delegate runtime filter aggregation to some executors

2023-10-25 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20612 )

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20612/5/tests/common/test_dimensions.py
File tests/common/test_dimensions.py:

http://gerrit.cloudera.org:8080/#/c/20612/5/tests/common/test_dimensions.py@261
PS5, Line 261:
 :   TODO: In the future we could generate these values using 
pairwise to reduce total
 :   execution time.
This is probably the reason why extra exec options are always declared with 
separate cls.ImpalaTestMatrix.add_dimension().
It ensure the pairwise combination is done properly.



--
To view, visit http://gerrit.cloudera.org:8080/20612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
Gerrit-Change-Number: 20612
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 25 Oct 2023 16:12:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12493: Allow retry after failed cancel query

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20584 )

Change subject: IMPALA-12493: Allow retry after failed cancel_query
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20584/3/be/src/runtime/query-driver.cc
File be/src/runtime/query-driver.cc:

http://gerrit.cloudera.org:8080/#/c/20584/3/be/src/runtime/query-driver.cc@442
PS3, Line 442: Finalize
> It's not that obvious that the error is always recoverable. After checking
That seems reasonable. In a draft I was looking at updating Finalize so it 
can't fail, I could pull that into here.



--
To view, visit http://gerrit.cloudera.org:8080/20584
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e2c8872e91a7ff078cedb13e0771bcfaae6ee24
Gerrit-Change-Number: 20584
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 25 Oct 2023 17:47:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7131: Support external data sources in LocalCatalog mode

2023-10-25 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/20574 )

Change subject: IMPALA-7131: Support external data sources in LocalCatalog mode
..

IMPALA-7131: Support external data sources in LocalCatalog mode

This patch makes external data source working in LocalCatalog mode:
 - Add APIs in CatalogdMetaProvider to fetch DataSource from Catalog
   server through RPC.
 - Add getDataSources() and getDataSource() in LocalCatalog.
 - Add LocalDataSourceTable class for loading DataSource table in
   LocalCatalog.
 - Handle request for loading DataSource in CatalogServiceCatalog on
   Catalog server.
 - Enable tests which are skipped by
   SkipIfCatalogV2.data_sources_unsupported().
   Remove SkipIfCatalogV2.data_sources_unsupported().
 - Add end-to-end tests for LocalCatalog mode.

Testing:
 - Passed core tests

Change-Id: I40841c9be9064ac67771c4d3f5acbb3b552a2e55
---
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
A fe/src/main/java/org/apache/impala/catalog/local/LocalDataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test
M testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
M tests/common/skip.py
A tests/custom_cluster/test_ext_data_sources.py
M tests/metadata/test_ddl.py
M tests/metadata/test_metadata_query_statements.py
M tests/query_test/test_ext_data_sources.py
16 files changed, 567 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/20574/6
--
To view, visit http://gerrit.cloudera.org:8080/20574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40841c9be9064ac67771c4d3f5acbb3b552a2e55
Gerrit-Change-Number: 20574
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-7131: Support external data sources in LocalCatalog mode

2023-10-25 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20574 )

Change subject: IMPALA-7131: Support external data sources in LocalCatalog mode
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3664
PS5, Line 3664: Collections.emptyLis
> nit: use Collections.emptyList() to avoid creating new array objects.
Fixed as suggested


http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3675
PS5, Line 3675: Collections.emptyLis
> nit: use Collections.emptyList()
Fixed as suggested


http://gerrit.cloudera.org:8080/#/c/20574/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/20574/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1233
PS3, Line 1233: ImmutableList thriftDataSrcs = loadWithCaching(
> I see. I think the DataSource name list could be inconsistent with the Data
I got your point. Removed this unused function now. Will consider inconsistent 
issue if I have to add back this function.


http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java:

http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@232
PS5, Line 232:   LOG.info("Unable to load DataSource objects, ", e);
> nit: the stacktrace might be helpful, so let's log it as
fixed


http://gerrit.cloudera.org:8080/#/c/20574/5/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@244
PS5, Line 244:   LOG.info("DataSource not found: " + dsName, e);
> nit: log the stacktrace
fixed


http://gerrit.cloudera.org:8080/#/c/20574/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test
File 
testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test:

http://gerrit.cloudera.org:8080/#/c/20574/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test@5
PS5, Line 5: CREATE DATA SOURCE TestGenericDataSource
> I see this is created in testdata/bin/create-ext-data-source-table.sql. Are
DataSource object is not persistent now. We have to recreate it when creating 
new external data source table. I also want to add test for dropping DataSource 
object.
Renamed the DataSource name to avoid impact for other tests.


http://gerrit.cloudera.org:8080/#/c/20574/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test@184
PS5, Line 184: DROP DATA SOURCE TestGenericDataSource;
> Wouldn't this impact other tests that depend on this data source? Or is thi
Renamed DataSource name to avoid impact to other tests.



--
To view, visit http://gerrit.cloudera.org:8080/20574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40841c9be9064ac67771c4d3f5acbb3b552a2e55
Gerrit-Change-Number: 20574
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 18:30:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11996: Scanner change for Iceberg metadata querying

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20010 )

Change subject: IMPALA-11996: Scanner change for Iceberg metadata querying
..


Patch Set 16: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9842/


--
To view, visit http://gerrit.cloudera.org:8080/20010
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e943cecd77f5ef7af7cd07e2b596f2c5b4331e7
Gerrit-Change-Number: 20010
Gerrit-PatchSet: 16
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 18:37:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9118: Show catalog operation details in catalogd webUI

2023-10-25 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20428 )

Change subject: IMPALA-9118: Show catalog operation details in catalogd webUI
..


Patch Set 7: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf3f0da2be2be79e546762a8083d4de338ff6aa
Gerrit-Change-Number: 20428
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 18:48:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9118: Show catalog operation details in catalogd webUI

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20428 )

Change subject: IMPALA-9118: Show catalog operation details in catalogd webUI
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf3f0da2be2be79e546762a8083d4de338ff6aa
Gerrit-Change-Number: 20428
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 18:49:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9118: Show catalog operation details in catalogd webUI

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20428 )

Change subject: IMPALA-9118: Show catalog operation details in catalogd webUI
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9843/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/20428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf3f0da2be2be79e546762a8083d4de338ff6aa
Gerrit-Change-Number: 20428
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 18:49:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12493: Allow retry after failed cancel query

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20584 )

Change subject: IMPALA-12493: Allow retry after failed cancel_query
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20584/3/be/src/runtime/query-driver.cc
File be/src/runtime/query-driver.cc:

http://gerrit.cloudera.org:8080/#/c/20584/3/be/src/runtime/query-driver.cc@442
PS3, Line 442: ion alre
> That seems reasonable. In a draft I was looking at updating Finalize so it
Done



--
To view, visit http://gerrit.cloudera.org:8080/20584
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e2c8872e91a7ff078cedb13e0771bcfaae6ee24
Gerrit-Change-Number: 20584
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 25 Oct 2023 18:58:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12493: Allow retry after failed cancel query

2023-10-25 Thread Michael Smith (Code Review)
Hello Quanlong Huang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20584

to look at the new patch set (#4).

Change subject: IMPALA-12493: Allow retry after failed cancel_query
..

IMPALA-12493: Allow retry after failed cancel_query

Previously if /cancel_query failed with "Query not yet running",
subsequent attempts to cancel the query would always fail and the query
would never be unregistered (so it would stay in "queries in flight" or
"waiting to be closed" until the coordinator was restarted) because the
QueryDriver thought the query was finalized. Check whether the query is
inflight before setting finalized_ to allow retries.

Cancelling a request can no longer fail, so additional checks for
whether the request has been cancelled before it started executing are
added. Finalizing through the QueryDriver still checks that the query is
inflight, which preserves the current behavior for user cancellation.

Testing:
- adds a test that canceling a query in CREATED state fails but does not
  prevent canceling the query later
- adds a test that canceling a query that later fails results in that
  query completing rather than going to "waiting to be closed"

Change-Id: I6e2c8872e91a7ff078cedb13e0771bcfaae6ee24
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/DebugUtils.java
M tests/webserver/test_web_pages.py
8 files changed, 178 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/20584/4
--
To view, visit http://gerrit.cloudera.org:8080/20584
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e2c8872e91a7ff078cedb13e0771bcfaae6ee24
Gerrit-Change-Number: 20584
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-7131: Support external data sources in LocalCatalog mode

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20574 )

Change subject: IMPALA-7131: Support external data sources in LocalCatalog mode
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14249/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40841c9be9064ac67771c4d3f5acbb3b552a2e55
Gerrit-Change-Number: 20574
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 19:00:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12515: Build modules for extra pythons

2023-10-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20617 )

Change subject: IMPALA-12515: Build modules for extra pythons
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20617/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/20617/1/bin/impala-config.sh@228
PS1, Line 228: # Additional Python versions to package for
Can you expand this comment a bit? Talk about how this is about the shell 
tarball, currently not anything else.


http://gerrit.cloudera.org:8080/#/c/20617/1/shell/CMakeLists.txt
File shell/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/20617/1/shell/CMakeLists.txt@47
PS1, Line 47:   if (${PYTHON_EXE} EQUAL $ENV{IMPALA_SYSTEM_PYTHON2})
: set(PYTHON2_VENV ${VENV})
: set(PYTHON2_PIP_CACHE ${PIP_CACHE})
:   elseif (${PYTHON_EXE} EQUAL $ENV{IMPALA_SYSTEM_PYTHON3})
: set(PYTHON3_VENV ${VENV})
: set(PYTHON3_PIP_CACHE ${PIP_CACHE})
:   endif()
We also set PYTHON2_VENV etc down below next to the shell_pythonX_install 
targets. Can we remove one?


http://gerrit.cloudera.org:8080/#/c/20617/1/shell/impala-shell
File shell/impala-shell:

http://gerrit.cloudera.org:8080/#/c/20617/1/shell/impala-shell@51
PS1, Line 51:   echo "Python ${PYTHON_VERSION} is not supported, try 'pip 
install impala-shell'."
I'm thinking about what exactly we want in the error message. We want to avoid 
giving the impression that impala-shell itself doesn't support that Python 
version. Something more of the form "This impala-shell package was not built 
with support for Python X"

Do we want to list the pythons the package does support?


http://gerrit.cloudera.org:8080/#/c/20617/1/shell/make_shell_tarball.sh
File shell/make_shell_tarball.sh:

http://gerrit.cloudera.org:8080/#/c/20617/1/shell/make_shell_tarball.sh@150
PS1, Line 150: --upgrade
What does the "--upgrade" do? Do we need it?



--
To view, visit http://gerrit.cloudera.org:8080/20617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13720a9e3c50f348bef41f5e91f810204e416f13
Gerrit-Change-Number: 20617
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 19:05:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12516: Set HDFS limit based on memlock

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20623


Change subject: IMPALA-12516: Set HDFS limit based on memlock
..

IMPALA-12516: Set HDFS limit based on memlock

With RHEL 8 on AWS Graviton instances,
dfs.datanode.max.locked.memory=64000 is insufficient to run
query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached.
Sets it based on 'ulimit -l' so test environments can increase this
limit as needed.

Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
---
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
2 files changed, 11 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/20623/1
--
To view, visit http://gerrit.cloudera.org:8080/20623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
Gerrit-Change-Number: 20623
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 


[Impala-ASF-CR] IMPALA-12514: Remove dependency on jetty-server

2023-10-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20616 )

Change subject: IMPALA-12514: Remove dependency on jetty-server
..


Patch Set 1: Code-Review+1

This looks good to me. IMPALA-11466 introduced this, so as long as that test is 
passing, it seems fine.


--
To view, visit http://gerrit.cloudera.org:8080/20616
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cd2879ee88e0636186a27c3e16843a30ee9a738
Gerrit-Change-Number: 20616
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 19:14:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] Specify the native toolchain revision exactly

2023-10-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has abandoned this change. ( 
http://gerrit.cloudera.org:8080/19885 )

Change subject: Specify the native toolchain revision exactly
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/19885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ib1cb3476fb0c0fdc020782fd543abc3aaa2873a9
Gerrit-Change-Number: 19885
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12493: Allow retry after failed cancel query

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20584 )

Change subject: IMPALA-12493: Allow retry after failed cancel_query
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14250/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20584
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e2c8872e91a7ff078cedb13e0771bcfaae6ee24
Gerrit-Change-Number: 20584
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 25 Oct 2023 19:27:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12516: Set HDFS limit based on memlock

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20623 )

Change subject: IMPALA-12516: Set HDFS limit based on memlock
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14251/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
Gerrit-Change-Number: 20623
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 19:33:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..


Patch Set 6:

(2 comments)

I could successfully run ./tests/run-js-tests.sh on my local machine.
Just one minor comment. Please also fix "line too long (98 > 90)" warning in 
run-js-tests.sh.

http://gerrit.cloudera.org:8080/#/c/20538/6/www/scripts/query_timeline/chart_commons.js
File www/scripts/query_timeline/chart_commons.js:

http://gerrit.cloudera.org:8080/#/c/20538/6/www/scripts/query_timeline/chart_commons.js@64
PS6, Line 64: var no_change;
nit: don't need to define this variable out of loop, define it directly in line 
#66


http://gerrit.cloudera.org:8080/#/c/20538/4/www/scripts/tests/query_timeline/chart_commons.test.js
File www/scripts/tests/query_timeline/chart_commons.test.js:

http://gerrit.cloudera.org:8080/#/c/20538/4/www/scripts/tests/query_timeline/chart_commons.test.js@24
PS4, Line 24: // Test whether the method correctly searches and maps indexes of 
counters based
:   // on counter_name
> I have added a test case where an error is thrown in case the counter is mi
Ack



--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 6
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 19:33:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12514: Remove dependency on jetty-server

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20616 )

Change subject: IMPALA-12514: Remove dependency on jetty-server
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9844/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/20616
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cd2879ee88e0636186a27c3e16843a30ee9a738
Gerrit-Change-Number: 20616
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 19:42:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12516: Set HDFS limit based on memlock

2023-10-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20623 )

Change subject: IMPALA-12516: Set HDFS limit based on memlock
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/admin
File testdata/cluster/admin:

http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/admin@249
PS1, Line 249:   let DATANODE_LOCKED_MEM=$(ulimit -l)*1000
What are some typical values for ulimit -l?


http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/admin@251
PS1, Line 251:   # Provide 128MB of memory for HDFS caching.
 :   DATANODE_LOCKED_MEM=128000
128KB (dfs.datanode.max.locked.memory is measured in bytes)


http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
File testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl:

http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl@102
PS1, Line 102:   
Nit: Update comment



--
To view, visit http://gerrit.cloudera.org:8080/20623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
Gerrit-Change-Number: 20623
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 19:55:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12504: Split graphical query timeline script into es6 modules

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20588 )

Change subject: IMPALA-12504: Split graphical query timeline script into es6 
modules
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9d62a799d838876509686a75ecea778b2c72cc6
Gerrit-Change-Number: 20588
Gerrit-PatchSet: 3
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 20:07:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12504: Split graphical query timeline script into es6 modules

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20588 )

Change subject: IMPALA-12504: Split graphical query timeline script into es6 
modules
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9845/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/20588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9d62a799d838876509686a75ecea778b2c72cc6
Gerrit-Change-Number: 20588
Gerrit-PatchSet: 3
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 20:07:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12504: Split graphical query timeline script into es6 modules

2023-10-25 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20588 )

Change subject: IMPALA-12504: Split graphical query timeline script into es6 
modules
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9d62a799d838876509686a75ecea778b2c72cc6
Gerrit-Change-Number: 20588
Gerrit-PatchSet: 2
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 20:06:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables

2023-10-25 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20548 )

Change subject: IMPALA-12308: DIRECTED distribution mode for V2 Iceberg tables
..


Patch Set 4:

(3 comments)

Thanks for applying the changes. I quickly went through the changes since PS4 
and left some comments/answers.

http://gerrit.cloudera.org:8080/#/c/20548/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20548/4//COMMIT_MSG@26
PS4, Line 26: Hence,
: V2 read optimization is turned off for table sampling.
> I also considered just silently dropping the delete rows when I don't find
About mixing MoR and CoW:

Let's assume the user does MoR first, and creates a delete file that refers to 
two data files. Then the user does CoW and rewrite one of the data files. In 
that case I *assume* that the delete file remain in the new snapshot, but now 
only one of the associated data files will be part of the new snapshot.


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/krpc-data-stream-sender.h@113
PS4, Line 113: M
> it's not required to use a comma before an 'and' when you list things.
I now, but:
 * the Oxford comma makes the separation of list items clearer
 * "Base" also used the Oxford comma
 * Most importantly, here it would emphasize that the sentence in parentheses 
only refers to the last item in the list (i.e. only DIRECTED is used for 
sending delete rows)

But I see that it is a very subjective matter, so feel free to keep it as is if 
you like it better.


http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/20548/4/be/src/runtime/query-state.h@126
PS4, Line 126: std::pair, bool>
> Good point, but one thing that prevented me from introducing a struct is th
I see, but looking at the code this whole std::unordered_map, bool>> part might not be necessary.
You could replace it with a struct:

 struct FileSchedulingInfo {
   std::string filename;
   std::vector;
   bool is_relative;
 };

Because IIUC you never use by_node_filename_to_hosts to lookup the hosts by 
filename. You always iterate over the whole map, e.g. here:

  for (auto file_to_hosts : node_to_files_it->second) {
string full_file_path = file_to_hosts.first;
bool is_relative_path = file_to_hosts.second.second;
if (is_relative_path) {
  full_file_path = kudu::JoinPathSegments(
  hdfs_table_desc->hdfs_base_dir(), file_to_hosts.first);
}
DCHECK(!full_file_path.empty());

sink_config_->filename_to_hosts_[full_file_path] = 
file_to_hosts.second.first;
  }
}

So ByNodeFilenameToHosts could be just:

 typedef std::unordered_map> 
NodeToFileSchedulings;



--
To view, visit http://gerrit.cloudera.org:8080/20548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I212afd7c9e94551a1c50a40ccb0e3c1f7ecdf3d2
Gerrit-Change-Number: 20548
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 20:30:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12373: Small String Optimization for StringValue

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20496 )

Change subject: IMPALA-12373: Small String Optimization for StringValue
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9846/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/20496
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I741c3a5f12ab620b6b64b57d4c89b5f8e056efd3
Gerrit-Change-Number: 20496
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 20:35:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12516: Set HDFS limit based on memlock

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20623 )

Change subject: IMPALA-12516: Set HDFS limit based on memlock
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/admin
File testdata/cluster/admin:

http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/admin@249
PS1, Line 249:   let DATANODE_LOCKED_MEM=$(ulimit -l)*1000
> What are some typical values for ulimit -l?
Every OS I've checked has been 64 (for 64 KiB).


http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/admin@251
PS1, Line 251:   # Provide 128MB of memory for HDFS caching.
 :   DATANODE_LOCKED_MEM=128000
> 128KB (dfs.datanode.max.locked.memory is measured in bytes)
Ack


http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
File testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl:

http://gerrit.cloudera.org:8080/#/c/20623/1/testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl@102
PS1, Line 102:   
> Nit: Update comment
Done



--
To view, visit http://gerrit.cloudera.org:8080/20623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
Gerrit-Change-Number: 20623
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 20:41:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12515: Build modules for extra pythons

2023-10-25 Thread Michael Smith (Code Review)
Hello Quanlong Huang, Joe McDonnell, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20617

to look at the new patch set (#2).

Change subject: IMPALA-12515: Build modules for extra pythons
..

IMPALA-12515: Build modules for extra pythons

Adds IMPALA_EXTRA_PACKAGE_PYTHONS to build impala-shell tarball
dependencies for additional Python targets. That can be used to build a
tarball that supports multiple Python 3 minor versions at once.

Updates the impala-shell script to provide a clear error message when
attempting to use the tarball with a Python version that it hasn't been
built for.

Change-Id: I13720a9e3c50f348bef41f5e91f810204e416f13
---
R bin/cmake_aux/create_virtualenv.sh
M bin/impala-config.sh
M shell/CMakeLists.txt
M shell/impala-shell
M shell/make_shell_tarball.sh
5 files changed, 90 insertions(+), 81 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/20617/2
--
To view, visit http://gerrit.cloudera.org:8080/20617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13720a9e3c50f348bef41f5e91f810204e416f13
Gerrit-Change-Number: 20617
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12515: Build modules for extra pythons

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20617 )

Change subject: IMPALA-12515: Build modules for extra pythons
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20617/1/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/20617/1/bin/impala-config.sh@228
PS1, Line 228: # Additional Python versions to package for
> Can you expand this comment a bit? Talk about how this is about the shell t
Done


http://gerrit.cloudera.org:8080/#/c/20617/1/shell/CMakeLists.txt
File shell/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/20617/1/shell/CMakeLists.txt@47
PS1, Line 47:   if (${PYTHON_EXE} EQUAL $ENV{IMPALA_SYSTEM_PYTHON2})
: set(PYTHON2_VENV ${VENV})
: set(PYTHON2_PIP_CACHE ${PIP_CACHE})
:   elseif (${PYTHON_EXE} EQUAL $ENV{IMPALA_SYSTEM_PYTHON3})
: set(PYTHON3_VENV ${VENV})
: set(PYTHON3_PIP_CACHE ${PIP_CACHE})
:   endif()
> We also set PYTHON2_VENV etc down below next to the shell_pythonX_install t
Done


http://gerrit.cloudera.org:8080/#/c/20617/1/shell/impala-shell
File shell/impala-shell:

http://gerrit.cloudera.org:8080/#/c/20617/1/shell/impala-shell@51
PS1, Line 51:   echo "Python ${PYTHON_VERSION} is not supported, try 'pip 
install impala-shell'."
> I'm thinking about what exactly we want in the error message. We want to av
Done


http://gerrit.cloudera.org:8080/#/c/20617/1/shell/make_shell_tarball.sh
File shell/make_shell_tarball.sh:

http://gerrit.cloudera.org:8080/#/c/20617/1/shell/make_shell_tarball.sh@150
PS1, Line 150: --upgrade
> What does the "--upgrade" do? Do we need it?
It essentially suppresses a WARNING

  WARNING: Target directory 
/home/michael/Impala/shell/build/impala-shell-4.4.0-SNAPSHOT/ext-py2.7/thrift_sasl
 already exists. Specify --upgrade to force replacement.

when the list of python interpreters includes duplicates for the same minor 
version (such as python3 and python3.8).

I've added a comment.



--
To view, visit http://gerrit.cloudera.org:8080/20617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13720a9e3c50f348bef41f5e91f810204e416f13
Gerrit-Change-Number: 20617
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 20:37:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12516: Set HDFS limit based on memlock

2023-10-25 Thread Michael Smith (Code Review)
Hello David Rorke, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20623

to look at the new patch set (#2).

Change subject: IMPALA-12516: Set HDFS limit based on memlock
..

IMPALA-12516: Set HDFS limit based on memlock

With RHEL 8 on AWS Graviton instances,
dfs.datanode.max.locked.memory=64000 is insufficient to run
query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached.
Sets it based on 'ulimit -l' so test environments can increase this
limit as needed.

Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
---
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
2 files changed, 14 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/20623/2
--
To view, visit http://gerrit.cloudera.org:8080/20623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
Gerrit-Change-Number: 20623
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-12373: Small String Optimization for StringValue

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20496 )

Change subject: IMPALA-12373: Small String Optimization for StringValue
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20496/8/be/src/runtime/smallable-string.h
File be/src/runtime/smallable-string.h:

http://gerrit.cloudera.org:8080/#/c/20496/8/be/src/runtime/smallable-string.h@65
PS8, Line 65: // Invalid string values might have negative lengths. We also 
have backed tests
nit: backend tests


http://gerrit.cloudera.org:8080/#/c/20496/8/be/src/runtime/smallable-string.h@90
PS8, Line 90: // Invalid string values might have negative lengths. We also 
have backed tests
nit: backend tests



--
To view, visit http://gerrit.cloudera.org:8080/20496
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I741c3a5f12ab620b6b64b57d4c89b5f8e056efd3
Gerrit-Change-Number: 20496
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 25 Oct 2023 21:11:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3825: Delegate runtime filter aggregation to some executors

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20612 )

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
..


Patch Set 5:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/runtime/coordinator.cc@488
PS3, Line 488:   const RuntimeFilterAggregatorInfoPB& agg_info =
> This is the translation from RuntimeFilterAggregatorInfoPB (FragmentExecPar
Oh, I think part of my confusion is that "to report" sounds like a mapping, but 
it's actually an action. The comment helps.


http://gerrit.cloudera.org:8080/#/c/20612/5/be/src/runtime/runtime-filter-bank.h
File be/src/runtime/runtime-filter-bank.h:

http://gerrit.cloudera.org:8080/#/c/20612/5/be/src/runtime/runtime-filter-bank.h@243
PS5, Line 243: /// Return true if all filter updates has been received.
nit: updates have been


http://gerrit.cloudera.org:8080/#/c/20612/5/be/src/runtime/runtime-filter-bank.h@246
PS5, Line 246: /// Return number filter updates that has been received.
nit: updates that have been


http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/runtime/runtime-filter-bank.cc@281
PS3, Line 281: DCHECK(!pending_merge_filter->AlwaysFalse());
> That will require checking the underlying buffer if all the bits have becom
Ack


http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/runtime/runtime-filter-bank.cc@690
PS3, Line 690:   if (query_state_->query_options().runtime_filter_wait_time_ms 
> 0) {
> cancelled_ can turn from False to True once through CancelLocked(). Once it
Still seems like that should use an atomic then.


http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/service/data-stream-service.cc
File be/src/service/data-stream-service.cc:

http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/service/data-stream-service.cc@139
PS3, Line 139: // query_id has complete their execution.
> Done. Also added comment.
Based on the comment, it probably doesn't need to be a warning.



--
To view, visit http://gerrit.cloudera.org:8080/20612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
Gerrit-Change-Number: 20612
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 25 Oct 2023 21:08:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12390 (part 4): Enable unnecessary-value-param

2023-10-25 Thread Michael Smith (Code Review)
Hello Daniel Becker, gsi...@cloudera.com, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20494

to look at the new patch set (#8).

Change subject: IMPALA-12390 (part 4): Enable unnecessary-value-param
..

IMPALA-12390 (part 4): Enable unnecessary-value-param

Enables the clang-tidy performance-unnecessary-value-param check and
fixes any issues found with run_clang_tidy.sh.

Updates based on how values are used:
- constructors and functions that take ownership of a value accept by
  value and move into place
- others take a const& of the value
- const shared_ptr& should be rare, as it's more flexible to take a
  pointer to the value and deref the shared_ptr when calling

Some updates required cascading changes to other calls. I attempted to
dig down through how a value was used to determine whether other calls
should be updated to take by value or reference.

Change-Id: I8aa5d98596d82f615a0a728e0235e7dd9d8b5003
---
M .clang-tidy
M be/src/benchmarks/lock-benchmark.cc
M be/src/benchmarks/parquet-delta-benchmark.cc
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/catalog/catalog-service-client-wrapper.h
M be/src/codegen/codegen-anyval-read-write-info.cc
M be/src/codegen/codegen-anyval-read-write-info.h
M be/src/codegen/codegen-symbol-emitter.h
M be/src/codegen/llvm-codegen-cache-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/common/init.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/json/json-parser.h
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-column-chunk-reader.cc
M be/src/exec/parquet/parquet-common-test.cc
M be/src/exec/parquet/parquet-delta-coder-test.cc
M be/src/exec/parquet/parquet-page-reader.h
M be/src/exec/scanner-context.h
M be/src/exprs/expr-test.cc
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/TAcceptQueueServer.h
M be/src/rpc/authentication.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/rpc/thrift-util.cc
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/client-cache.cc
M be/src/runtime/client-cache.h
M be/src/runtime/exec-env.cc
M be/src/runtime/io/data-cache-trace-replayer.cc
M be/src/runtime/io/data-cache-trace-test.cc
M be/src/runtime/io/data-cache-trace.cc
M be/src/runtime/io/data-cache-trace.h
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/request-ranges.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/multi-precision.h
M be/src/runtime/outbound-row-batch.h
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/runtime/reservation-manager.cc
M be/src/runtime/reservation-manager.h
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/scheduling/admission-control-service.cc
M be/src/scheduling/admission-control-service.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/admissiond-env.cc
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/scheduling/executor-group.cc
M be/src/scheduling/hash-ring-test.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/service/client-request-state.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/statestore/statestore-service-client-wrapper.h
M be/src/statestore/statestore-subscriber-client-wrapper.h
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.h
M be/src/testutil/impalad-query-executor.h
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M be/src/transport/THttpTransport.cpp
M be/src/transport/TSaslClientTransport.cpp
M be/src/transport/TSaslServerTransport.cpp
M be/src/transport/TSaslServerTransport.h
M be/src/transport/TSaslTransport.cpp
M be/src/util/bit-packing-test.cc
M be/src/util/bloom-filter-test.cc
M be/src/util/cache/lirs-cache.cc
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/decimal-util.h
M be/src/util/disk-info.cc
M be/src/util/disk-info.h
M be/src/util/in-list-filter-test.cc
M be/src/util/in-list-filter.cc
M be/src/util/internal-queue.h
M

[Impala-ASF-CR] IMPALA-12515: Build modules for extra pythons

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20617 )

Change subject: IMPALA-12515: Build modules for extra pythons
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14252/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13720a9e3c50f348bef41f5e91f810204e416f13
Gerrit-Change-Number: 20617
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 21:25:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12516: Set HDFS limit based on memlock

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20623 )

Change subject: IMPALA-12516: Set HDFS limit based on memlock
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14253/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
Gerrit-Change-Number: 20623
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 21:31:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12390 (part 4): Enable unnecessary-value-param

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20494 )

Change subject: IMPALA-12390 (part 4): Enable unnecessary-value-param
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14254/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8aa5d98596d82f615a0a728e0235e7dd9d8b5003
Gerrit-Change-Number: 20494
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 21:46:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20538/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20538/6//COMMIT_MSG@48
PS6, Line 48: Unit tests produce JUnitXML for integration with jenkins jobs,
: these are stored in ${IMPALA_JS_TEST_LOGS_DIR}.
I tried to run Jenkins job with BE_TEST. >From console output, I could see JS 
Tests has been ran successfully. But the JS Tests result is not shown Test 
Result page. You need to extend the code for Jenkins job to show the JS Test 
result.


http://gerrit.cloudera.org:8080/#/c/20538/6/bin/run-all-tests.sh
File bin/run-all-tests.sh:

http://gerrit.cloudera.org:8080/#/c/20538/6/bin/run-all-tests.sh@64
PS6, Line 64: # Run JS tests
: : ${JS_TEST:=${BE_TEST}}
It's fine to set it BE_TEST now. I would like to have a separate setting for JS 
Tests in future.


http://gerrit.cloudera.org:8080/#/c/20538/6/tests/run-js-tests.sh
File tests/run-js-tests.sh:

http://gerrit.cloudera.org:8080/#/c/20538/6/tests/run-js-tests.sh@40
PS6, Line 40:   curl 
https://nodejs.org/dist/${NODEJS_VERSION}/node-${NODEJS_VERSION}-${NODEJS_DISTRO}.tar.xz
 -O
> line too long (98 > 90)
You can break this down using local variable so the line can be shorter.


http://gerrit.cloudera.org:8080/#/c/20538/6/tests/run-js-tests.sh@49
PS6, Line 49:
nit: two more spaces



--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 6
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 22:11:51 +
Gerrit-HasComments: Yes


[native-toolchain-CR] Add secondary S3 upload location for toolchain packages

2023-10-25 Thread Laszlo Gaal (Code Review)
Hello Joe McDonnell, Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20549

to look at the new patch set (#3).

Change subject: Add secondary S3 upload location for toolchain packages
..

Add secondary S3 upload location for toolchain packages

All Apache Impala precommit activities happen in AWS EC2, in their
us-west-2 region, while the native-toolchain bucket hosting the
binary toolchain packages is located in us-west-1 for historical
reasons. This adds a constant cost to every single build launched on the
upstream infrastucture.

This patch adds a secondary upload location to the package upload logic
to allow a second, "mirroring" bucket to be configured, which is
expected to be closer to the build location network-wise. The mirror
bucket is currently expected to be located in EC2's us-west-2 region,
but this is not specified in the upload code.

The secondary upload location is expected in a new environment variable,
S3_MIRROR_BUCKET.

Note: One important feature of the canonical toolchain bucket,
s3://native-toolchain, is to store the source packages for the toolchain
components. This function is not duplicated in the mirror buckets, so
toolchain builds will still download their source packages from
s3://native-toolchain, regardless of any upload location (primary or
mirror) specified.

The patch fixes a random typo as well.

Tested by running a full toolchain build, uploading the binary packages
to two temporary test buckets in S3, then comparing the result across
the two buckets. When this passed, a core-mode test pass was executed on
the Impala upstream master branch.

Change-Id: I3aabaaf699dde865efe11e600f4e78aeca550378
---
M functions.sh
M in-docker.py
2 files changed, 19 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/49/20549/3
--
To view, visit http://gerrit.cloudera.org:8080/20549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3aabaaf699dde865efe11e600f4e78aeca550378
Gerrit-Change-Number: 20549
Gerrit-PatchSet: 3
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] Add secondary S3 upload location for toolchain packages

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20549 )

Change subject: Add secondary S3 upload location for toolchain packages
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20549/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20549/3//COMMIT_MSG@13
PS3, Line 13: upstream infrastucture.
nit: The prior spelling was correct.



--
To view, visit http://gerrit.cloudera.org:8080/20549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aabaaf699dde865efe11e600f4e78aeca550378
Gerrit-Change-Number: 20549
Gerrit-PatchSet: 3
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 22:17:38 +
Gerrit-HasComments: Yes


[native-toolchain-CR] Add secondary S3 upload location for toolchain packages

2023-10-25 Thread Laszlo Gaal (Code Review)
Hello Joe McDonnell, Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20549

to look at the new patch set (#4).

Change subject: Add secondary S3 upload location for toolchain packages
..

Add secondary S3 upload location for toolchain packages

All Apache Impala precommit activities happen in AWS EC2, in their
us-west-2 region, while the native-toolchain bucket hosting the
binary toolchain packages is located in us-west-1 for historical
reasons. This adds a constant cost to every single build launched on the
upstream infrastucture.

This patch adds a secondary upload location to the package upload logic
to allow a second, "mirroring" bucket to be configured, which is
expected to be closer to the build location network-wise. The mirror
bucket is currently expected to be located in EC2's us-west-2 region,
but this is not specified in the upload code.

The secondary upload location is expected in a new environment variable,
S3_MIRROR_BUCKET.

Note: One important feature of the canonical toolchain bucket,
s3://native-toolchain, is to store the source packages for the toolchain
components. This function is not duplicated in the mirror buckets, so
toolchain builds will still download their source packages from
s3://native-toolchain, regardless of any upload location (primary or
mirror) specified.

The patch fixes a random typo as well.

Tested by running a full toolchain build, uploading the binary packages
to two temporary test buckets in S3, then comparing the result across
the two buckets. When this passed, a core-mode test pass was executed on
the Impala upstream master branch.

Change-Id: I3aabaaf699dde865efe11e600f4e78aeca550378
---
M functions.sh
M in-docker.py
2 files changed, 19 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/49/20549/4
--
To view, visit http://gerrit.cloudera.org:8080/20549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3aabaaf699dde865efe11e600f4e78aeca550378
Gerrit-Change-Number: 20549
Gerrit-PatchSet: 4
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] Add secondary S3 upload location for toolchain packages

2023-10-25 Thread Laszlo Gaal (Code Review)
Hello Joe McDonnell, Michael Smith,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20549

to look at the new patch set (#5).

Change subject: Add secondary S3 upload location for toolchain packages
..

Add secondary S3 upload location for toolchain packages

All Apache Impala precommit activities happen in AWS EC2, in their
us-west-2 region, while the native-toolchain bucket hosting the
binary toolchain packages is located in us-west-1 for historical
reasons. This adds a constant cost to every single build launched on the
upstream infrastructure.

This patch adds a secondary upload location to the package upload logic
to allow a second, "mirroring" bucket to be configured, which is
expected to be closer to the build location network-wise. The mirror
bucket is currently expected to be located in EC2's us-west-2 region,
but this is not specified in the upload code.

The secondary upload location is expected in a new environment variable,
S3_MIRROR_BUCKET.

Note: One important feature of the canonical toolchain bucket,
s3://native-toolchain, is to store the source packages for the toolchain
components. This function is not duplicated in the mirror buckets, so
toolchain builds will still download their source packages from
s3://native-toolchain, regardless of any upload location (primary or
mirror) specified.

The patch fixes a random typo as well.

Tested by running a full toolchain build, uploading the binary packages
to two temporary test buckets in S3, then comparing the result across
the two buckets. When this passed, a core-mode test pass was executed on
the Impala upstream master branch.

Change-Id: I3aabaaf699dde865efe11e600f4e78aeca550378
---
M functions.sh
M in-docker.py
2 files changed, 19 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/49/20549/5
--
To view, visit http://gerrit.cloudera.org:8080/20549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3aabaaf699dde865efe11e600f4e78aeca550378
Gerrit-Change-Number: 20549
Gerrit-PatchSet: 5
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] Add secondary S3 upload location for toolchain packages

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20549 )

Change subject: Add secondary S3 upload location for toolchain packages
..


Patch Set 5: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aabaaf699dde865efe11e600f4e78aeca550378
Gerrit-Change-Number: 20549
Gerrit-PatchSet: 5
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Wed, 25 Oct 2023 22:37:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12415: Implement tests for graphical query timeline in webUI

2023-10-25 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20538 )

Change subject: IMPALA-12415: Implement tests for graphical query timeline in 
webUI
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20538/6/tests/run-js-tests.sh
File tests/run-js-tests.sh:

http://gerrit.cloudera.org:8080/#/c/20538/6/tests/run-js-tests.sh@33
PS6, Line 33: IMPALA_JS_TEST_LOGS_DIR
js-tests.xml was generated in this folder when ran this script. But I did not 
see any log file generated in this folder. Is there any log messages generated 
by JS Test which should be re-directed to this folder?



--
To view, visit http://gerrit.cloudera.org:8080/20538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0caf0a0beee23821f78c0b3fe1aeb7dbf92d6a3e
Gerrit-Change-Number: 20538
Gerrit-PatchSet: 6
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 22:50:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9118: Show catalog operation details in catalogd webUI

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20428 )

Change subject: IMPALA-9118: Show catalog operation details in catalogd webUI
..

IMPALA-9118: Show catalog operation details in catalogd webUI

This patch extends the /operations page in catalogd WebUI to show the
in-flight and finished catalog operations. The following fields are
shown for each operation:
 - Thread ID
 - Query ID
 - Client IP
 - Coordinator
 - Type
 - Target
 - User
 - Start Time
 - End Time (only shown for finished operations)
 - Duration
 - Status
 - Details

Finished operation records are currently kept in memory and limited by
the catalog_operation_log_size flag (defaults to 100).

To collect the above fields, this patch extends
TCatalogServiceRequestHeader to contain the coordinator hostname. Also
fix some catalog RPCs that didn't fill these fields.

Tests:
 - Add e2e test in custom_cluster/test_web_pages.py
 - Manually verify the web pages when running a GVO job

Change-Id: I3cf3f0da2be2be79e546762a8083d4de338ff6aa
Reviewed-on: http://gerrit.cloudera.org:8080/20428
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/service/client-request-state.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M 
fe/src/main/java/org/apache/impala/catalog/monitor/CatalogFinalizeDmlCounter.java
M fe/src/main/java/org/apache/impala/catalog/monitor/CatalogMonitor.java
M 
fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationCounter.java
D 
fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationMetrics.java
A 
fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java
M 
fe/src/main/java/org/apache/impala/catalog/monitor/CatalogResetMetadataCounter.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M tests/custom_cluster/test_web_pages.py
M www/catalog_operations.tmpl
23 files changed, 635 insertions(+), 199 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/20428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3cf3f0da2be2be79e546762a8083d4de338ff6aa
Gerrit-Change-Number: 20428
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-9118: Show catalog operation details in catalogd webUI

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20428 )

Change subject: IMPALA-9118: Show catalog operation details in catalogd webUI
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf3f0da2be2be79e546762a8083d4de338ff6aa
Gerrit-Change-Number: 20428
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 25 Oct 2023 23:14:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12517: Decode binary data with Python 3

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20624 )

Change subject: IMPALA-12517: Decode binary data with Python 3
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20624/1/shell/value_converter.py
File shell/value_converter.py:

http://gerrit.cloudera.org:8080/#/c/20624/1/shell/value_converter.py@42
PS1, Line 42: e
flake8: E131 continuation line unaligned for hanging indent



--
To view, visit http://gerrit.cloudera.org:8080/20624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9222cd1ac081a38ab2b37d58628faac0812695ec
Gerrit-Change-Number: 20624
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 25 Oct 2023 23:45:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12517: Decode binary data with Python 3

2023-10-25 Thread Michael Smith (Code Review)
Michael Smith has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20624


Change subject: IMPALA-12517: Decode binary data with Python 3
..

IMPALA-12517: Decode binary data with Python 3

When impala-shell receives binary data with the HS2 protocol, it uses a
stringifier to decode it. In Python 3, 'str' on binary data wraps it in
"b'...'"; to get equivalent output to 'str' in Python 2, we need to
decode as UTF-8 and handle errors.

Adds a test case for how impala-shell formats binary data.

Change-Id: I9222cd1ac081a38ab2b37d58628faac0812695ec
---
M shell/value_converter.py
M tests/shell/test_shell_commandline.py
2 files changed, 14 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/20624/1
--
To view, visit http://gerrit.cloudera.org:8080/20624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9222cd1ac081a38ab2b37d58628faac0812695ec
Gerrit-Change-Number: 20624
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 


[Impala-ASF-CR] IMPALA-12517: Decode binary data with Python 3

2023-10-25 Thread Michael Smith (Code Review)
Hello Quanlong Huang, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20624

to look at the new patch set (#2).

Change subject: IMPALA-12517: Decode binary data with Python 3
..

IMPALA-12517: Decode binary data with Python 3

When impala-shell receives binary data with the HS2 protocol, it uses a
stringifier to decode it. In Python 3, 'str' on binary data wraps it in
"b'...'"; to get equivalent output to 'str' in Python 2, we need to
decode as UTF-8 and handle errors.

Adds a test case for how impala-shell formats binary data.

Change-Id: I9222cd1ac081a38ab2b37d58628faac0812695ec
---
M shell/value_converter.py
M tests/shell/test_shell_commandline.py
2 files changed, 18 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/20624/2
--
To view, visit http://gerrit.cloudera.org:8080/20624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9222cd1ac081a38ab2b37d58628faac0812695ec
Gerrit-Change-Number: 20624
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12514: Remove dependency on jetty-server

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20616 )

Change subject: IMPALA-12514: Remove dependency on jetty-server
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9844/


--
To view, visit http://gerrit.cloudera.org:8080/20616
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cd2879ee88e0636186a27c3e16843a30ee9a738
Gerrit-Change-Number: 20616
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 26 Oct 2023 00:08:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12517: Decode binary data with Python 3

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20624 )

Change subject: IMPALA-12517: Decode binary data with Python 3
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14255/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9222cd1ac081a38ab2b37d58628faac0812695ec
Gerrit-Change-Number: 20624
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 26 Oct 2023 00:12:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12517: Decode binary data with Python 3

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20624 )

Change subject: IMPALA-12517: Decode binary data with Python 3
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14256/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9222cd1ac081a38ab2b37d58628faac0812695ec
Gerrit-Change-Number: 20624
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 26 Oct 2023 00:19:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12504: Split graphical query timeline script into es6 modules

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20588 )

Change subject: IMPALA-12504: Split graphical query timeline script into es6 
modules
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/20588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9d62a799d838876509686a75ecea778b2c72cc6
Gerrit-Change-Number: 20588
Gerrit-PatchSet: 3
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 26 Oct 2023 00:28:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12504: Split graphical query timeline script into es6 modules

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20588 )

Change subject: IMPALA-12504: Split graphical query timeline script into es6 
modules
..

IMPALA-12504: Split graphical query timeline script into es6 modules

This patch divides the query timeline script into multiple modules
for better maintainability and to help writing unit tests.

The patch also improves parsing of utilization values from the
query profile, as the order and name of counters may change within
the profile, to ensure appropriate mappings between plotted readings,
initially the profile is traversed to find the correct mapping.

Change-Id: Id9d62a799d838876509686a75ecea778b2c72cc6
Reviewed-on: http://gerrit.cloudera.org:8080/20588
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M www/query_timeline.tmpl
A www/scripts/query_timeline/chart_commons.js
A www/scripts/query_timeline/fragment_diagram.js
A www/scripts/query_timeline/global_members.js
A www/scripts/query_timeline/host_utilization_diagram.js
M www/scripts/util.js
6 files changed, 1,042 insertions(+), 730 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/20588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id9d62a799d838876509686a75ecea778b2c72cc6
Gerrit-Change-Number: 20588
Gerrit-PatchSet: 4
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Surya Hebbar 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12373: Small String Optimization for StringValue

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20496 )

Change subject: IMPALA-12373: Small String Optimization for StringValue
..


Patch Set 8:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9846/


--
To view, visit http://gerrit.cloudera.org:8080/20496
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I741c3a5f12ab620b6b64b57d4c89b5f8e056efd3
Gerrit-Change-Number: 20496
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 26 Oct 2023 01:15:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3825: Delegate runtime filter aggregation to some executors

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20612 )

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20612/6/tests/common/test_vector.py
File tests/common/test_vector.py:

http://gerrit.cloudera.org:8080/#/c/20612/6/tests/common/test_vector.py@179
PS6, Line 179: s
flake8: E501 line too long (92 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/20612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
Gerrit-Change-Number: 20612
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Oct 2023 01:16:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3825: Delegate runtime filter aggregation to some executors

2023-10-25 Thread Riza Suminto (Code Review)
Hello Kurt Deschler, Abhishek Rawat, Csaba Ringhofer, Michael Smith, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/20612

to look at the new patch set (#6).

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
..

IMPALA-3825: Delegate runtime filter aggregation to some executors

IMPALA-4400 improve the runtime filter by aggregating runtime filters
locally before sending filter update to the coordinator and sharing a
single RuntimeFilterBank for all fragment instances in a query. However,
local filter aggregation is still insufficient if the number of nodes in
an impala cluster is large. For example, in a cluster of around 700
impalad backends, aggregation of 1 MB bloom filter updates in the
coordinator can exceed more than 1 second.

This patch aims to reduce coordinator load and speed up runtime filter
aggregation by doing intermediate aggregation in a few designated impala
backends before doing final aggregation and publishing in the
coordinator. Query option MAX_NUM_FILTER_AGGREGATOR is added to control
this feature. Given N as the number of backend executors excluding the
coordinator, the selected number of intermediate aggregators M =
min(MAX_NUM_FILTER_AGGREGATOR, N / 2). Setting MAX_NUM_FILTER_AGGREGATOR
<= 0 will disable the intermediate aggregator feature. In the backend
scheduler, M impalad will be selected randomly as the intermediate
aggregator for that runtime filter. Information of this M selected
impalad then passed from scheduler to coordinator as a
RuntimeFilterAggregatorInfoPB. The coordinator then converts the
RuntimeFilterAggregatorInfoPB into a filter routing information
TRuntimeFilterAggDesc that is piggy-backed in TRuntimeFilterSource.

A new RPC endpoint named UpdateFilterFromRemote is added in
data_stream_service.proto to handle filter updates from fellow impalad
executor to the designated aggregator impalad. This RPC will merge
filter updates into 'pending_remote_filter'. The intermediate aggregator
will then combine 'pending_remote_filter' with
'pending_merge_filter' (from local aggregation) into 'result_filter'
which then send to the coordinator. Note that due to the random
selection of M impalad, an additional memory reservation for
'pending_remote_filter' is added for all impalad, even when only M
impalad is doing the intermediate aggregation per runtime filter.

This patch currently targets the bloom filter produced by partitioned
join build only. Another kind of runtime filter is still efficient to
aggregate in coordinator only, while the bloom filter from broadcast
join only requires 1 valid filter update for publishing.

test_runtime_filters.py is modified to clarify the exec_options
dimension, test matrix constraints, and reduce pytest.skip() calls on
each test. runtime_filters.test is also changed to use counter
aggregation and assert on ExecSummary table so that they stay valid
irrespective of the number of fragment instances.

We benchmark the aggregation speed of 1 MB runtime filter aggregation on
20 executor nodes cluster with MT_DOP=36 that is instrumented to disable
local aggregation, simulating 720 runtime filter updates. The speed is
approximated as the duration between the earliest time a filter update
is made and the time that the coordinator publishes the complete filter.
The result is following:

+---++
| MAX_NUM_FILTER_AGGREGATOR | Aggregation speed (ms) |
+---++
| 0 |   1296 |
| 1 |   1229 |
| 2 |608 |
| 4 |329 |
| 8 |205 |
+---++

Testing:
- Exercise MAX_NUM_FILTER_AGGREGATOR in test_runtime_filters.py and
  query-options-test.cc
- Add custom_cluster/test_runtime_filter_aggregation.py.
- Pass exhaustive end-to-end and custom-cluster tests.

Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
---
M be/src/common/logging.h
M be/src/runtime/coordinator.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-filter.cc
M be/src/runtime/runtime-filter.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
M be/src/util/runtime-profile-counters.h
M common/protobuf/admission_control_service.proto
M common/protobuf/data_stream_service.proto
M common/th

[Impala-ASF-CR] IMPALA-3825: Delegate runtime filter aggregation to some executors

2023-10-25 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20612 )

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/20612/3/be/src/runtime/runtime-filter-bank.cc@690
PS3, Line 690: wait_time_ms = 
query_state_->query_options().runtime_filter_wait_time_ms;
> Still seems like that should use an atomic then.
Changed to hold lock before checking cancelled_.
This should be safe, because cancelled_ can only change if all locks are hold.


http://gerrit.cloudera.org:8080/#/c/20612/5/tests/common/test_dimensions.py
File tests/common/test_dimensions.py:

http://gerrit.cloudera.org:8080/#/c/20612/5/tests/common/test_dimensions.py@261
PS5, Line 261:   combinations = product(*(exec_option_dimensions[name] for name 
in keys))
 :   exec_option_dimension_values = [dict(zip(keys, prod)) for prod 
in combinations]
 :
> This is probably the reason why extra exec options are always declared with
Added ImpalaTestMatrix.add_exec_option_dimension() in patch set 5 for easier 
extra exec option declaration without losing pairwise combination.



--
To view, visit http://gerrit.cloudera.org:8080/20612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
Gerrit-Change-Number: 20612
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Oct 2023 01:22:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3825: Delegate runtime filter aggregation to some executors

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20612 )

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14257/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
Gerrit-Change-Number: 20612
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Oct 2023 01:43:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3825: Delegate runtime filter aggregation to some executors

2023-10-25 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20612 )

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20612/6/tests/common/test_vector.py
File tests/common/test_vector.py:

http://gerrit.cloudera.org:8080/#/c/20612/6/tests/common/test_vector.py@179
PS6, Line 179: s
> flake8: E501 line too long (92 > 90 characters)
These changes in test_vector.py is getting complex. It is probably worth to put 
it as a separate patch to make backport easier.



--
To view, visit http://gerrit.cloudera.org:8080/20612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
Gerrit-Change-Number: 20612
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Oct 2023 01:54:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12398: Fix Ranger role not exists when altering db/table/view owner to a role

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20508 )

Change subject: IMPALA-12398: Fix Ranger role not exists when altering 
db/table/view owner to a role
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9847/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/20508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b029bdb90111dbd0eab5189360cc81090225cda
Gerrit-Change-Number: 20508
Gerrit-PatchSet: 4
Gerrit-Owner: ji chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: ji chen 
Gerrit-Comment-Date: Thu, 26 Oct 2023 03:53:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12518: Combine all exec option dimension in test vector.py

2023-10-25 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20625


Change subject: IMPALA-12518: Combine all exec_option dimension in 
test_vector.py
..

IMPALA-12518: Combine all exec_option dimension in test_vector.py

Before this patch, when writing pytest that exercise custom query option
values, we need to declare it by making new test dimension, followed by
deepcopying the original vector, and inserting the selected dimension
value into 'exec_option' dictionary in generated vector.

This patch simplify this steps by accounting dimensions that is intended
to be part of 'exec_option' and automatically combining them during
vector generation in test_vector.py. Such dimension should be registered
via the new ImpalaTestMatrix.add_exec_option_dimension() function.

function add_exec_option_dimension() in test_dimensions.py is renamed to
add_mandatory_exec_option() to make it consistent with the same
functionality in ImpalaTestMatrix and avoid confusion with the new
ImpalaTestMatrix.add_exec_option_dimension() function. Function name
add_exec_option_dimension() in test_dimensions.py is then repurposed as
a shorthand for ImpalaTestMatrix.add_exec_option_dimension().

The remaining changes for other pytest files will be done gradually.

Testing:
- Fix bug in TestIcebergV2Table and confirm that both True and False
  value for 'disable_optimized_iceberg_v2_read' options are exercised.
- Run and pass all modified tests in this patch.

Change-Id: I3adba260990fccf4d2f2e7c8c4e4fadc6fd43fe1
---
M tests/common/test_dimensions.py
M tests/common/test_vector.py
M tests/custom_cluster/test_kudu.py
M tests/query_test/test_async_codegen.py
M tests/query_test/test_iceberg.py
M tests/query_test/test_kudu.py
M tests/query_test/test_runtime_filters.py
7 files changed, 86 insertions(+), 31 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/20625/1
--
To view, visit http://gerrit.cloudera.org:8080/20625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3adba260990fccf4d2f2e7c8c4e4fadc6fd43fe1
Gerrit-Change-Number: 20625
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-3825: Delegate runtime filter aggregation to some executors

2023-10-25 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20612 )

Change subject: IMPALA-3825: Delegate runtime filter aggregation to some 
executors
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20612/6/tests/common/test_vector.py
File tests/common/test_vector.py:

http://gerrit.cloudera.org:8080/#/c/20612/6/tests/common/test_vector.py@179
PS6, Line 179: s
> These changes in test_vector.py is getting complex. It is probably worth to
Filed a separate patch for this at https://gerrit.cloudera.org/c/20625/



--
To view, visit http://gerrit.cloudera.org:8080/20612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d38ed0f223d6e5b32a19ebe725af7738ee4ab0
Gerrit-Change-Number: 20612
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 26 Oct 2023 05:17:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12518: Combine all exec option dimension in test vector.py

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20625 )

Change subject: IMPALA-12518: Combine all exec_option dimension in 
test_vector.py
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14258/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3adba260990fccf4d2f2e7c8c4e4fadc6fd43fe1
Gerrit-Change-Number: 20625
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 26 Oct 2023 05:40:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12509: Optimize the backend startup and planner time of large Iceberg table query

2023-10-25 Thread Fu Lili (Code Review)
Fu Lili has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/20601 )

Change subject: IMPALA-12509: Optimize the backend startup and planner time of 
large Iceberg table query
..

IMPALA-12509: Optimize the backend startup and planner time of large Iceberg 
table query

Remove some objects from TTableDescriptor that will not be used by
backend for iceberg table, which can significantly improve the speed
of Query Plan and backend start for query in large Iceberg tables.

Change-Id: I46a68f21903c16db3b02681feddb61ddce57879a
---
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTimeTravelTable.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
7 files changed, 42 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/20601/6
--
To view, visit http://gerrit.cloudera.org:8080/20601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I46a68f21903c16db3b02681feddb61ddce57879a
Gerrit-Change-Number: 20601
Gerrit-PatchSet: 6
Gerrit-Owner: Fu Lili 
Gerrit-Reviewer: Fu Lili 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12509: Optimize the backend startup and planner time of large Iceberg table query

2023-10-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20601 )

Change subject: IMPALA-12509: Optimize the backend startup and planner time of 
large Iceberg table query
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/14259/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20601
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46a68f21903c16db3b02681feddb61ddce57879a
Gerrit-Change-Number: 20601
Gerrit-PatchSet: 6
Gerrit-Owner: Fu Lili 
Gerrit-Reviewer: Fu Lili 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 26 Oct 2023 06:42:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7131: Support external data sources in LocalCatalog mode

2023-10-25 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20574 )

Change subject: IMPALA-7131: Support external data sources in LocalCatalog mode
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40841c9be9064ac67771c4d3f5acbb3b552a2e55
Gerrit-Change-Number: 20574
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 26 Oct 2023 06:54:30 +
Gerrit-HasComments: No