[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10669/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 06:56:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21377

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 06:56:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21455 )


Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..

IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

ExprValuesCache uses BATCH_SIZE as a deciding factor to set its
capacity. It bounds the capacity such that expr_values_array_ memory
usage stays below 256KB. This patch tightens that limit to include all
memory usage from ExprValuesCache::MemUsage() instead of
expr_values_array_ only. Therefore, setting a very high BATCH_SIZE will
not push the total memory usage of ExprValuesCache beyond 256KB.
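
As a rough illustration of the capping described above, the sketch below
derives a capacity from BATCH_SIZE and a per-row byte count. It is written in
Python rather than the C++ of be/src/exec/hash-table.cc, and the names
capped_capacity and bytes_per_row are hypothetical; the real accounting is
whatever ExprValuesCache::MemUsage() computes.

```python
MAX_CACHE_BYTES = 256 * 1024  # the 256KB cap named in the commit message


def capped_capacity(batch_size, bytes_per_row):
    """Largest capacity <= batch_size whose total footprint fits in the cap.

    bytes_per_row is a hypothetical stand-in for everything the cache stores
    per row (expression values plus any per-row bookkeeping).
    """
    if batch_size <= 0 or bytes_per_row <= 0:
        raise ValueError("batch_size and bytes_per_row must be positive")
    fit = MAX_CACHE_BYTES // bytes_per_row
    # Assumption: keep at least one row even if a single row exceeds the cap.
    return max(1, min(batch_size, fit))


print(capped_capacity(1024, 64))   # small batch: capacity follows BATCH_SIZE
print(capped_capacity(65536, 64))  # huge batch: capacity is bounded by the cap
```

The point of counting every per-row byte, not just the values array, is that
raising BATCH_SIZE then only shrinks the rows-per-byte ratio and never the cap.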

Testing:
- Add test_join_queries.py::TestExprValueCache.
- Pass core tests.

Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
A testdata/workloads/tpcds_partitioned/queries
M tests/common/test_dimensions.py
M tests/query_test/test_join_queries.py
5 files changed, 45 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/21455/1
--
To view, visit http://gerrit.cloudera.org:8080/21455

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 8: Code-Review+2

> Patch Set 7: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10667/

I forgot to update the planner tests after making a change in patch set 6.
Patch set 8 updates those planner tests. There is no plan shape change, only a
slight reduction in cardinality and cost from dividing over num rows.
Carry +2.


--
To view, visit http://gerrit.cloudera.org:8080/21377

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 06:47:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21455 )

Change subject: IMPALA-13075: Cap memory usage for ExprValuesCache at 256KB
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21455/1/tests/query_test/test_join_queries.py
File tests/query_test/test_join_queries.py:

http://gerrit.cloudera.org:8080/#/c/21455/1/tests/query_test/test_join_queries.py@27
PS1, Line 27: from tests.common.test_dimensions import (
flake8: F401 'tests.common.test_dimensions.add_exec_option_dimension' imported 
but unused



--
To view, visit http://gerrit.cloudera.org:8080/21455

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee27cbbe8d3100301d05a6516b62c45975a8d0e0
Gerrit-Change-Number: 21455
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 May 2024 06:50:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Riza Suminto (Code Review)
Hello Aman Sinha, Kurt Deschler, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21377

to look at the new patch set (#8).

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..

IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

The Impala frontend cannot evaluate a BETWEEN/NOT BETWEEN predicate directly.
It needs to transform a BetweenPredicate into a CompoundPredicate
consisting of upper bound and lower bound BinaryPredicates through
BetweenToCompoundRule.java. The BinaryPredicates can then be pushed down
or rewritten into another form by another expression rewrite rule.
However, the selectivity of the BetweenPredicate or its derivatives remains
unassigned and often collapses with other unknown-selectivity predicates
into a collective selectivity equal to Expr.DEFAULT_SELECTIVITY (0.1).

This patch adds a narrow optimization of BetweenPredicate selectivity
when the following criteria are met:

1. The BetweenPredicate is bound to a slot reference of a single column
   of a table.
2. The column type is discrete, such as INTEGER or DATE.
3. The column stats are available.
4. The column is sufficiently unique based on available stats.
5. The BETWEEN/NOT BETWEEN predicate is in good form (lower bound value
   <= upper bound value).
6. The final calculated selectivity is less than or equal to
   Expr.DEFAULT_SELECTIVITY.

If these criteria are unmet, the Planner will revert to the old
behavior, which leaves the selectivity unassigned.

Since this patch only targets a BetweenPredicate over a unique column, the
following query will still have the default scan selectivity (0.1):

select count(*) from tpch.customer c
where c.c_custkey >= 1234 and c.c_custkey <= 2345;

While this equivalent query written with a BETWEEN predicate will have a
lower scan selectivity:

select count(*) from tpch.customer c
where c.c_custkey between 1234 and 2345;

This patch calculates the BetweenPredicate selectivity during the
transformation in BetweenToCompoundRule.java. The selectivity is
piggy-backed onto the resulting CompoundPredicate and BinaryPredicate as
a betweenSelectivity_ field, separate from the selectivity_ field.
Analyzer.getBoundPredicates() is modified to prioritize the derived
BinaryPredicate over an ordinary BinaryPredicate in its return value to
prevent the derived BinaryPredicate from being eliminated by a matching
ordinary BinaryPredicate.
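
To make the estimate concrete, here is a small Python sketch of how such a
selectivity could be computed for the c_custkey query above. The formula
(number of integers in the range divided by the row count) is an assumption
that is merely consistent with the criteria listed earlier; the actual Java
code in BetweenToCompoundRule.java is not shown in this message. The function
name is hypothetical, and the 150,000 row count for tpch.customer is an
assumed figure used only for illustration.

```python
DEFAULT_SELECTIVITY = 0.1  # Expr.DEFAULT_SELECTIVITY per the commit message


def between_selectivity(lower, upper, num_rows):
    """Return an estimated selectivity, or None to leave it unassigned."""
    if num_rows <= 0 or lower > upper:
        return None  # missing stats or malformed bounds (criteria 3 and 5)
    # On a discrete, unique column each integer in [lower, upper]
    # matches at most one row.
    matching = upper - lower + 1
    selectivity = matching / num_rows
    if selectivity > DEFAULT_SELECTIVITY:
        return None  # criterion 6: only assign when at most the default
    return selectivity


# c_custkey between 1234 and 2345, assuming 150,000 customer rows
print(between_selectivity(1234, 2345, 150_000))
```

Returning None stands in for the "unassigned" state, so a caller would fall
back to the default selectivity exactly as before the patch.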

Testing:
- Add table functional_parquet.unique_with_nulls.
- Add FE tests in ExprCardinalityTest#testBetweenSelectivity,
  ExprCardinalityTest#testNotBetweenSelectivity, and
  PlannerTest#testScanCardinality.
- Pass core tests.

Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q21.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q94.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test
27 files changed, 3,908 insertions(+), 3,502 deletions(-)


  git pu

[Impala-ASF-CR] IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21450 )

Change subject: IMPALA-13105: Fix multiple imported query profiles fail to 
import/clear at once
..

IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once

On importing multiple query profiles, insertion of the last query in the
queue fails as no delay is provided for the insertion.

This has been fixed by providing a delay after inserting the final query.

On clearing all the imported queries, in some instances the page reloads
before the IndexedDB object store has been cleared.

This has been fixed by triggering the page reload after clearing
the object store succeeds.

Change-Id: I42470fecd0cff6e193f080102575e51d86a2d562
Reviewed-on: http://gerrit.cloudera.org:8080/21450
Reviewed-by: Wenzhe Zhou 
Reviewed-by: Riza Suminto 
Tested-by: Impala Public Jenkins 
---
M www/queries.tmpl
1 file changed, 4 insertions(+), 3 deletions(-)

Approvals:
  Wenzhe Zhou: Looks good to me, but someone else must approve
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21450

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I42470fecd0cff6e193f080102575e51d86a2d562
Gerrit-Change-Number: 21450
Gerrit-PatchSet: 2
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10667/


--
To view, visit http://gerrit.cloudera.org:8080/21377

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 05:26:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21450 )

Change subject: IMPALA-13105: Fix multiple imported query profiles fail to 
import/clear at once
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21450

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42470fecd0cff6e193f080102575e51d86a2d562
Gerrit-Change-Number: 21450
Gerrit-PatchSet: 1
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 May 2024 05:23:16 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) Add DOAP file

2024-05-23 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21449 )

Change subject: Add DOAP file
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21449/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21449/2//COMMIT_MSG@10
PS2, Line 10: informatio
> nit: typo
Done



--
To view, visit http://gerrit.cloudera.org:8080/21449

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1a47c68345769281449dd377a6f82a7257cee07
Gerrit-Change-Number: 21449
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 24 May 2024 04:22:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) Add DOAP file

2024-05-23 Thread Quanlong Huang (Code Review)
Hello Laszlo Gaal, Michael Smith, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21449

to look at the new patch set (#3).

Change subject: Add DOAP file
..

Add DOAP file

Adds the DOAP file so Impala can be listed in Apache projects with more
information, e.g. https://projects.apache.org/projects.html?language#C++

Validated at https://www.w3.org/RDF/Validator/

Change-Id: Ib1a47c68345769281449dd377a6f82a7257cee07
---
A doap_Impala.rdf
1 file changed, 149 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/21449/3
--
To view, visit http://gerrit.cloudera.org:8080/21449

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib1a47c68345769281449dd377a6f82a7257cee07
Gerrit-Change-Number: 21449
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh@1067
PS8, Line 1067: AVAILABLE_MEM
> Not needed in $(()), as done in the line below. All the others were though,
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21200

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:30:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 6:

Thank you Aman for your review! I'll run gerrit-verify-dryrun next.


--
To view, visit http://gerrit.cloudera.org:8080/21377

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:22:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21450 )

Change subject: IMPALA-13105: Fix multiple imported query profiles fail to 
import/clear at once
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10668/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21450

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42470fecd0cff6e193f080102575e51d86a2d562
Gerrit-Change-Number: 21450
Gerrit-PatchSet: 1
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 May 2024 00:24:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21436 )

Change subject: IMPALA-13083: Clarify REASON_MEM_LIMIT_TOO_LOW_FOR_RESERVATION
..


Patch Set 5:

Thank you Andrew for your review!


--
To view, visit http://gerrit.cloudera.org:8080/21436

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1ef7fb7e7a194b2036c2948639a06c392590bf66
Gerrit-Change-Number: 21436
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:19:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10667/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21377

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:22:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21377

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:22:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12680: Fix NullPointerException during AlterTableAddPartitions

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21430 )

Change subject: IMPALA-12680: Fix NullPointerException during 
AlterTableAddPartitions
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21430/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21430/1//COMMIT_MSG@13
PS1, Line 13: processer
nit: processor


http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:

http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/BackendConfig.java@463
PS1, Line 463:   public void setDebugActions(String debugActions) {
 : backendCfg_.debug_actions = debugActions;
 :   }
Is this needed?
AFAIK, every debugAction in CatalogOpExecutor comes from the thrift 
message (either TDdlExecRequest or TUpdateCatalogRequest).


http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5303
PS1, Line 5303: @Nullable Map partitionToEventId
Can the @Nullable annotation be removed here?


http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5336
PS1, Line 5336: Preconditions.checkNotNull(partitionToEventId);
I think moving this Precondition to the beginning of the function is sufficient 
to test this change.
You can stop EP entirely and run testAlterTableWithEpDisabled without 
specifying a new debug action to restart EP.
The assertion is that partitionToEventId must never be null.


http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/21430/1/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@4043
PS1, Line 4043: String prevDebugActions = BackendConfig.INSTANCE.debugActions();
Is this variable necessary? BackendConfig.INSTANCE.setDebugActions() is never 
called with any other value.



--
To view, visit http://gerrit.cloudera.org:8080/21430

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I730fed311ebc09762dccc152d9583d5394b0b9b3
Gerrit-Change-Number: 21430
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:17:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh@1067
PS8, Line 1067: AVAILABLE_MEM
> missing $ sign?
Not needed in $(()), as done in the line below. All the others were though, 
it's an odd syntax.



--
To view, visit http://gerrit.cloudera.org:8080/21200

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:11:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16215/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21200

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:14:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 6: Code-Review+2

(2 comments)

I reviewed the TPC-DS plan changes and they are along expected lines due to the 
change in selectivity estimate, mostly for the between predicate on date_dim 
table. Bumping to +2.

A note on range predicate selectivity in general: this is normally done with 
histograms, but in their absence in Impala, this patch solves a narrower scope 
of the estimation. Future improvements in stats could subsume this patch.

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test@202
PS6, Line 202: |  |  tuple-ids=19,21 row-size=32B cardinality=9.65M 
cost=611185704
> Bulk of HashJoin memory is for the builder side. And in this case, I think
Makes sense.


http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test@123
PS6, Line 123: 40.26M
> Yes, I think it is lower due to selective partition filter RF002 that comes
Done



--
To view, visit http://gerrit.cloudera.org:8080/21377

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:07:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21200/8/bin/impala-config.sh@1067
PS8, Line 1067: AVAILABLE_MEM
missing $ sign?



--
To view, visit http://gerrit.cloudera.org:8080/21200

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:06:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Michael Smith (Code Review)
Hello Laszlo Gaal, Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21200

to look at the new patch set (#8).

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..

IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

Updates IMPALA_BUILD_THREADS to bound it based on a guideline of 2 GB of
memory per core during builds. Computes cores and memory from cgroup
limits if applicable; memory is used as a bound on physical memory, as
sometimes cgroups will report a larger limit than available physical
memory.

Uses IMPALA_BUILD_THREADS for load-data.

Adds a default in case USER is unset during bootstrap, which can occur
in devcontainer.
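
As a rough illustration of the bounding rule described above (not the actual
bin/impala-config.sh logic), the thread count could be sketched as follows.
The function name, parameter names, and the floor of one thread are
assumptions made for this example.

```python
def bounded_build_threads(cores: int, available_mem_gb: int,
                          gb_per_thread: int = 2) -> int:
    """Hypothetical helper: cap build threads at roughly 2 GB of memory each."""
    # How many threads the available memory can support under the guideline.
    mem_bound = available_mem_gb // gb_per_thread
    # Take the smaller of the CPU and memory bounds; the floor of one thread
    # is an assumption so that tiny machines can still build.
    return max(1, min(cores, mem_bound))


print(bounded_build_threads(16, 16))  # memory-bound: 8
print(bounded_build_threads(4, 64))   # CPU-bound: 4
```

With 16 cores and 16 GB, memory is the limiting factor; with 4 cores and
64 GB, the core count is.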

Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
---
M bin/bootstrap_development.sh
M bin/bootstrap_system.sh
M bin/impala-config.sh
M bin/load-data.py
4 files changed, 73 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/21200/8
--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test@202
PS6, Line 202: |  |  tuple-ids=19,21 row-size=32B cardinality=9.65M 
cost=611185704
> The estimate for the rows in the build side of the hashjoin dropped from 7.
The bulk of HashJoin memory is for the build side. In this case, I think the 
estimate does not change, possibly because perInstanceBuildMinMemReservation > 
perBuildInstanceMemEstimate both before and after the patch.

https://github.com/apache/impala/blob/b975165a0acfe37af302dd7c007360633df54917/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java#L322-L334
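
To make the reasoning concrete, here is a minimal sketch that assumes the
effective build-side figure is the larger of the estimate and the minimum
reservation. That is a simplification for illustration, not the code in
HashJoinNode.java, and the byte values are invented.

```python
def effective_build_mem(per_build_instance_mem_estimate: int,
                        per_instance_build_min_mem_reservation: int) -> int:
    """Hypothetical: the estimate never drops below the minimum reservation."""
    return max(per_build_instance_mem_estimate,
               per_instance_build_min_mem_reservation)


# Invented numbers: the estimate shrinks after the patch, but both values
# stay below the minimum reservation, so the reported figure is unchanged.
MIN_RESERVATION = 2 * 1024 * 1024
before = effective_build_mem(500_000, MIN_RESERVATION)
after = effective_build_mem(1_000, MIN_RESERVATION)
print(before == after)  # True
```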


http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test@123
PS6, Line 123: 40.26M
> This scan's cardinality estimate dropped but there's no BETWEEN predicate o
Yes, I think it is lower due to the selective partition filter RF002 that 
comes from 02:SCAN.



--
To view, visit http://gerrit.cloudera.org:8080/21377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 23:41:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21200/7/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21200/7/bin/impala-config.sh@1059
PS7, Line 1059:   echo "Detected $AVAILABLE_MEM GB memory from cgroups v1"
It seems sometimes this is also set to something near INT64_MAX

  Detected 8589934591 GB memory from cgroups v1

We should probably bound that by whatever physical memory we detected.
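
A short worked example of where that number can come from, assuming the
cgroup v1 limit file reports 9223372036854771712 bytes (a value commonly seen
for an unlimited memory.limit_in_bytes on 64-bit Linux with 4 KiB pages). The
helper name and the 64 GB physical figure are made up for illustration.

```python
# Assumed "no limit" value reported by cgroup v1: 2**63 - 4096.
CGROUP_V1_UNLIMITED_BYTES = 9223372036854771712
GIB = 1024 ** 3


def effective_mem_gb(cgroup_limit_bytes: int, physical_mem_gb: int) -> int:
    """Hypothetical helper: never trust a cgroup limit above physical memory."""
    cgroup_gb = cgroup_limit_bytes // GIB
    return min(cgroup_gb, physical_mem_gb)


print(CGROUP_V1_UNLIMITED_BYTES // GIB)                 # 8589934591
print(effective_mem_gb(CGROUP_V1_UNLIMITED_BYTES, 64))  # 64
```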



--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 23:41:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 23:35:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test@202
PS6, Line 202: |  |  tuple-ids=19,21 row-size=32B cardinality=9.65M 
cost=611185704
The estimate for the rows in the build side of the hashjoin dropped from 7.3K 
to 16, so the output cardinality reduction of the hash join makes sense. Also, 
the row size was reduced by half. However, the overall memory estimate on line 
134 didn't change. Any thoughts on that?
(Although on a different query, q16, it does affect the fragment's 
per-instance memory estimate.)


http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test@123
PS6, Line 123: 40.26M
This scan's cardinality estimate dropped but there's no BETWEEN predicate on 
this table. Is the change coming from the RT filters?



--
To view, visit http://gerrit.cloudera.org:8080/21377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 23:17:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12562: Cast double and float to string with exact presicion

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21441 )

Change subject: IMPALA-12562: Cast double and float to string with exact 
presicion
..


Patch Set 4:

> Patch Set 4: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10665/

Tests show some real failures: 
https://jenkins.impala.io/job/ubuntu-20.04-from-scratch/2665/testReport/


--
To view, visit http://gerrit.cloudera.org:8080/21441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd79c55dd57dc0fa13e4ec11c2284ef2800e8b1a
Gerrit-Change-Number: 21441
Gerrit-PatchSet: 4
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Thu, 23 May 2024 23:01:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12562: Cast double and float to string with exact presicion

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21441 )

Change subject: IMPALA-12562: Cast double and float to string with exact 
presicion
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10665/


--
To view, visit http://gerrit.cloudera.org:8080/21441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd79c55dd57dc0fa13e4ec11c2284ef2800e8b1a
Gerrit-Change-Number: 21441
Gerrit-PatchSet: 4
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Thu, 23 May 2024 22:35:43 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) Add DOAP file

2024-05-23 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21449 )

Change subject: Add DOAP file
..


Patch Set 2: Code-Review+1

(1 comment)

Thanks a lot for taking care of this, Quanlong.

http://gerrit.cloudera.org:8080/#/c/21449/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21449/2//COMMIT_MSG@10
PS2, Line 10: infomation
nit: typo



--
To view, visit http://gerrit.cloudera.org:8080/21449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1a47c68345769281449dd377a6f82a7257cee07
Gerrit-Change-Number: 21449
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 22:07:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13105: Fix multiple imported query profiles fail to import/clear at once

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21450 )

Change subject: IMPALA-13105: Fix multiple imported query profiles fail to 
import/clear at once
..


Patch Set 1: Code-Review+2

Looks OK on my machine as well.
Tested with Chrome, Firefox, and Safari.


--
To view, visit http://gerrit.cloudera.org:8080/21450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42470fecd0cff6e193f080102575e51d86a2d562
Gerrit-Change-Number: 21450
Gerrit-PatchSet: 1
Gerrit-Owner: Surya Hebbar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 23 May 2024 20:52:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13034: Add logs and counters for HTTP profile requests blocking client fetches

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21412 )

Change subject: IMPALA-13034: Add logs and counters for HTTP profile requests 
blocking client fetches
..

IMPALA-13034: Add logs and counters for HTTP profile requests blocking client 
fetches

There are several endpoints in WebUI that can dump a query profile:
/query_profile, /query_profile_encoded, /query_profile_plain_text,
/query_profile_json. The HTTP handler thread goes into
ImpalaServer::GetRuntimeProfileOutput(), which acquires the lock of the
ClientRequestState. This could block client requests from fetching query
results.

To help identify this issue, this patch adds warning logs when such
profile dumping requests run slowly and the query is still in-flight. Also
adds a profile counter, GetInFlightProfileTimeStats, for the summary
stats of this time. Dumping the profiles after the query is archived
(e.g. closed) won't be tracked.

Logs for slow HTTP responses are also added. The thresholds are defined
by two new flags, slow_profile_dump_warning_threshold_ms and
slow_http_response_warning_threshold_ms.
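
The warning condition described above can be sketched roughly as follows.
This is an illustrative Python sketch, not the C++ code in the patch: the
function name, the log wording, the strict ">" comparison, and the 500 ms
value used in the usage lines are all assumptions.

```python
import logging

LOG = logging.getLogger("profile_dump_sketch")


def warn_if_slow_profile_dump(elapsed_ms: int, threshold_ms: int,
                              query_in_flight: bool) -> bool:
    """Hypothetical check: warn only for slow dumps of in-flight queries."""
    # Dumps of archived (e.g. closed) queries are not tracked.
    if query_in_flight and elapsed_ms > threshold_ms:
        LOG.warning("Slow in-flight profile dump: %d ms (threshold %d ms)",
                    elapsed_ms, threshold_ms)
        return True
    return False


# threshold_ms stands in for the slow_profile_dump_warning_threshold_ms flag;
# 500 is an arbitrary value chosen for the example.
print(warn_if_slow_profile_dump(1200, 500, query_in_flight=True))   # True
print(warn_if_slow_profile_dump(1200, 500, query_in_flight=False))  # False
```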

Note that dumping the profile in-flight won't always block the query,
e.g. if there are no client fetch requests or if the coordinator
fragment is idle waiting for executor fragment instances. So a long time
shown in GetInFlightProfileTimeStats doesn't mean it's hitting the
issue.

To better identify this issue, this patch adds another profile counter,
ClientFetchLockWaitTimer, for the cumulative time client fetch requests
spend waiting for locks.

Also fixes false-positive logs complaining about invalid query handles.
Such logs are added in GetQueryHandle() when the query is not found in
the active query map, but it could still exist in the query log. This
removes the logs in GetQueryHandle() and lets the callers decide whether
to log the error.

Tests:
 - Added e2e test
 - Ran CORE tests

Change-Id: I538ebe914f70f460bc8412770a8f7a1cc8b505dc
Reviewed-on: http://gerrit.cloudera.org:8080/21412
Reviewed-by: Impala Public Jenkins 
Tested-by: Michael Smith 
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/util/webserver.cc
M tests/query_test/test_observability.py
8 files changed, 101 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved
  Michael Smith: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I538ebe914f70f460bc8412770a8f7a1cc8b505dc
Gerrit-Change-Number: 21412
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-13034: Add logs and counters for HTTP profile requests blocking client fetches

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21412 )

Change subject: IMPALA-13034: Add logs and counters for HTTP profile requests 
blocking client fetches
..


Patch Set 5: Verified+1

Ran into https://issues.apache.org/jira/browse/IMPALA-12266.


--
To view, visit http://gerrit.cloudera.org:8080/21412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538ebe914f70f460bc8412770a8f7a1cc8b505dc
Gerrit-Change-Number: 21412
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 18:32:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13034: Add logs and counters for HTTP profile requests blocking client fetches

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has removed a vote on this change.

Change subject: IMPALA-13034: Add logs and counters for HTTP profile requests 
blocking client fetches
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/21412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I538ebe914f70f460bc8412770a8f7a1cc8b505dc
Gerrit-Change-Number: 21412
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 7: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 18:32:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10666/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 18:34:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16214/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 18:33:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13034: Add logs and counters for HTTP profile requests blocking client fetches

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21412 )

Change subject: IMPALA-13034: Add logs and counters for HTTP profile requests 
blocking client fetches
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10664/


--
To view, visit http://gerrit.cloudera.org:8080/21412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538ebe914f70f460bc8412770a8f7a1cc8b505dc
Gerrit-Change-Number: 21412
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 18:28:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13057: Incorporate tuple/slot information into tuple cache key

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21398 )

Change subject: IMPALA-13057: Incorporate tuple/slot information into tuple 
cache key
..


Patch Set 6: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21398/4/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/21398/4/fe/src/main/java/org/apache/impala/analysis/Expr.java@946
PS4, Line 946:   result.add(expr.treeToThrift(serialCtx));
> Ooops, that is a mistake. Fixed this to pass in serialCtx.
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f5278e9dbb976cbebdc6a21a6e66bc90ce06c6c
Gerrit-Change-Number: 21398
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 23 May 2024 18:22:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13102: Normalize invalid column stats from HMS

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21445 )

Change subject: IMPALA-13102: Normalize invalid column stats from HMS
..

IMPALA-13102: Normalize invalid column stats from HMS

Column stats such as numDVs and numNulls in HMS could have arbitrary values.
Impala expects them to be non-negative, or -1 for unknown. So loading
tables with invalid stats values (<-1) will fail.

This patch adds logic to normalize the stats values. If the value < -1,
use -1 for it and add corresponding warning logs. Also refactors some
redundant code in ColumnStats.
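
The normalization rule itself is small enough to sketch. The following is an
illustrative Python version, not the Java code in ColumnStats; the function
name, logger name, and warning text are assumptions.

```python
import logging

LOG = logging.getLogger("column_stats_sketch")


def normalize_stat(name: str, value: int) -> int:
    """Hypothetical helper: map invalid stats (< -1) to -1, meaning unknown."""
    if value < -1:
        LOG.warning("Invalid %s: %d. Using -1 (unknown) instead.", name, value)
        return -1
    # Non-negative values and -1 are already valid and pass through unchanged.
    return value


print(normalize_stat("numDVs", -5))    # -1
print(normalize_stat("numNulls", -1))  # -1
print(normalize_stat("numDVs", 42))    # 42
```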

Tests:
 - Add e2e test

Change-Id: If6216e3d6e73a529a9b3a8c0ea9d22727ab43f1a
Reviewed-on: http://gerrit.cloudera.org:8080/21445
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M tests/metadata/test_compute_stats.py
5 files changed, 147 insertions(+), 73 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If6216e3d6e73a529a9b3a8c0ea9d22727ab43f1a
Gerrit-Change-Number: 21445
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-13102: Normalize invalid column stats from HMS

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21445 )

Change subject: IMPALA-13102: Normalize invalid column stats from HMS
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6216e3d6e73a529a9b3a8c0ea9d22727ab43f1a
Gerrit-Change-Number: 21445
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 18:18:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21200/5/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21200/5/bin/impala-config.sh@1042
PS5, Line 1042: # ASAN needs a matching version of llvm-symbolizer to symbolize 
stack tr
> Just as CORES, can we set reasonable AVAILABLE_MEM by default if /proc/memi
Added macOS and cgroups handling.



--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 18:08:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Michael Smith (Code Review)
Hello Laszlo Gaal, Riza Suminto, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21200

to look at the new patch set (#7).

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..

IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

Updates IMPALA_BUILD_THREADS to bound it based on guideline of 2 GB
memory per core during builds. Computes cores and memory from cgroup
limits if applicable. Uses IMPALA_BUILD_THREADS for load-data.

Adds a default in case USER is unset during bootstrap, which can occur
in devcontainer.

Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
---
M bin/bootstrap_development.sh
M bin/bootstrap_system.sh
M bin/impala-config.sh
M bin/load-data.py
4 files changed, 69 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/21200/7
--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-12562: Cast double and float to string with exact presicion

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21441 )

Change subject: IMPALA-12562: Cast double and float to string with exact 
presicion
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd79c55dd57dc0fa13e4ec11c2284ef2800e8b1a
Gerrit-Change-Number: 21441
Gerrit-PatchSet: 3
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Thu, 23 May 2024 17:30:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12562: Cast double and float to string with exact presicion

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21441 )

Change subject: IMPALA-12562: Cast double and float to string with exact 
presicion
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10665/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd79c55dd57dc0fa13e4ec11c2284ef2800e8b1a
Gerrit-Change-Number: 21441
Gerrit-PatchSet: 4
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Thu, 23 May 2024 17:31:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12562: Cast double and float to string with exact presicion

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21441 )

Change subject: IMPALA-12562: Cast double and float to string with exact 
presicion
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd79c55dd57dc0fa13e4ec11c2284ef2800e8b1a
Gerrit-Change-Number: 21441
Gerrit-PatchSet: 4
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Thu, 23 May 2024 17:31:27 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) Add DOAP file

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21449 )

Change subject: Add DOAP file
..


Patch Set 2: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21449/1/doap_Impala.rdf
File doap_Impala.rdf:

http://gerrit.cloudera.org:8080/#/c/21449/1/doap_Impala.rdf@29
PS1, Line 29: https://impala.apache.org" />
> Good point. I think we can update the chapter there.
Ack


http://gerrit.cloudera.org:8080/#/c/21449/1/doap_Impala.rdf@41
PS1, Line 41: 4.3.0
> Yes, if we want to show them in this page: https://projects.apache.org/proj
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1a47c68345769281449dd377a6f82a7257cee07
Gerrit-Change-Number: 21449
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 16:43:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory

2024-05-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21200 )

Change subject: IMPALA-12939: Bound IMPALA_BUILD_THREADS for cgroups and memory
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21200/5/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/21200/5/bin/impala-config.sh@1042
PS5, Line 1042: AVAILABLE_MEM=$(awk '/MemTotal/{print int($2/1024/1024)}' 
/proc/meminfo)
Just as with CORES, can we set a reasonable default for AVAILABLE_MEM if 
/proc/meminfo does not exist (i.e., IS_OSX in L577 is true)? Maybe 8 or 16?
I know Impala only builds on Linux now, but just in case we support ARM Macs 
in the future.



--
To view, visit http://gerrit.cloudera.org:8080/21200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87994d0464073fe2d91bc2f7c2592c012e42de71
Gerrit-Change-Number: 21200
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 16:41:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13093: Fix failure in inserting into OBS tables

2024-05-23 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21438 )

Change subject: IMPALA-13093: Fix failure in inserting into OBS tables
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2441da0fc521b4bbed10c8edceb937bde481
Gerrit-Change-Number: 21438
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiang Yang 
Gerrit-Reviewer: Zihao Ye 
Gerrit-Comment-Date: Thu, 23 May 2024 15:45:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16213/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 23 May 2024 13:20:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13034: Add logs and counters for HTTP profile requests blocking client fetches

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21412 )

Change subject: IMPALA-13034: Add logs and counters for HTTP profile requests 
blocking client fetches
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10664/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538ebe914f70f460bc8412770a8f7a1cc8b505dc
Gerrit-Change-Number: 21412
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 13:17:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13034: Add logs and counters for HTTP profile requests blocking client fetches

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21412 )

Change subject: IMPALA-13034: Add logs and counters for HTTP profile requests 
blocking client fetches
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538ebe914f70f460bc8412770a8f7a1cc8b505dc
Gerrit-Change-Number: 21412
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 13:17:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13102: Normalize invalid column stats from HMS

2024-05-23 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21445 )

Change subject: IMPALA-13102: Normalize invalid column stats from HMS
..


Patch Set 4:

Thanks for the review! Merging this.


--
To view, visit http://gerrit.cloudera.org:8080/21445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6216e3d6e73a529a9b3a8c0ea9d22727ab43f1a
Gerrit-Change-Number: 21445
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 13:16:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13102: Normalize invalid column stats from HMS

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21445 )

Change subject: IMPALA-13102: Normalize invalid column stats from HMS
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10663/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/21445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6216e3d6e73a529a9b3a8c0ea9d22727ab43f1a
Gerrit-Change-Number: 21445
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 13:16:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13102: Normalize invalid column stats from HMS

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21445 )

Change subject: IMPALA-13102: Normalize invalid column stats from HMS
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6216e3d6e73a529a9b3a8c0ea9d22727ab43f1a
Gerrit-Change-Number: 21445
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 13:16:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21435 )

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16211/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 23 May 2024 13:09:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-05-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21452 )

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/16212/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 23 May 2024 13:09:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-05-23 Thread Zoltan Borok-Nagy (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21452

to look at the new patch set (#2).

Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..

IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

With this patch IcebergDeleteBuilder checks how many probe threads
are actually blocked on the builder. Let's assume the following plan:

          UNION ALL
          /       \
         /         \
        /           \
  SCAN all        ANTI JOIN
  datafiles        /     \
  without         /       \
  deletes       SCAN      SCAN
                datafiles deletes
                with deletes

In that case UNION ALL, and the two "SCAN datafiles" operators are in
the same fragment, while the builder of the ANTI JOIN is in a different
fragment. This means that "SCAN datafiles without deletes" can run in
parallel with the builder. But once that SCAN is exhausted, the UNION
ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN
operator, but that operator depends on the join builder output.

This means in some cases the SCAN fragments are busy, while in other
cases the SCAN fragments are blocked. It depends on how much work
they need to do, and how much work the build-side needs to do. So to
handle all cases, we dynamically check how many build fragments are
blocked on the builder, then spin up as many threads to parallelize
the final sort.

This also works well when we have the following plan:

       ANTI JOIN
        /     \
       /       \
     SCAN      SCAN
     datafiles deletes
     with deletes

The above plan is created when all data files have corresponding
deletes, or when we are running a simple count(*) query. In that
case all "SCAN datafiles" fragments are blocked on the builder,
so we can use that many threads to sort the build results.

A new field "ThreadCountInFinalBuild" was added, so we can check in the
query profile how many threads were used for the final sorting in the
builders.
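
As a rough illustration of the idea (not the code in
be/src/exec/iceberg-delete-builder.cc), the sketch below sorts several
independent vectors with a caller-chosen number of threads. The function
name ParallelSortGroups, the use of std::thread, and the assumption that
the final sort works on independent per-group vectors are all
hypothetical; in the patch the thread count would come from the number
of probe threads found blocked on the builder.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper, not Impala's API: sorts every vector in 'groups'
// using 'num_threads' workers in total. The calling thread counts as one
// worker, so num_threads == 1 degenerates to a plain sequential sort.
void ParallelSortGroups(std::vector<std::vector<std::int64_t>>& groups,
                        int num_threads) {
  assert(num_threads >= 1);
  // Shared cursor: each worker claims the next unsorted group index.
  std::atomic<std::size_t> next{0};
  auto worker = [&groups, &next]() {
    for (;;) {
      std::size_t i = next.fetch_add(1);
      if (i >= groups.size()) return;
      std::sort(groups[i].begin(), groups[i].end());
    }
  };
  std::vector<std::thread> helpers;
  helpers.reserve(static_cast<std::size_t>(num_threads - 1));
  for (int t = 1; t < num_threads; ++t) helpers.emplace_back(worker);
  worker();  // The caller participates instead of idling.
  for (std::thread& h : helpers) h.join();
}
```

Claiming groups through a shared atomic index keeps the workers balanced
when group sizes differ a lot, which a static split would not.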

Measurements:
In a table with 1 Trillion data records and 68.5 Billion delete records
it lowered "IcebergDeletePositionSortTimer" from ~1 minute to
8-10 seconds, in an environment with 40 executors and MT_DOP=12.

TODO:
 * e2e tests that check counter "ThreadCountInFinalBuild"

Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
M be/src/exec/join-builder.cc
M be/src/exec/join-builder.h
4 files changed, 105 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/2
--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-13102: Normalize invalid column stats from HMS

2024-05-23 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21445 )

Change subject: IMPALA-13102: Normalize invalid column stats from HMS
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21445/1/tests/metadata/test_compute_stats.py
File tests/metadata/test_compute_stats.py:

http://gerrit.cloudera.org:8080/#/c/21445/1/tests/metadata/test_compute_stats.py@474
PS1, Line 474: client.update_table_column_sta
> I can also see warning logs like this:
thanks for looking into it!



--
To view, visit http://gerrit.cloudera.org:8080/21445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6216e3d6e73a529a9b3a8c0ea9d22727ab43f1a
Gerrit-Change-Number: 21445
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 12:56:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13034: Add logs and counters for HTTP profile requests blocking client fetches

2024-05-23 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21412 )

Change subject: IMPALA-13034: Add logs and counters for HTTP profile requests 
blocking client fetches
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I538ebe914f70f460bc8412770a8f7a1cc8b505dc
Gerrit-Change-Number: 21412
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 23 May 2024 12:54:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12562: Cast double and float to string with exact presicion

2024-05-23 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21441 )

Change subject: IMPALA-12562: Cast double and float to string with exact 
presicion
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21441/3/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21441/3/be/src/exprs/cast-functions-ir.cc@352
PS3, Line 352: strlen
> It seems that the returned pointer doesn't point to the terminating null by
You are right, sorry, I was confused by the way FastInt32ToBufferLeft and 
friends work and thought that the pointer returned by 
FloatToBuffer/DoubleToBuffer doesn't point to the beginning of the buffer.



--
To view, visit http://gerrit.cloudera.org:8080/21441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd79c55dd57dc0fa13e4ec11c2284ef2800e8b1a
Gerrit-Change-Number: 21441
Gerrit-PatchSet: 3
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Thu, 23 May 2024 12:53:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

2024-05-23 Thread Zoltan Borok-Nagy (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21435

to look at the new patch set (#2).

Change subject: IMPALA-13088: (part 1) Improve build batch processing of 
IcebergDeleteBuilder
..

IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

When there are lots of delete records, the IcebergDeleteBuilder can
become a bottleneck. Since the left side of the JOIN is blocked on
the build side, any improvement we make here significantly improves
Iceberg V2 table scanning.

Improvements of this patch:

* Use a vector of vectors to collect the position delete records.
  This way we can avoid large re-allocations and copying.
* Insert large ranges from the build batches into the collected
  delete records instead of doing it one-by-one.
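
The two points above can be sketched roughly as follows. This is only an
illustration under assumptions, not the data structure used in
iceberg-delete-builder.cc: the class name ChunkedPositions, the fixed
chunk capacity, and the plain int64 positions are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container, not Impala's class: stores positions in
// fixed-capacity chunks, so growing never copies already stored values
// (growing the outer vector only moves the small per-chunk headers).
class ChunkedPositions {
 public:
  explicit ChunkedPositions(std::size_t chunk_capacity)
      : chunk_capacity_(chunk_capacity) {
    assert(chunk_capacity > 0);
  }

  // Appends the range [begin, end) in bulk, splitting it across chunks
  // instead of pushing the values one by one.
  void AppendRange(const std::int64_t* begin, const std::int64_t* end) {
    while (begin != end) {
      if (chunks_.empty() || chunks_.back().size() == chunk_capacity_) {
        chunks_.emplace_back();
        chunks_.back().reserve(chunk_capacity_);
      }
      std::vector<std::int64_t>& chunk = chunks_.back();
      std::size_t room = chunk_capacity_ - chunk.size();
      std::size_t remaining = static_cast<std::size_t>(end - begin);
      std::size_t n = room < remaining ? room : remaining;
      chunk.insert(chunk.end(), begin, begin + n);
      begin += n;
      size_ += n;
    }
  }

  std::size_t size() const { return size_; }
  std::size_t num_chunks() const { return chunks_.size(); }

  // Copies all chunks into one contiguous vector, e.g. before sorting.
  std::vector<std::int64_t> Flatten() const {
    std::vector<std::int64_t> out;
    out.reserve(size_);
    for (const std::vector<std::int64_t>& chunk : chunks_) {
      out.insert(out.end(), chunk.begin(), chunk.end());
    }
    return out;
  }

 private:
  std::size_t chunk_capacity_;
  std::size_t size_ = 0;
  std::vector<std::vector<std::int64_t>> chunks_;
};
```

A single large std::vector would instead have to reallocate and copy its
whole contents each time it outgrows its capacity.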

Measurements

Local measurement with 824 Million position delete records:
JOIN BUILD: ~32s -> ~14s (6s is the final sorting)

40-node cluster with 68.5 Billion position delete records:
JOIN BUILD: 4m15s -> 1m45s (1m7s is the final sorting)

Parallelization of the final sort will be added in a follow-up CR.

Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
(cherry picked from commit d08315fe5c57ccb5b197cd196b62eeedf7d90ec3)
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
2 files changed, 101 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/21435/2
--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

2024-05-23 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21452 )


Change subject: IMPALA-13088: (part 2) Parallelize final sorts in 
IcebergDeleteBuilder
..

IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder

With this patch IcebergDeleteBuilder checks how many probe threads
are actually blocked on the builder. Let's assume the following plan:

          UNION ALL
          /       \
         /         \
        /           \
  SCAN all        ANTI JOIN
  datafiles        /     \
  without         /       \
  deletes       SCAN      SCAN
                datafiles deletes
                with deletes

In that case UNION ALL, and the two "SCAN datafiles" operators are in
the same fragment, while the builder of the ANTI JOIN is in a different
fragment. This means that "SCAN datafiles without deletes" can run in
parallel with the builder. But once that SCAN is exhausted, the UNION
ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN
operator, but that operator depends on the join builder output.

This means in some cases the SCAN fragments are busy, while in other
cases the SCAN fragments are blocked. It depends on how much work
they need to do, and how much work the build-side needs to do. So to
handle all cases, we dynamically check how many build fragments are
blocked on the builder, then spin up as many threads to parallelize
the final sort.

This also works well when we have the following plan:

       ANTI JOIN
        /     \
       /       \
     SCAN      SCAN
     datafiles deletes
     with deletes

The above plan is created when all data files have corresponding
deletes, or when we are running a simple count(*) query. In that
case all "SCAN datafiles" fragments are blocked on the builder,
so we can use that many threads to sort the build results.

A new field "ThreadCountInFinalBuild" was added, so we can check in the
query profile how many threads were used for the final sorting in the
builders.

Measurements:
In a table with 1 Trillion data records and 68.5 Billion delete records
it lowered "IcebergDeletePositionSortTimer" from ~1 minute to
8-10 seconds, in an environment with 40 executors and MT_DOP=12.

TODO:
 * e2e tests that check counter "ThreadCountInFinalBuild"

Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
M be/src/exec/join-builder.cc
M be/src/exec/join-builder.h
4 files changed, 102 insertions(+), 24 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/1
--
To view, visit http://gerrit.cloudera.org:8080/21452
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807
Gerrit-Change-Number: 21452
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-12562: Cast double and float to string with exact presicion

2024-05-23 Thread Yifan Zhang (Code Review)
Yifan Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21441 )

Change subject: IMPALA-12562: Cast double and float to string with exact 
presicion
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21441/3/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21441/3/be/src/exprs/cast-functions-ir.cc@352
PS3, Line 352: strlen
> strlen could be avoid by saving the original pointer and comparing the diff
It seems that the returned pointer doesn't point to the terminating null byte, 
so it can't be compared with the original pointer to get the length.

I also find this strlen redundant, since snprintf already returns the length 
of the formatted string. Maybe we have to use strlen if we don't want to 
change the implementations in gutil/strings.
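
To illustrate the point about snprintf, here is a minimal sketch that 
takes the length from its return value instead of calling strlen. It is 
not the gutil FloatToBuffer/DoubleToBuffer code; the helper name 
FormatDouble and the "%.17g" format are assumptions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper, not part of gutil/strings: formats 'value' into
// 'buf' and returns the length of the stored string without a second
// pass over the buffer.
std::size_t FormatDouble(double value, char* buf, std::size_t buf_size) {
  assert(buf != nullptr && buf_size > 0);
  int n = std::snprintf(buf, buf_size, "%.17g", value);
  if (n < 0) {  // Encoding error: leave an empty string behind.
    buf[0] = '\0';
    return 0;
  }
  // snprintf returns the length the complete output would have had. If
  // that does not fit, only buf_size - 1 characters were stored.
  std::size_t len = static_cast<std::size_t>(n);
  return len < buf_size ? len : buf_size - 1;
}
```

A caller gets the same value strlen(buf) would return, but only if the 
formatting function exposes that length, which is the change to 
gutil/strings mentioned above.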



--
To view, visit http://gerrit.cloudera.org:8080/21441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd79c55dd57dc0fa13e4ec11c2284ef2800e8b1a
Gerrit-Change-Number: 21441
Gerrit-PatchSet: 3
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Thu, 23 May 2024 08:33:27 +
Gerrit-HasComments: Yes