[Impala-ASF-CR] IMPALA-13468: Fix various aggregation issues in aggregation.test

2024-10-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21961 )

Change subject: IMPALA-13468: Fix various aggregation issues in aggregation.test
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java:

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java@45
PS3, Line 45: return ImpalaTypeConverter.getRelDataType(Type.BOOLEAN);
> https://github.com/apache/impala/blob/master/testdata/workloads/functional-
Ok, for NULL constants, I wasn't aware that Impala's MIN/MAX returns boolean. 
Marking this resolved.  (as an aside, I checked on Postgres .. it returns 
'text' type for MIN(NULL).)



--
To view, visit http://gerrit.cloudera.org:8080/21961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I038127d6a2f228ae8d263e983b1906e99ae05f77
Gerrit-Change-Number: 21961
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anonymous Coward (816)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Thu, 24 Oct 2024 01:06:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13465: Trace TupleId further to reduce Agg cardinality

2024-10-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21955 )

Change subject: IMPALA-13465: Trace TupleId further to reduce Agg cardinality
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21955/3/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/21955/3/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@322
PS3, Line 322: referrence
nit: typo: "reference"



--
To view, visit http://gerrit.cloudera.org:8080/21955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11f59ccc469c24c1800abaad3774c56190306944
Gerrit-Change-Number: 21955
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 23 Oct 2024 23:06:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13468: Fix various aggregation issues in aggregation.test

2024-10-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21961 )

Change subject: IMPALA-13468: Fix various aggregation issues in aggregation.test
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAggOperator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAggOperator.java:

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAggOperator.java@32
PS3, Line 32: ImpalaAggOpera
> Bad cut and paste.  No, ImpalaAggOperator needs to extend SqlAggFunction fo
Done


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java:

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java@45
PS3, Line 45: return ImpalaTypeConverter.getRelDataType(Type.BOOLEAN);
> Having NULL return boolean allows a specific test to pass.
I still didn't get this and why this is needed for the MIN/MAX function to 
return a boolean. Can you point to the test that expects this ?


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java@58
PS3, Line 58:   SqlCallBinding callBinding,
> The inferReturnType for timestamp is able to return a timestamp. I'd imagin
Done



--
To view, visit http://gerrit.cloudera.org:8080/21961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I038127d6a2f228ae8d263e983b1906e99ae05f77
Gerrit-Change-Number: 21961
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anonymous Coward (816)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Wed, 23 Oct 2024 17:36:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13468: Fix various aggregation issues in aggregation.test

2024-10-22 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21961 )

Change subject: IMPALA-13468: Fix various aggregation issues in aggregation.test
..


Patch Set 3:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/21961/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21961/3//COMMIT_MSG@20
PS3, Line 20: min(NULL) and min(char types)
nit: to be precise, this is both min and max


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAggOperator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAggOperator.java:

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAggOperator.java@32
PS3, Line 32: ImpalaOperator
This says ImpalaOperator but the actual class is ImpalaAggOperator.  Is 
ImpalaAggOperator supposed to be a derived class of ImpalaOperator ?


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAvgAggFunction.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAvgAggFunction.java:

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaAvgAggFunction.java@31
PS3, Line 31: TIMESTMP
typo: TIMESTAMP


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaGroupingIdFunction.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaGroupingIdFunction.java:

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaGroupingIdFunction.java@21
PS3, Line 21: import com.google.common.base.Preconditions;
Most of these imported classes are not used in this file.


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaGroupingIdFunction.java@67
PS3, Line 67: /**
:  */
nit: remove empty comments


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java:

http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java@31
PS3, Line 31: SqlAvgAggFunction
Confused by the mention of SqlAvgAgg since the class is doing MinMax, not Avg.


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java@31
PS3, Line 31: TIMESTMP
Typo: TIMESTAMP


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java@45
PS3, Line 45: return ImpalaTypeConverter.getRelDataType(Type.BOOLEAN);
Why would  the Min/Max of a NULL type return a Boolean ? Is this what Impala 
expects ?


http://gerrit.cloudera.org:8080/#/c/21961/3/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaMinMaxAggFunction.java@58
PS3, Line 58: return 
callBinding.getOperandType(0).getSqlTypeName().equals(SqlTypeName.TIMESTAMP)
It wasn't clear why this is checking for TIMESTAMP even though the 
inferReturnType is mainly concerned with string types and null type.



--
To view, visit http://gerrit.cloudera.org:8080/21961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I038127d6a2f228ae8d263e983b1906e99ae05f77
Gerrit-Change-Number: 21961
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 22 Oct 2024 20:52:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13465: Trace TupleId further to reduce Agg cardinality

2024-10-20 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21955 )

Change subject: IMPALA-13465: Trace TupleId further to reduce Agg cardinality
..


Patch Set 2: Code-Review+2

This LGTM. Carry over Michael's +1.


--
To view, visit http://gerrit.cloudera.org:8080/21955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11f59ccc469c24c1800abaad3774c56190306944
Gerrit-Change-Number: 21955
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 21 Oct 2024 05:48:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13405: Do tuple analysis to lower AggregationNode cardinality

2024-10-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21860 )

Change subject: IMPALA-13405: Do tuple analysis to lower AggregationNode 
cardinality
..


Patch Set 13: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21860/11/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/21860/11/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@368
PS11, Line 368:   // (in other words, the lowest PlanNode in query plan 
tree).
> Thank you Aman for the counter example. I craft one planner test under agg-
Yes, we can use a memoization approach in the future.  The test with nested 
views looks good.  Marking this resolved.



--
To view, visit http://gerrit.cloudera.org:8080/21860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd589ab5f7ba9566a0d35784f61f5ffaef5696e7
Gerrit-Change-Number: 21860
Gerrit-PatchSet: 13
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 18 Oct 2024 00:03:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13459: Handle duplicate table in same query

2024-10-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21945 )

Change subject: IMPALA-13459: Handle duplicate table in same query
..


Patch Set 3:

> Patch Set 3:
>
> There is already a Jira that views in general don't work, but there is 
> nothing specific about duplicate views within a query.  Do you think I should 
> file it now if there is a general "views don't work" jira?

I suppose you could handle the view aliases referring to the same view later 
when views are supported, so I am good with this.


--
To view, visit http://gerrit.cloudera.org:8080/21945
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9997bc642c320c2e26294d7d02a05bccbba6a0d
Gerrit-Change-Number: 21945
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anonymous Coward (816)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Thu, 17 Oct 2024 23:55:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13459: Handle duplicate table in same query

2024-10-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21945 )

Change subject: IMPALA-13459: Handle duplicate table in same query
..


Patch Set 3:

Does this unique alias work for views as well ?
If not, can we create a separate jira for that ?


--
To view, visit http://gerrit.cloudera.org:8080/21945
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9997bc642c320c2e26294d7d02a05bccbba6a0d
Gerrit-Change-Number: 21945
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anonymous Coward (816)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Thu, 17 Oct 2024 21:09:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13405: Do tuple analysis to lower AggregationNode cardinality

2024-10-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21860 )

Change subject: IMPALA-13405: Do tuple analysis to lower AggregationNode 
cardinality
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21860/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21860/11//COMMIT_MSG@12
PS11, Line 12: AggragationNode
nit: typo "AggragationNode" -> AggregationNode


http://gerrit.cloudera.org:8080/#/c/21860/11/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/21860/11/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@368
PS11, Line 368:   List postOrderNodes = 
planNode.getNodesPostOrder();
Just thinking about a more complex plan where we have nested subqueries or 
views.  In this, there will be repeated post order traversals of the sub trees 
for each level of the group-by.
For example:
create view v1 as select a, b, c, count(*) from t1 inner join t2 on t1.x = t2.y 
group by a, b, c;
create view v2 as select a, b, count(*) from v1 group by a, b;
select a, count(*) from v2 group by a;

For doing the estimate of the outermost query, it would do a traversal of v2 
and v1, although v1 would have already been traversed when v2's cardinality was 
estimated. For TPC-DS, this may not be a concern but just thinking about other 
scenarios. The tradeoff of extra compilation time may be acceptable but just 
want to bring up for discussion.



--
To view, visit http://gerrit.cloudera.org:8080/21860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd589ab5f7ba9566a0d35784f61f5ffaef5696e7
Gerrit-Change-Number: 21860
Gerrit-PatchSet: 11
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 17 Oct 2024 17:57:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13430: Too many RelNodes created for "IN" literals

2024-10-13 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21911 )

Change subject: IMPALA-13430: Too many RelNodes created for "IN" literals
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21911/4/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java:

http://gerrit.cloudera.org:8080/#/c/21911/4/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java@148
PS4, Line 148:   private boolean isCoercedProjectForValues(ParentPlanRelContext 
context) {
> I don't feel too comfortable keeping RelNode information in a global contex
Marking this resolved.



--
To view, visit http://gerrit.cloudera.org:8080/21911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc3d84c70af9cd4db44359c4ab7f0c9eb70738f5
Gerrit-Change-Number: 21911
Gerrit-PatchSet: 4
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Mon, 14 Oct 2024 00:58:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13430: Too many RelNodes created for "IN" literals

2024-10-13 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21911 )

Change subject: IMPALA-13430: Too many RelNodes created for "IN" literals
..


Patch Set 4: Code-Review+1

(1 comment)

I have a suggestion below about improving the code. Doing a +1 for now. 
Depending on your feedback on this, can bump it up.

http://gerrit.cloudera.org:8080/#/c/21911/4/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java:

http://gerrit.cloudera.org:8080/#/c/21911/4/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java@148
PS4, Line 148:   private boolean isCoercedProjectForValues(ParentPlanRelContext 
context) {
> Line 151 looks specifically for the Values node pattern and returns right a
Ok, yeah that limits it to only the Project-Values plan; although the second 
part of my comment was actually more relevant.  I understand this current 
implementation is a hack but it seems a simpler thing would be to just store 
the Project that was created in CoerceNode.processValues() in a List or a Set 
in a planner context class and just do a lookup here.  The iterating through 
all the project exprs seems redundant if we have full control over what happens 
in CoerceNode.



--
To view, visit http://gerrit.cloudera.org:8080/21911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc3d84c70af9cd4db44359c4ab7f0c9eb70738f5
Gerrit-Change-Number: 21911
Gerrit-PatchSet: 4
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Sun, 13 Oct 2024 19:31:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13430: Too many RelNodes created for "IN" literals

2024-10-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21911 )

Change subject: IMPALA-13430: Too many RelNodes created for "IN" literals
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21911/4/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java:

http://gerrit.cloudera.org:8080/#/c/21911/4/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaProjectRel.java@148
PS4, Line 148:   private boolean isCoercedProjectForValues(ParentPlanRelContext 
context) {
The logic in this method is looking for a specific pattern for project 
expressions but what's the guarantee that the Values rel is the only case where 
such a pattern occurs ?  The number of project expressions could  be quite 
large, so in the worst case, this could be iterating over all of them.
Is there a better way to just annotate a Project (or a derived class of 
Project) sitting above a Values rel ?



--
To view, visit http://gerrit.cloudera.org:8080/21911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc3d84c70af9cd4db44359c4ab7f0c9eb70738f5
Gerrit-Change-Number: 21911
Gerrit-PatchSet: 4
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Sat, 12 Oct 2024 22:07:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13430: Too many RelNodes created for "IN" literals

2024-10-11 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21911 )

Change subject: IMPALA-13430: Too many RelNodes created for "IN" literals
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21911/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21911/3//COMMIT_MSG@13
PS3, Line 13: within one Values RelNode.
For large IN-lists, Calcite supports the LogicalValues relnode (derived from 
Values) and converts it to a join [see comments in [1]).  This would be the 
most efficient representation but we would need a physical representation of 
this as well, so something to consider for future work.

[1]
https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L6470


http://gerrit.cloudera.org:8080/#/c/21911/3//COMMIT_MSG@18
PS3, Line 18: limitation prevents the string literal to be of type "string" but
: instead is of type "char(x)"
For an IN list which has string literals of different lengths , 'ab', 'abcd' 
etc.  would Calcite treat each one as different length char ?  Curious why it 
would not just treat it as a VARCHAR instead.  Can we look at the place it does 
this coercion to validate the behavior ?



--
To view, visit http://gerrit.cloudera.org:8080/21911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc3d84c70af9cd4db44359c4ab7f0c9eb70738f5
Gerrit-Change-Number: 21911
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Fri, 11 Oct 2024 19:29:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 6: Code-Review+2

(2 comments)

I reviewed the TPC-DS plan changes and they are along expected lines due to the 
change in selectivity estimate, mostly for the between predicate on date_dim 
table. Bumping to +2.

A note on the range predicate selectivity in general: this is normally done by 
histograms but in its absence in Impala, this patch is solving a narrower scope 
of this estimation. Future improvements in stats could subsume this patch.

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test@202
PS6, Line 202: |  |  tuple-ids=19,21 row-size=32B cardinality=9.65M 
cost=611185704
> Bulk of HashJoin memory is for the builder side. And in this case, I think
Makes sense.


http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test@123
PS6, Line 123: 40.26M
> Yes, I think it is lower due to selective partition filter RF002 that comes
Done



--
To view, visit http://gerrit.cloudera.org:8080/21377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 May 2024 00:07:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q05.test@202
PS6, Line 202: |  |  tuple-ids=19,21 row-size=32B cardinality=9.65M 
cost=611185704
The estimate for the rows in the build side of the hashjoin dropped from 7.3K 
to 16, so the output cardinality reduction of the hash join makes sense.  Also, 
the row size reduced by half.  However, the overall memory estimate on line 134 
didn't change. Any thoughts on that ?
(Although on a different query , q16, it does affect the the fragment's 
per-instance memory estimate).


http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test:

http://gerrit.cloudera.org:8080/#/c/21377/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test@123
PS6, Line 123: 40.26M
This scan's cardinality estimate dropped but there's no BETWEEN predicate on 
this table.  Is the change coming from the RT filters ?



--
To view, visit http://gerrit.cloudera.org:8080/21377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 23 May 2024 23:17:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-22 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 6: Code-Review+1

(2 comments)

Updated changes Lgtm. Doing a +1.  I had done a cursory review of the tpc-ds 
plan changes. Once I review that a little more closely, will bump up the vote.

http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@104
PS2, Line 104: hasNumDist
> It is enough to check for hasNumDistinctValues() only. Updated this code ac
Done


http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@107
PS2, Line 107: numNotNulls
> Added tests against functional_parquet.unique_with_nulls.
Thx for adding this new table.



--
To view, visit http://gerrit.cloudera.org:8080/21377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 22 May 2024 21:23:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column

2024-05-19 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21377 )

Change subject: IMPALA-8042: Assign BETWEEN selectivity for discrete-unique 
column
..


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/21377/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21377/2//COMMIT_MSG@18
PS2, Line 18: This patch adds a narrow optimization of BetweenPredicate 
selectivity
If the user's query has  a predicate unique_col >=5 AND unique_col <= 10 
instead of the BETWEEN, then this patch is not applying the selectivity 
calculation even though BETWEEN gets converted to the first representation.  Is 
the reason mainly to keep this narrow in scope ?


http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2626
PS2, Line 2626:   // Prioritize members of 'betweenPredicates' ahead of 
'result'.
The 'prioritization' part was not clear.. why exactly is it needed ? .. could 
you elaborate a bit on this ?


http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@1856
PS2, Line 1856:   public long evalToInteger(Analyzer analyzer, String name, 
boolean acceptDate)
nit: in the comments above, could you add a note about the acceptDate parameter 
?


http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java@759
PS2, Line 759:* The first issue is addressed by using a single default 
selectivity that is
nit: this comment needs updating to reflect the between selectivity.


http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java@774
PS2, Line 774: assigned
nit: have 'been' assigned ..


http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
File fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java:

http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@104
PS2, Line 104: hasStats()
Looking at the hasStats() method:
   public boolean hasStats() { return numNulls_ != -1 || numDistinctValues_ != 
-1; }

It can return true if either of the 2 stats are present.  But the computation 
below depends on both being present.  So it would be good to check for both 
explicitly.


http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java@107
PS2, Line 107: numNotNulls
Is there any test that covers the case where the null count made a difference 
to the Between selectivity ?


http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/21377/2/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@544
PS2, Line 544:* something like 0.33^2.
nit: this last sentence should be updated for the new behavior.



--
To view, visit http://gerrit.cloudera.org:8080/21377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Gerrit-Change-Number: 21377
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 20 May 2024 06:44:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12291 impala checks hdfs ranger policy

2024-04-05 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20221 )

Change subject: IMPALA-12291 impala checks hdfs ranger policy
..


Patch Set 9:

> Patch Set 9:
>
> Hi all, based on our discussion so far, there are 2 approaches discussed. Any 
> any better alternative is also appreciated!
> - Set the access level of a loaded table to both READ and WRITE as long as 
> Ranger is used as the authorization provider.
> - Introduce a startup flag to allow the administrator to decide whether to 
> skip the file system permissions check during table loading.
>
> There are some implementation details to consider for each approach.
> - How to determine whether Ranger is enabled (corresponding to the 1st 
> approach)? It seems checking the value of the key 
> 'dfs.namenode.inode.attributes.provider.class' in core-site.xml via the 
> instance of Configuration as done in the patch set 9 could not be verified 
> easily via a new test due to HDFS Ranger plug-in not being configured 
> correctly. To be more specific, if we try to add the following configuration 
> via  
> https://github.com/apache/impala/blob/master/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py,
>  the name node of HDFS could not be started. More plumbing has to be done to 
> set up the plug-in. On the other hand, using the instance of BackendConfig 
> allows us to add an end-to-end test to briefly verify Impala's behavior after 
> the patch more easily.
>
> 
>  dfs.namenode.inode.attributes.provider.class
>  org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer
> 
>
> - What should be the default behavior with respect to setting the access 
> level of a loaded table after this patch (corresponding to the 2nd approach)? 
> It looks like making Impala assume the READ and WRITE access by default may 
> be better especially in the legacy catalog mode. This relieves the Impala 
> administrator of having to manually tweak the HDFS access control entries of 
> the target HDFS paths that are not writable to the Impala service every time 
> when an end user encounters such a problem.
>
> I have also collected the related tests that need to be revised if we decide 
> to adopt the 2nd approach and make Impala assume the READ and WRITE access by 
> default.
> - 
> https://github.com/apache/impala/blob/master/tests/metadata/test_hdfs_permissions.py#L56
>  (TestHdfsPermissions.test_insert_into_read_only_table()).
> - 
> https://github.com/apache/impala/blob/master/tests/query_test/test_insert_behaviour.py#L563
>  (TestInsertBehaviour.test_multiple_group_acls).
> - 
> https://github.com/apache/impala/blob/master/tests/query_test/test_insert_behaviour.py#L331
>  (TestInsertBehaviour.test_readonly_table_dir).
> -  
> https://github.com/apache/impala/blob/master/tests/query_test/test_insert_behaviour.py#L362
>  (TestInsertBehaviour.test_insert_acl_permissions).
> -  
> https://github.com/apache/impala/blob/master/tests/query_test/test_insert_behaviour.py#L439
>  (TestInsertBehaviour.test_load_permissions).
> -  
> https://github.com/apache/impala/blob/master/tests/query_test/test_insert_behaviour.py#L252
>  (TestInsertBehaviour.test_mixed_partition_permissions).
> -  
> https://github.com/apache/impala/blob/master/tests/query_test/test_insert_behaviour.py#L202
>  (TestInsertBehaviour.test_insert_file_permissions).
> -  
> https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java#L4008
>  (AnalyzeStmtsTest.TestLoadData).
> -  
> https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java#L4042-L4047
>  (AnalyzeStmtsTest.TestInsert).
> - 
> https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java#L110-L119
>  (CatalogObjectToFromThriftTest.TestPartitionedTable).

The first option sounds good to me:
- Set the access level of a loaded table to both READ and WRITE as long as 
Ranger is used as the authorization provider.

The second option of introducing a startup flag would create confusion because 
it would only apply to the legacy catalog mode, not the local catalog mode. 
Further, for the cloud object stores, we are in any case doing the first 
approach.

The downside of the first approach is that it will defer the full access level 
checks to execution time and suppose some partitions allow the write operation 
while other partitions don't, we can end up with partial writes into the table. 
 However, this issue already exists with the local catalog mode (which is the 
default mode).


--
To view, visit http://gerrit.cloudera.org:8080/20221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id33c400fbe0c918b6b65d713b09009512835a4c9
Gerrit-Change-Number: 20221
Gerrit-PatchSet: 

[Impala-ASF-CR] IMPALA-12872: Use Calcite for optimization - part 1: simple queries

2024-04-05 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21109 )

Change subject: IMPALA-12872: Use Calcite for optimization - part 1: simple 
queries
..


Patch Set 17:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/CreateExprVisitor.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/CreateExprVisitor.java:

http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/util/CreateExprVisitor.java@1
PS17, Line 1: /*
In some files, the License header is enclosed in the multi statement syntax /*. 
 .. */ whereas in others it is the single line comment.
Pls make it consistent in all the files and use whatever is being used in 
existing Impala files.
Impala uses the single line comment style.  '// '


http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaCalciteCatalogReader.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaCalciteCatalogReader.java:

http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaCalciteCatalogReader.java@30
PS17, Line 30:  queryCtx;
Pls add underscore


http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaCalciteCatalogReader.java@31
PS17, Line 31: stmtTableCache
Add underscore


http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@94
PS16, Line 94: :
nit: add space or newline after :  so that the query string is separated out.


http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@148
PS16, Line 148:   if (e == null) {
> I wanted the ability to print the stack trace hree.  But I also throw an In
So this should be e != null then.


http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@166
PS16, Line 166:   if (line.trim().startsWith("--") || 
line.trim().equals("")) {
Comments can be multi-statement also e.g.
 /*  ignore
   this
*/
SELECT a, b, c.
There could be other patterns not handled here.  Can we not use the parser 
methods directly ? If this is supposed to be a temporary function or if this is 
expected to go through revision later, pls add the relevant comments.


http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@209
PS16, Line 209: LOG.info("Using Impala planner");
Can you make this consistent with the corresponding message for Calcite planner 
on line 94.  Something like:
LOG.info("Using Impala Planner for the following query: " + queryCtx.getStmt());


http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@245
PS16, Line 245: Frontend Timeline (Calcite Planner)")
This QueryContext is used in both cases right ?  i.e if Calcite planner fails, 
fall back to Impala planner.  So this message should be modified if the fall 
back occurs.


http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@116
PS17, Line 116: logical plan
nit: to be precise, can we say 'initial logical plan'


http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@121
PS17, Line 121: optimized plan
nit: to be precise, can we say 'optimized logical plan' .


http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@234
PS17, Line 234:   // hack to match the FrontEnd code
Can you add some more context to this .. which part of the FE code is this 
referring to ?


http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteQueryParser.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteQueryParser.java:

http://gerrit.cloudera.org:8080/#/c/21109/17/java/calcite-planner/src/main/java/org/apache/impala

[Impala-ASF-CR] IMPALA-12872: Use Calcite for optimization - part 1: simple queries

2024-03-29 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21109 )

Change subject: IMPALA-12872: Use Calcite for optimization - part 1: simple 
queries
..


Patch Set 16:

(7 comments)

A few more comments based on a second pass.

http://gerrit.cloudera.org:8080/#/c/21109/16/bin/set-classpath.sh
File bin/set-classpath.sh:

http://gerrit.cloudera.org:8080/#/c/21109/16/bin/set-classpath.sh@62
PS16, Line 62: FE
nit: use expanded form, not acronym


http://gerrit.cloudera.org:8080/#/c/21109/16/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/21109/16/bin/start-impala-cluster.py@182
PS16, Line 182: U
nit: lowercase 'u'.


http://gerrit.cloudera.org:8080/#/c/21109/16/bin/start-impala-cluster.py@183
PS16, Line 183:   "instead of JniFrontend.")
JniFrontend is an internal class, shouldn't be mentioned in a user message.  
How about 'If true, use the Calcite planner for query optimization instead of 
Impala planner'


http://gerrit.cloudera.org:8080/#/c/21109/16/fe/src/main/java/org/apache/impala/planner/PlannerContext.java
File fe/src/main/java/org/apache/impala/planner/PlannerContext.java:

http://gerrit.cloudera.org:8080/#/c/21109/16/fe/src/main/java/org/apache/impala/planner/PlannerContext.java@97
PS16, Line 97:   // Constructor useful for an external planner module
nit: this comment can be removed since the purpose of this patch is to make 
Calcite planner an internal module.


http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java:

http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java@95
PS16, Line 95: List scanOutputExprs = new 
ArrayList<>(Collections.nCopies(totalCols, null));
For wide tables where we are only needing a few columns projected, we will end 
up with a long list with mostly Nulls.
A LinkedHashMap (preserves Insertion order) where the key is position and value 
is the SlotRef would be better suited despite the cpu cost of hashing.  In 
general, in a query planner,  memory is the most precious commodity since the 
plan search space can be large, so anything we can do to reduce memory 
footprint would be preferred.
That said, I would be ok if this is done in a subsequent patch, just keep track 
through a Jira.


http://gerrit.cloudera.org:8080/#/c/21109/15/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/21109/15/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@52
PS15, Line 52: an experimental
nit: remove experimental


http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/21109/16/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@148
PS16, Line 148:   if (e == null) {
Why is this checking for null here ? e is already being referenced above.



--
To view, visit http://gerrit.cloudera.org:8080/21109
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98
Gerrit-Change-Number: 21109
Gerrit-PatchSet: 16
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Sat, 30 Mar 2024 01:56:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12872: Use Calcite for ...

2024-03-18 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21109 )

Change subject: IMPALA-12872: Use Calcite for ...
..


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/21109/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21109/5//COMMIT_MSG@7
PS5, Line 7: IMPALA-12872: Use Calcite for ...
nit: combine the two lines. it seems short enough.  How about 'Use Calcite for 
optimization - part 1: simple queries'


http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/pom.xml
File java/experimental-planner/pom.xml:

http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/pom.xml@29
PS5, Line 29:   experimental-planner
you could call it calcite-planner since there's a clear association. I would 
think that if there was a configuration option introduced (at a later time) to 
enable this planner, it would likely be something like 'use_new_planner' or 
'use_calcite_planner' rather than use_experimental_planner.


http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/rel/node/ParentPlanRelContext.java
File 
java/experimental-planner/src/main/java/org/apache/impala/calcite/rel/node/ParentPlanRelContext.java:

http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/rel/node/ParentPlanRelContext.java@33
PS5, Line 33: analyzer
nit:use underscore to be consistent


http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/rel/node/ParentPlanRelContext.java@36
PS5, Line 36: ctx
nit: use underscore to be consistent with other variable names.


http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/rel/node/ParentPlanRelContext.java@39
PS5, Line 39:   public final Class parentClass_;
We should use the actual base class name (RelNode/ImpalaPlanRel) instead of the 
generic Class.


http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/schema/CalciteDb.java
File 
java/experimental-planner/src/main/java/org/apache/impala/calcite/schema/CalciteDb.java:

http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/schema/CalciteDb.java@31
PS5, Line 31: tableMap
nit: use underscore


http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/schema/CalciteTable.java
File 
java/experimental-planner/src/main/java/org/apache/impala/calcite/schema/CalciteTable.java:

http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/schema/CalciteTable.java@78
PS5, Line 78:   private List foreignKeys;
This field is not used in this implementation.  Is it for future use ?


http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/service/CalciteOptimizer.java
File 
java/experimental-planner/src/main/java/org/apache/impala/calcite/service/CalciteOptimizer.java:

http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/service/CalciteOptimizer.java@87
PS5, Line 87: LOG.debug(getDebugString(resultObject));
nit: should this logging be in an else block ?


http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/util/NotSupported.java
File 
java/experimental-planner/src/main/java/org/apache/impala/calcite/util/NotSupported.java:

http://gerrit.cloudera.org:8080/#/c/21109/5/java/experimental-planner/src/main/java/org/apache/impala/calcite/util/NotSupported.java@33
PS5, Line 33:
nit: missing 'are' .  Here and subsequent messages.



--
To view, visit http://gerrit.cloudera.org:8080/21109
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98
Gerrit-Change-Number: 21109
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Mon, 18 Mar 2024 22:58:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12847: Expose computeScanRangeLocations and computeStats

2024-02-27 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21077 )

Change subject: IMPALA-12847: Expose computeScanRangeLocations and computeStats
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21077
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia621309c67455bb599f71bec9efc1f67fc085022
Gerrit-Change-Number: 21077
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 28 Feb 2024 07:06:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12847: Expose computeScanRangeLocations and computeStats

2024-02-27 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21077 )

Change subject: IMPALA-12847: Expose computeScanRangeLocations and computeStats
..


Patch Set 2: Code-Review+1

(4 comments)

Thanks for the patch.  A few small comments but changes overall lgtm.

http://gerrit.cloudera.org:8080/#/c/21077/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/21077/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@218
PS2, Line 218: is
nit: 'if' instead of 'is'


http://gerrit.cloudera.org:8080/#/c/21077/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@221
PS2, Line 221: private
Can we make this protected to be consistent with partitions_  member veriable.


http://gerrit.cloudera.org:8080/#/c/21077/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@431
PS2, Line 431: getSampledPartitions
It would be preferable to name this method such that it reflects it can return 
either sampled or raw partitions.  The current naming gives a first impression 
of returning only sampled partitions.
How about  getSampledOrRawPartitions().


http://gerrit.cloudera.org:8080/#/c/21077/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1161
PS2, Line 1161: If 'sampleParams_' is not null
Since this method is no longer checking sampledParams_ itself but relying on 
other method, you could modify the comment accordingly.



--
To view, visit http://gerrit.cloudera.org:8080/21077
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia621309c67455bb599f71bec9efc1f67fc085022
Gerrit-Change-Number: 21077
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 28 Feb 2024 01:59:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12462: Update only changed partitions after COMPUTE STATS

2023-09-25 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20505 )

Change subject: IMPALA-12462: Update only changed partitions after COMPUTE STATS
..


Patch Set 5: Code-Review+1

(1 comment)

> Patch Set 4:
>
> (1 comment)

http://gerrit.cloudera.org:8080/#/c/20505/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20505/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1891
PS4, Line 1891: !existingPartStats.equals(partitionStats)
> Added a test for incremental stats.
Sounds good.  Thx for adding the test.  A related point is that we advise not 
running a mix of compute stats and compute incremental stats on the same table 
.. so it is one or the other.



--
To view, visit http://gerrit.cloudera.org:8080/20505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2703e0790d5c25db98ed26f26f6d96281c366a3
Gerrit-Change-Number: 20505
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 25 Sep 2023 14:21:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12462: Update only changed partitiones after COMPUTE STATS

2023-09-24 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20505 )

Change subject: IMPALA-12462: Update only changed partitiones after COMPUTE 
STATS
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20505/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20505/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1891
PS4, Line 1891: !existingPartStats.equals(partitionStats)
There's some hidden cost here for the equals method as it may be doing an 
expensive comparison if incremental stats are present and this would add up for 
thousands of partitions. Any thoughts on that ?That said, the  trade off 
could be acceptable considering the advantage of not performing the partition 
stats update.



-- 
To view, visit http://gerrit.cloudera.org:8080/20505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2703e0790d5c25db98ed26f26f6d96281c366a3
Gerrit-Change-Number: 20505
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 24 Sep 2023 22:38:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12357: Skip scheduling bloom filter from full-build scan

2023-08-28 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20366 )

Change subject: IMPALA-12357: Skip scheduling bloom filter from full-build scan
..


Patch Set 6: Code-Review+1

(1 comment)

Revised patch set LGTM.

http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG@29
PS4, Line 29: thus reducing the bloom filter building,
: aggregation, and evalua
> This patch does not differentiate between the locality of the filter target
Done



--
To view, visit http://gerrit.cloudera.org:8080/20366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I494533bc06da84e606cbd1ae161908089a5e
Gerrit-Change-Number: 20366
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 29 Aug 2023 06:42:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12357: Skip scheduling bloom filter from full-build scan

2023-08-25 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20366 )

Change subject: IMPALA-12357: Skip scheduling bloom filter from full-build scan
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG@22
PS4, Line 22: e any predicate filter
> You mean planner add null checking predicate on join key column? I don't th
Right, currently Impala does not add the IS NOT NULL predicate for the join 
columns.  Hive does add this for inner joins (for outer joins it won't be 
applicable).


http://gerrit.cloudera.org:8080/#/c/20366/4//COMMIT_MSG@29
PS4, Line 29: thus reducing the bloom filter aggregation
: overhead in coordinator
This applies to the remote bloom filters right ? For local filters that are not 
sent to the coordinator, this patch could still potentially help for the high 
fpp case.


http://gerrit.cloudera.org:8080/#/c/20366/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/20366/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@252
PS4, Line 252: private int rank_ = 1;
Is 'rank' the right term here ?  Normally, rank implies some type of comparison 
function.  Perhaps 'level' is more appropriate to indicate level in a subtree.


http://gerrit.cloudera.org:8080/#/c/20366/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@844
PS4, Line 844: Math.max(0.9
For testing purposes or for a backdoor in a real deployment, would it be useful 
to tune this ?  Maybe in some cases we might want higher threshold or if there 
is a bug in the fpp estimation for any reason.  I could also be convinced to 
not add yet another tuning option - so open to a counter argument.



--
To view, visit http://gerrit.cloudera.org:8080/20366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I494533bc06da84e606cbd1ae161908089a5e
Gerrit-Change-Number: 20366
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 26 Aug 2023 00:10:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12282: Refine correlation factor in AggregationNode

2023-07-13 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20194 )

Change subject: IMPALA-12282: Refine correlation factor in AggregationNode
..


Patch Set 2: Code-Review+2

LGTM.


--
To view, visit http://gerrit.cloudera.org:8080/20194
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ffa9e82b83e7c0042bd918ac132668a47505688
Gerrit-Change-Number: 20194
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 14 Jul 2023 00:05:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12282: Refine correlation factor in AggregationNode

2023-07-13 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20194 )

Change subject: IMPALA-12282: Refine correlation factor in AggregationNode
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20194/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/20194/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@687
PS1, Line 687: 
Math.pow(queryOptions.getAgg_mem_correlation_factor(), nonLiteralExprCount);
I missed one detail when reviewing the previous patch (IMPALA-11842) .. the 
exponent should be N-1 rather than N since the correlation factor is 
representing correlation between 2 columns.  If N = 2 and correlation_factor = 
0.2, we should multiply once by 0.2.  If N.= 1, there shouldn't be a need to 
multiply by the correlation factor. What do you think ?



--
To view, visit http://gerrit.cloudera.org:8080/20194
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ffa9e82b83e7c0042bd918ac132668a47505688
Gerrit-Change-Number: 20194
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 13 Jul 2023 23:10:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11842: Improve memory estimation for Aggregate

2023-07-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20104 )

Change subject: IMPALA-11842: Improve memory estimation for Aggregate
..


Patch Set 14: Code-Review+2

Carry forward +1's from Quanlong, Kurt, Abhishek, Aman and bumping to +2.


--
To view, visit http://gerrit.cloudera.org:8080/20104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4b4b2e519ee89f0a13fdb62d0471ee4047f6421
Gerrit-Change-Number: 20104
Gerrit-PatchSet: 14
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 13 Jul 2023 01:14:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11842: Improve memory estimation for Aggregate

2023-07-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20104 )

Change subject: IMPALA-11842: Improve memory estimation for Aggregate
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20104/12/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/20104/12/common/thrift/ImpalaService.thrift@819
PS12, Line 819:  new
> No, the LARGE_AGG_MEM_THRESHOLD will serve as a lower bound of new memory e
Ack



--
To view, visit http://gerrit.cloudera.org:8080/20104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4b4b2e519ee89f0a13fdb62d0471ee4047f6421
Gerrit-Change-Number: 20104
Gerrit-PatchSet: 13
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 13 Jul 2023 01:08:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11842: Improve memory estimation for Aggregate

2023-07-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20104 )

Change subject: IMPALA-11842: Improve memory estimation for Aggregate
..


Patch Set 12: Code-Review+1

(4 comments)

Just a few comments below. Overall, looks good !

http://gerrit.cloudera.org:8080/#/c/20104/12/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/20104/12/common/thrift/ImpalaService.thrift@816
PS12, Line 816: node
nit: to be precise can we say 'aggregation node's memory estimate' is deemed ..


http://gerrit.cloudera.org:8080/#/c/20104/12/common/thrift/ImpalaService.thrift@817
PS12, Line 817: node
nit: same as above


http://gerrit.cloudera.org:8080/#/c/20104/12/common/thrift/ImpalaService.thrift@819
PS12, Line 819: lower
shouldn't this be 'higher' ?  i.e new memory estimate will not be higher than 
the LARGE_AGG_MEM_THRESHOLD ..


http://gerrit.cloudera.org:8080/#/c/20104/12/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/20104/12/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@682
PS12, Line 682: aggInfo.getGroupingExprs().size()
In some queries, we have seen several constant literals also included in the 
grouping columns.
E.g   GROUP BY  col1, col2, 'abc', 'def'
or for the ROLLUP case, there could be NULL:
e.g GROUP BY col1,  col2, NULL, NULL
In such cases, the correlation adjustment should ideally only consider the 
actual columns, not the constants or NULLs.
If you agree, it is ok to do this in a follow up  patch since the tests are 
already passing and I don't think this should hold it up.



--
To view, visit http://gerrit.cloudera.org:8080/20104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4b4b2e519ee89f0a13fdb62d0471ee4047f6421
Gerrit-Change-Number: 20104
Gerrit-PatchSet: 12
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 12 Jul 2023 23:20:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12183: Fix cardinality clamping across aggregation phases

2023-06-11 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20009 )

Change subject: IMPALA-12183: Fix cardinality clamping across aggregation phases
..


Patch Set 5: Code-Review+2

(1 comment)

LGTM. Since existing comments have been addressed, bumping to +2.
Thanks Riza for the patch.

http://gerrit.cloudera.org:8080/#/c/20009/5/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q67.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q67.test:

http://gerrit.cloudera.org:8080/#/c/20009/5/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q67.test@512
PS5, Line 512: mem
> This is because prior to this patch, each agg class is capped with inputCar
Yes, each agg class in the merge phase of the agg is consuming corresponding 
per-class rows from the first phase. Additional discussion here: 
https://gerrit.cloudera.org/c/20009/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test#322



--
To view, visit http://gerrit.cloudera.org:8080/20009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1d414fe56b027f887c7f901d8a6799a388b16b95
Gerrit-Change-Number: 20009
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sun, 11 Jun 2023 23:33:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12183: Fix cardinality clamping across aggregation phases

2023-06-07 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20009 )

Change subject: IMPALA-12183: Fix cardinality clamping across aggregation phases
..


Patch Set 5: Code-Review+1

(1 comment)

LGTM.  Will bump the vote once the tests are completed and also give other 
reviewers some time to review.

http://gerrit.cloudera.org:8080/#/c/20009/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test:

http://gerrit.cloudera.org:8080/#/c/20009/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test@322
PS4, Line 322: 3373800
> I put some debugging logs and these are the num group estimates and the cap
Sounds good.  Just expanding a bit on the overall processing cost of this 
operator so we have a common understanding of this:
 - The UNION below the AGGREGATE #127 produces 276K rows.
 - AGGREGATE #127 has 5 agg classes - each class will internally process all 
276K rows but due to the group-by each class produces <= 276K rows and the 
total output cardinality of AGGREGATE #127 is 562K rows.  This corresponds to 
the processing cost of 8308890.  i.e processing 276K * 5 = 1380K rows costs 
8308890
 - AGGREGATE #215 is consuming the already grouped output per-class.  Class 0 
consumes Class 0 coming from #127 etc. so applying the per class cardinality 
you get total cost 3373800 which is 40% of the cost of #127.



--
To view, visit http://gerrit.cloudera.org:8080/20009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1d414fe56b027f887c7f901d8a6799a388b16b95
Gerrit-Change-Number: 20009
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 07 Jun 2023 18:03:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12183: Fix cardinality clamping across aggregation phases

2023-06-06 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20009 )

Change subject: IMPALA-12183: Fix cardinality clamping across aggregation phases
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20009/4/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/20009/4/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@98
PS4, Line 98: the either of
nit: either the input ..


http://gerrit.cloudera.org:8080/#/c/20009/4/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@104
PS4, Line 104: content
nit: 'contain'


http://gerrit.cloudera.org:8080/#/c/20009/4/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@253
PS4, Line 253: unknownEstimate = true;
If we get an unknown estimate for any one aggregate class, even though the 
final cardinality will be returned as -1, we are continuing the iteration.  Is 
it to initialize the num groups for each class ? Otherwise we could break out 
of the loop.


http://gerrit.cloudera.org:8080/#/c/20009/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test:

http://gerrit.cloudera.org:8080/#/c/20009/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test@322
PS4, Line 322: 3373800
Could you clarify how this cost changed to 1/5th of the original cost ? There 
are 5 aggregation classes so effectively the aggregation is working 
proportionally more.  Similar changes are there in a few other tests.



--
To view, visit http://gerrit.cloudera.org:8080/20009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1d414fe56b027f887c7f901d8a6799a388b16b95
Gerrit-Change-Number: 20009
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Wed, 07 Jun 2023 01:48:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12097: WITH CLAUSE should be skipped when optimizing COUNT(*) query on Iceberg table

2023-04-28 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19811 )

Change subject: IMPALA-12097: WITH CLAUSE should be skipped when optimizing 
COUNT(*) query on Iceberg table
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b21cbea79be77f2ea8490bd7f7b2f62063eb0e4
Gerrit-Change-Number: 19811
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 28 Apr 2023 21:14:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12097: WITH CLAUSE should be skipped when optimizing COUNT(*) query on Iceberg table

2023-04-27 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19811 )

Change subject: IMPALA-12097: WITH CLAUSE should be skipped when optimizing 
COUNT(*) query on Iceberg table
..


Patch Set 1: Code-Review+1

(3 comments)

A few minor comments below.

http://gerrit.cloudera.org:8080/#/c/19811/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19811/1//COMMIT_MSG@9
PS1, Line 9: start
nit: 'star' instead of 'start'


http://gerrit.cloudera.org:8080/#/c/19811/1//COMMIT_MSG@15
PS1, Line 15:  * Add e2e tests
Were you also planning to add a planner test for the WITH clause ?  It would be 
useful to have 1 such test.


http://gerrit.cloudera.org:8080/#/c/19811/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/19811/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@1437
PS1, Line 1437: optimizePlainCountStarQuery
nit: Since this function is specifically meant for Iceberg tables, can we 
rename this to optimizePlainCountStarQueryForIceberg() ? It is slightly verbose 
but makes the intent clearer to avoid confusion.



--
To view, visit http://gerrit.cloudera.org:8080/19811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b21cbea79be77f2ea8490bd7f7b2f62063eb0e4
Gerrit-Change-Number: 19811
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 27 Apr 2023 19:01:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-11 Thread Aman Sinha (Code Review)
Aman Sinha has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19682 )

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..

IMPALA-12006: Improve cardinality estimation for joins involving multiple 
conjuncts

When an inner or outer join involves conjuncts such as the following:
 SELECT * FROM t1 inner join
  (SELECT a2, MAX(b2) as max_b2 FROM t2 GROUP BY a2) s2
  ON t1.a1 = s2.a2 AND t1.b1 = s2.max_b2
the stats for the second conjunct involving the MAX expression gets
added to the 'otherEqJoinStats' list. These stats were being used
only when no other equijoin conjunct (involving base columns) was
present but not for the above case, thus leading to over-estimation.

The main change in this patch is an improvement to the cardinality
estimate for such cases by considering the conjuncts in the
'otherEqJoinStats list' in combination with the equijoin conjuncts.

A second change is the addition of a new query option which is a
floating point number between 0 to 1 inclusive:
  join_selectivity_correlation_factor
It defaults to 0.0 which preserves the existing behavior of using
the Minimum selectivity of the conjuncts. Given multiple
join conjuncts C1,C2..Cn having comparable selectivity, we sometimes
see over-estimation by several orders of magnitude. Setting the above
query option to a value higher than 0 first computes the product of the
selectivities: sel(C1) * sel(C2) .. sel(Cn)  and then scales it by
dividing by the join_selectivity_correlation_factor.

Note that this setting applies to all the joins in the query so it has
it's trade offs. Another approach is to use per join hints but there
are limitations of the hints approach too.

Testing:
 - Added planner tests with a combination of outer and inner join with
   different values of join_selectivity_correlation_factor
 - Ran PlannerTest and TpcdsPlannerTest

Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Reviewed-on: http://gerrit.cloudera.org:8080/19682
Tested-by: Impala Public Jenkins 
Reviewed-by: Kurt Deschler 
Reviewed-by: Quanlong Huang 
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
6 files changed, 297 insertions(+), 17 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Kurt Deschler: Looks good to me, but someone else must approve
  Quanlong Huang: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-10 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19682 )

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java@496
PS6, Line 496:   if (corrfactor > 0) cumulativeSel *= (((double) 
joinCard/lhsCard)/rhsCard);
> This would be more readable as (double) joinCard/(lhsCard*rhsCard);
Not doing (lhsCard * rhsCard) was intentional here and and other places.  This 
was to avoid multiplication overflow for large values since both are long 
datatype.  In a prior version of the patch I was doing a checkedMultiply() 
which checks for overflow but changed it to handle the lhsCard and rhsCard 
separately based on Csaba's comment.

Regarding applying the correlation factor, I had a slightly different 
interpretation of this.
If you have 4 equijoin conjuncts  c1, c2, c3, c4 and each one's selectivity is 
s1, s2, s3, s4,  the correlation factor is not necessarily representing a 
pair-wise correlation.  If that was the case, we would get (s1 * s2)/CF for the 
first two then considering s3, we would get. (((s1 * s2)/CF) * s3)/CF  and so 
on.. but c1 and c2 may not be correlated to each other at all. Only one of them 
might be correlated to c3.   Having a single denominator allows us to set it to 
whatever factor we want to represent whether 2 or 3 or all 4 are correlated. It 
also makes it easier to articulate in the documentation.



--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 10 Apr 2023 23:37:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-10 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19682 )

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java@432
PS6, Line 432: cumulativeSel
> nit: "cumulativeSel"
Done


http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java@487
PS6, Line 487: cumulativeSel
> nit: "cumulativeSel"
Done



--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 7
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 10 Apr 2023 14:48:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-10 Thread Aman Sinha (Code Review)
Hello Quanlong Huang, Kurt Deschler, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19682

to look at the new patch set (#7).

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..

IMPALA-12006: Improve cardinality estimation for joins involving multiple 
conjuncts

When an inner or outer join involves conjuncts such as the following:
 SELECT * FROM t1 inner join
  (SELECT a2, MAX(b2) as max_b2 FROM t2 GROUP BY a2) s2
  ON t1.a1 = s2.a2 AND t1.b1 = s2.max_b2
the stats for the second conjunct involving the MAX expression gets
added to the 'otherEqJoinStats' list. These stats were being used
only when no other equijoin conjunct (involving base columns) was
present but not for the above case, thus leading to over-estimation.

The main change in this patch is an improvement to the cardinality
estimate for such cases by considering the conjuncts in the
'otherEqJoinStats list' in combination with the equijoin conjuncts.

A second change is the addition of a new query option which is a
floating point number between 0 to 1 inclusive:
  join_selectivity_correlation_factor
It defaults to 0.0 which preserves the existing behavior of using
the Minimum selectivity of the conjuncts. Given multiple
join conjuncts C1,C2..Cn having comparable selectivity, we sometimes
see over-estimation by several orders of magnitude. Setting the above
query option to a value higher than 0 first computes the product of the
selectivities: sel(C1) * sel(C2) .. sel(Cn)  and then scales it by
dividing by the join_selectivity_correlation_factor.

Note that this setting applies to all the joins in the query so it has
it's trade offs. Another approach is to use per join hints but there
are limitations of the hints approach too.

Testing:
 - Added planner tests with a combination of outer and inner join with
   different values of join_selectivity_correlation_factor
 - Ran PlannerTest and TpcdsPlannerTest

Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
6 files changed, 297 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/19682/7
--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 7
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-10 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19682 )

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java@496
PS6, Line 496:   if (corrfactor > 0) cumulative_sel *= (((double) 
joinCard/lhsCard)/rhsCard);
> Should we consider corrfactor here as well so that multiple conditions in t
On line 500 we are dividing the cumulative selectivity by the corrfactor. The 
idea is to multiply all the individual selectivities and divide only once.  Let 
me know if you had some other intent.



--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 6
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 10 Apr 2023 14:26:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-08 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19682 )

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/19682/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19682/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java@421
PS1, Line 421:   slots.lhsNumRows(), slots.rhsNumRows(), lhsCard, 
rhsCard));
> Correct, the result will be somewhere between the multiplied selectivity an
Done


http://gerrit.cloudera.org:8080/#/c/19682/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19682/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@414
PS3, Line 414: kState(!eqJoinC
> Hitting the limit seems possible here with big RHS+LHS and could lead to we
Ack.  Changed it to divide by the lhsCard, rhsCard.


http://gerrit.cloudera.org:8080/#/c/19682/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@434
PS3, Line 434:
> This seems to be the same logic as in the loop above. I think that it would
Refactored this.


http://gerrit.cloudera.org:8080/#/c/19682/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@443
PS3, Line 443: if (corrfactor > 0) {
> This could be moved to line 430.
I think we still need to check the final result is >=0 since result is getting 
modified at a couple of places.


http://gerrit.cloudera.org:8080/#/c/19682/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@487
PS3, Line 487: double cumulative_sel = 1.0;
> Should we consider multiply the selectivity here too? (if the query option
Yes, thanks for pointing this.  Added the same logic here.



--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 6
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sun, 09 Apr 2023 06:25:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-08 Thread Aman Sinha (Code Review)
Hello Quanlong Huang, Kurt Deschler, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19682

to look at the new patch set (#6).

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..

IMPALA-12006: Improve cardinality estimation for joins involving multiple 
conjuncts

When an inner or outer join involves conjuncts such as the following:
 SELECT * FROM t1 inner join
  (SELECT a2, MAX(b2) as max_b2 FROM t2 GROUP BY a2) s2
  ON t1.a1 = s2.a2 AND t1.b1 = s2.max_b2
the stats for the second conjunct involving the MAX expression gets
added to the 'otherEqJoinStats' list. These stats were being used
only when no other equijoin conjunct (involving base columns) was
present but not for the above case, thus leading to over-estimation.

The main change in this patch is an improvement to the cardinality
estimate for such cases by considering the conjuncts in the
'otherEqJoinStats list' in combination with the equijoin conjuncts.

A second change is the addition of a new query option which is a
floating point number between 0 to 1 inclusive:
  join_selectivity_correlation_factor
It defaults to 0.0 which preserves the existing behavior of using
the Minimum selectivity of the conjuncts. Given multiple
join conjuncts C1,C2..Cn having comparable selectivity, we sometimes
see over-estimation by several orders of magnitude. Setting the above
query option to a value higher than 0 first computes the product of the
selectivities: sel(C1) * sel(C2) .. sel(Cn)  and then scales it by
dividing by the join_selectivity_correlation_factor.

Note that this setting applies to all the joins in the query so it has
it's trade offs. Another approach is to use per join hints but there
are limitations of the hints approach too.

Testing:
 - Added planner tests with a combination of outer and inner join with
   different values of join_selectivity_correlation_factor
 - Ran PlannerTest and TpcdsPlannerTest

Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
6 files changed, 297 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/19682/6
--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 6
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-08 Thread Aman Sinha (Code Review)
Hello Quanlong Huang, Kurt Deschler, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19682

to look at the new patch set (#5).

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..

IMPALA-12006: Improve cardinality estimation for joins involving multiple 
conjuncts

When an inner or outer join involves conjuncts such as the following:
 SELECT * FROM t1 inner join
  (SELECT a2, MAX(b2) as max_b2 FROM t2 GROUP BY a2) s2
  ON t1.a1 = s2.a2 AND t1.b1 = s2.max_b2
the stats for the second conjunct involving the MAX expression gets
added to the 'otherEqJoinStats' list. These stats were being used
only when no other equijoin conjunct (involving base columns) was
present but not for the above case, thus leading to over-estimation.

The main change in this patch is an improvement to the cardinality
estimate for such cases by considering the conjuncts in the
'otherEqJoinStats list' in combination with the equijoin conjuncts.

A second change is the addition of a new query option which is a
floating point number between 0 to 1 inclusive:
  join_selectivity_correlation_factor
It defaults to 0.0 which preserves the existing behavior of using
the Minimum selectivity of the conjuncts. Given multiple
join conjuncts C1,C2..Cn having comparable selectivity, we sometimes
see over-estimation by several orders of magnitude. Setting the above
query option to a value higher than 0 first computes the product of the
selectivities: sel(C1) * sel(C2) .. sel(Cn)  and then scales it by
dividing by the join_selectivity_correlation_factor.

Note that this setting applies to all the joins in the query so it has
it's trade offs. Another approach is to use per join hints but there
are limitations of the hints approach too.

Testing:
 - Added planner tests with a combination of outer and inner join with
   different values of join_selectivity_correlation_factor
 - Ran PlannerTest and TpcdsPlannerTest

Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
6 files changed, 298 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/19682/5
--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 5
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-08 Thread Aman Sinha (Code Review)
Hello Quanlong Huang, Kurt Deschler, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19682

to look at the new patch set (#4).

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..

IMPALA-12006: Improve cardinality estimation for joins involving multiple 
conjuncts

When an inner or outer join involves conjuncts such as the following:
 SELECT * FROM t1 inner join
  (SELECT a2, MAX(b2) as max_b2 FROM t2 GROUP BY a2) s2
  ON t1.a1 = s2.a2 AND t1.b1 = s2.max_b2
the stats for the second conjunct involving the MAX expression gets
added to the 'otherEqJoinStats' list. These stats were being used
only when no other equijoin conjunct (involving base columns) was
present but not for the above case, thus leading to over-estimation.

The main change in this patch is an improvement to the cardinality
estimate for such cases by considering the conjuncts in the
'otherEqJoinStats list' in combination with the equijoin conjuncts.

A second change is the addition of a new query option which is a
floating point number between 0 to 1 inclusive:
  join_selectivity_correlation_factor
It defaults to 0.0 which preserves the existing behavior of using
the Minimum selectivity of the conjuncts. Given multiple
join conjuncts C1,C2..Cn having comparable selectivity, we sometimes
see over-estimation by several orders of magnitude. Setting the above
query option to a value higher than 0 first computes the product of the
selectivities: sel(C1) * sel(C2) .. sel(Cn)  and then scales it by
dividing by the join_selectivity_correlation_factor.

Note that this setting applies to all the joins in the query so it has
it's trade offs. Another approach is to use per join hints but there
are limitations of the hints approach too.

Testing:
 - Added planner test with a combination of outer and inner join with
   different values of join_selectivity_correlation_factor
 - Ran PlannerTest and TpcdsPlannerTest

Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
6 files changed, 294 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/19682/4
--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 4
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-06 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19682 )

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19682/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19682/3//COMMIT_MSG@30
PS3, Line 30: consider per join hints but in a complex query (or a view that is
> It might also be hard to tune the query option when the query has hundreds
Yes, there are tradeoffs in both the query option approach and the hints 
approach.  Join cardinality  estimation is a hard problem and we do need to 
look at better ways of doing this in the future using learning models or other 
techniques.  Of course, better stats will always help.

In terms of hints, couple of things to point out:
1.  In a lot of cases, the queries may come from a BI tool and are generated 
queries instead of hand written and users may not have control over setting the 
hints.
2.  For views:
CREATE VIEW v1 AS ….   t1  inner join t2 on  a1 = a2 AND b1 = b2
q1: select * from v1 WHERE 
q2: select * from v1 WHERE 
The predicates p1, p2 may be pushed down below the join/one side of the join or 
both sides .. also the behavior of pushdown depends on inner vs outer joins.  
The static hints at the join level level will not be aware of such 
transformations.

We could use both the query option and hints for more flexibility.



--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 07 Apr 2023 03:42:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-05 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19682 )

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19682/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19682/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java@330
PS1, Line 330:   return getGenericJoinCardinality(eqJoinConjunctSlots, 
otherEqJoinStats, lhsCard,
> line too long (95 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/19682/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java@421
PS1, Line 421: if 
(analyzer.getQueryOptions().isUse_multiplied_selectivity_for_join()) {
> +1 for making this more flexible
Thanks for the suggestion.  So conceptually a coefficient which is less than 
1.0 (and > 0) would mean that the combined selectivity: (sel1 x sel2 x 
1/coefficient) is *higher* than what the current formula that does the 
multiplication of selectivity would compute.  I doubt we will need to go in the 
other direction, i.e lower than the product of the selectivities.
Let me modify this query option.



--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 06 Apr 2023 00:58:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-03 Thread Aman Sinha (Code Review)
Hello Quanlong Huang, Kurt Deschler, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19682

to look at the new patch set (#2).

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..

IMPALA-12006: Improve cardinality estimation for joins involving multiple 
conjuncts

When an inner or outer join involves conjuncts such as the following:
 SELECT * FROM t1 inner join
  (SELECT a2, MAX(b2) as max_b2 FROM t2 GROUP BY a2) s2
  ON t1.a1 = s2.a2 AND t1.b1 = s2.max_b2
the stats for the second conjunct involving the MAX expression gets
added to the 'otherEqJoinStats' list. These stats were being used
only when no other equijoin conjunct (involving base columns) was
present but not for the above case, thus leading to over-estimation.

The main change in this patch is an improvement to the cardinality
estimate for such cases by considering the conjuncts in the
'otherEqJoinStats list' in combination with the equijoin conjuncts.

A second change is the addition of a new boolean query option:
   use_multiplied_selectivity_for_join
It defaults to false. Given multiple independent join conjuncts
C1,C2..Cn having comparable selectivity, we sometimes see over-estimation
by several orders of magnitude. Setting the above query option to
TRUE allows the individual selectivities (fraction between 0 and 1)
of the conjuncts to be multiplied instead of taking the minimum. Note
that this applies to all the joins in the query. In the future we can
consider per join hints but in a complex query (or a view that is
shared among users), it is cumbersome to identify exactly where to
set hints.

Testing:
 - Added planner test with a combination of outer and inner join
  (with use_multiplied_selectivity_for_join enabled and disabled)
 - Ran PlannerTest and TpcdsPlannerTest
 TODO: run additional tests

Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
6 files changed, 139 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/19682/2
--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts

2023-04-03 Thread Aman Sinha (Code Review)
Aman Sinha has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19682


Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
..

IMPALA-12006: Improve cardinality estimation for joins involving multiple 
conjuncts

When an inner or outer join involves conjuncts such as the following:
 SELECT * FROM t1 inner join
  (SELECT a2, MAX(b2) as max_b2 FROM t2 GROUP BY a2) s2
  ON t1.a1 = s2.a2 AND t1.b1 = s2.max_b2
the stats for the second conjunct involving the MAX expression gets
added to the 'otherEqJoinStats' list. These stats were being used
only when no other equijoin conjunct (involving base columns) was
present but not for the above case, thus leading to over-estimation.

The main change in this patch is an improvement to the cardinality
estimate for such cases by considering the conjuncts in the
'otherEqJoinStats list' in combination with the equijoin conjuncts.

A second change is the addition of a new boolean query option:
   use_multiplied_selectivity_for_join
It defaults to false. Given multiple independent join conjuncts
C1,C2..Cn having comparable selectivity, we sometimes see over-estimation
by several orders of magnitude. Setting the above query option to
TRUE allows the individual selectivities (fraction between 0 and 1)
of the conjuncts to be multiplied instead of taking the minimum. Note
that this applies to all the joins in the query. In the future we can
consider per join hints but in a complex query (or a view that is
shared among users), it is cumbersome to identify exactly where to
set hints.

Testing:
 - Added planner test with a combination of outer and inner join
  (with use_multiplied_selectivity_for_join enabled and disabled)
 - Ran PlannerTest and TpcdsPlannerTest
 TODO: run additional tests

Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
6 files changed, 137 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/19682/1
--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha 


[Impala-ASF-CR] IMPALA-11946: Add Thrift HTTP support for external frontend

2023-03-08 Thread Aman Sinha (Code Review)
Aman Sinha has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19537 )

Change subject: IMPALA-11946: Add Thrift HTTP support for external frontend
..

IMPALA-11946: Add Thrift HTTP support for external frontend

- Add enable_external_fe_http flag that defaults to false
  - When true the external frontend service (external_fe_port) will
  expect clients to use http transport.
  - When false the external frontend service will expect binary
  transport.

- Add tests for basic external frontend functionality
- Add test to ensure the non-external frontend services do not expose
  the ExecutePlannedStatement interface.

Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Reviewed-on: http://gerrit.cloudera.org:8080/19537
Tested-by: Impala Public Jenkins 
Reviewed-by: Aman Sinha 
---
M be/src/service/impala-server.cc
M fe/src/test/java/org/apache/impala/customcluster/CustomClusterRunner.java
A fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java
3 files changed, 220 insertions(+), 7 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Aman Sinha: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/19537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Gerrit-Change-Number: 19537
Gerrit-PatchSet: 6
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: John Sherman 


[Impala-ASF-CR] IMPALA-11946: Add Thrift HTTP support for external frontend

2023-03-08 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19537 )

Change subject: IMPALA-11946: Add Thrift HTTP support for external frontend
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Gerrit-Change-Number: 19537
Gerrit-PatchSet: 5
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: John Sherman 
Gerrit-Comment-Date: Thu, 09 Mar 2023 02:02:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11946: Add Thrift HTTP support for external frontend

2023-03-08 Thread Aman Sinha (Code Review)
Aman Sinha has removed a vote on this change.

Change subject: IMPALA-11946: Add Thrift HTTP support for external frontend
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/19537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Gerrit-Change-Number: 19537
Gerrit-PatchSet: 4
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: John Sherman 


[Impala-ASF-CR] IMPALA-11946: Add Thrift HTTP support for external frontend

2023-03-08 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19537 )

Change subject: IMPALA-11946: Add Thrift HTTP support for external frontend
..


Patch Set 4: Verified+1

The 2 test failures in the first run are known flaky tests. The re-triggered 
run https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/7105/. was 
successful.


--
To view, visit http://gerrit.cloudera.org:8080/19537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Gerrit-Change-Number: 19537
Gerrit-PatchSet: 4
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: John Sherman 
Gerrit-Comment-Date: Wed, 08 Mar 2023 20:04:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11946: Add Thrift HTTP support for external frontend

2023-03-06 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19537 )

Change subject: IMPALA-11946: Add Thrift HTTP support for external frontend
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19537/2/fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java
File fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java:

http://gerrit.cloudera.org:8080/#/c/19537/2/fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java@22
PS2, Line 22: import org.apache.hive.service.rpc.thrift.TCloseSessionReq;
> Done
I thought that such imports would fail checkstyle but perhaps checkstyle is not 
enforced for the tests. The FE source code does not have wildcard imports. In 
any case you have made the change so all good !



--
To view, visit http://gerrit.cloudera.org:8080/19537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Gerrit-Change-Number: 19537
Gerrit-PatchSet: 3
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: John Sherman 
Gerrit-Comment-Date: Mon, 06 Mar 2023 22:17:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11946: Add Thrift HTTP support for external frontend

2023-03-06 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19537 )

Change subject: IMPALA-11946: Add Thrift HTTP support for external frontend
..


Patch Set 2: Code-Review+1

(2 comments)

It would be a useful enhancement.  Code changes LGTM (couple of nits).  Any 
thoughts on the potential performance latency comparison HTTP is used vs binary 
? I assume there are existing thrift-over-http vs thrift-over-binary published 
benchmarks.

http://gerrit.cloudera.org:8080/#/c/19537/2/fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java
File fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java:

http://gerrit.cloudera.org:8080/#/c/19537/2/fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java@22
PS2, Line 22: import org.apache.hive.service.rpc.thrift.*;
nit: I think the wildcard is not allowed for imports.


http://gerrit.cloudera.org:8080/#/c/19537/2/fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java@46
PS2, Line 46: hs2Port
nit: maybe call this hs2BinaryPort to easily distinguish with the other one.



--
To view, visit http://gerrit.cloudera.org:8080/19537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Gerrit-Change-Number: 19537
Gerrit-PatchSet: 2
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Mar 2023 20:37:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11960: Fix constant propagation from TIMESTAMP to DATE

2023-03-03 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19572 )

Change subject: IMPALA-11960: Fix constant propagation from TIMESTAMP to DATE
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19572
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1938bf5e91057b220daf8a1892940f674aac3d68
Gerrit-Change-Number: 19572
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 03 Mar 2023 15:41:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11960: Fix constant propagation from TIMESTAMP to DATE

2023-03-02 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19572 )

Change subject: IMPALA-11960: Fix constant propagation from TIMESTAMP to DATE
..


Patch Set 2: Code-Review+1

(2 comments)

Thanks Csaba for fixing this. The code change LGTM. One suggestion on adding a 
test case.

http://gerrit.cloudera.org:8080/#/c/19572/2/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/19572/2/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@439
PS2, Line 439: with dp_view as
This test is good.  Could you also add one variant of this which has a mix of  
>= and < in the WHERE clause:
e.g
where timestamp_col >=  '2009-01-01 01:00:00' AND timestamp_col <  '2009-02-01 
01:00:00'

After the rewrite the >= is expected to be preserved and the < is going to 
change to <=. This is just a sanity check.


http://gerrit.cloudera.org:8080/#/c/19572/2/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
File 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/19572/2/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@14
PS2, Line 14: select count(*), sum(int_col) from alltypes_date_partition
Same suggestion as above for the end-to-end test.



--
To view, visit http://gerrit.cloudera.org:8080/19572
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1938bf5e91057b220daf8a1892940f674aac3d68
Gerrit-Change-Number: 19572
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 02 Mar 2023 22:43:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table

2023-02-22 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19430 )

Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table
..


Patch Set 13:

> Patch Set 13:
>
> (1 comment)
>
> > Patch Set 13:
> >
> > (1 comment)

The problem is that it would not be practical to check the block locations for 
for potential relocations when doing the query planning.  Given N blocks in a 
bucket for one table and M blocks for the second table, it would be O(N+M) time 
to decide which distribution method to use. This would add up depending on the 
number of joins in the query. We really want to 'pin' the location but AFAIK 
HDFS does not allow us to do that. Other systems such as MemSQL that do bucket 
join don't have to worry about this since the data is memory resident.


--
To view, visit http://gerrit.cloudera.org:8080/19430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316
Gerrit-Change-Number: 19430
Gerrit-PatchSet: 13
Gerrit-Owner: Baike Xia 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Baike Xia 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 22 Feb 2023 08:21:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11925: Added a new api add write notification log in batch in the MetastoreServiceHandler class

2023-02-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19508 )

Change subject: IMPALA-11925: Added a new api 
add_write_notification_log_in_batch in the MetastoreServiceHandler class
..


Patch Set 1: Code-Review+1

Changes lgtm.  Could you address the various formatting comments ?


--
To view, visit http://gerrit.cloudera.org:8080/19508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia9a35557c2ed79ed0276c4a418b5292fc6dd6194
Gerrit-Change-Number: 19508
Gerrit-PatchSet: 1
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 17 Feb 2023 20:49:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11895: Need accessor methods for third party extension

2023-02-13 Thread Aman Sinha (Code Review)
Aman Sinha has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19469 )

Change subject: IMPALA-11895: Need accessor methods for third party extension
..

IMPALA-11895: Need accessor methods for third party extension

Changed the visibility of some methods needed for a third
party extension.

The methods included in this are for the statements 'compute stats',
'create function', and 'refresh/invalidate'.

Change-Id: Iaea15ef8a96c9509b9cd79df595868fd1db47e83
Reviewed-on: http://gerrit.cloudera.org:8080/19469
Reviewed-by: Aman Sinha 
Tested-by: Aman Sinha 
---
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
4 files changed, 24 insertions(+), 3 deletions(-)

Approvals:
  Aman Sinha: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/19469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iaea15ef8a96c9509b9cd79df595868fd1db47e83
Gerrit-Change-Number: 19469
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Steve Carlin 


[Impala-ASF-CR] IMPALA-11895: Need accessor methods for third party extension

2023-02-13 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19469 )

Change subject: IMPALA-11895: Need accessor methods for third party extension
..


Patch Set 2: Verified+1 Code-Review+2

Carry forward Kurt and mine +1 and bumping to +2. Also carry forward the 
Verified +1 since there hasn't been code change since the last run.
Will wait for a few hours before merging though in case other reviewers have 
further comment.


--
To view, visit http://gerrit.cloudera.org:8080/19469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaea15ef8a96c9509b9cd79df595868fd1db47e83
Gerrit-Change-Number: 19469
Gerrit-PatchSet: 2
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Mon, 13 Feb 2023 20:05:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11895: Need accessor methods for third party extension

2023-02-13 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19469 )

Change subject: IMPALA-11895: Need accessor methods for third party extension
..


Patch Set 2:

> Patch Set 1:
>
> One extra note which I should have included the first time.
>
> Another potential solution would be to copy the PartitionSpec in the get 
> method.  In this case, it might work...unless some other object is a member 
> of PartitionSpec in which case it would have to be a deepCopy (and 
> PartitionSpec then has the responsibility to make sure it does the deep 
> copy).  I just don't like the "mixed message" architecture approach since 
> this isn't done elsewhere and it is better to make PartitionSpec an immutable 
> object.

The first time analyze() is called, the partitionSpec_ is marked Immutable 
according to: 
https://github.com/apache/impala/blame/master/fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java#L146


--
To view, visit http://gerrit.cloudera.org:8080/19469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaea15ef8a96c9509b9cd79df595868fd1db47e83
Gerrit-Change-Number: 19469
Gerrit-PatchSet: 2
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Mon, 13 Feb 2023 19:59:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11895: Need accessor methods for third party extension

2023-02-10 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19469 )

Change subject: IMPALA-11895: Need accessor methods for third party extension
..


Patch Set 1: Code-Review+1

(2 comments)

> Patch Set 1:
>
> (1 comment)
>
> Can we be sure that we don't run into problems if we expose these 
> methods/variables? Won't users be able to change the state in ways the code 
> doesn't expect?

Probably @scarlin can respond to this but I only see accessors have been added 
or changed so none of them modify state.
Carry forward Kurt's +1.  Can bump to +2 once comments are addressed.

http://gerrit.cloudera.org:8080/#/c/19469/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19469/1//COMMIT_MSG@9
PS1, Line 9: viewability
Should this be 'visibility' instead of 'viewability' ?


http://gerrit.cloudera.org:8080/#/c/19469/1//COMMIT_MSG@9
PS1, Line 9: Changed the viewability of some methods needed for a third
> It would be more informative to list the methods that have been changed.
Instead of individual method names, you could mention that the accessor methods 
are related to COMPUTE STATS, CREATE FUNCTION / TABLE, REFRESH commands.



--
To view, visit http://gerrit.cloudera.org:8080/19469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaea15ef8a96c9509b9cd79df595868fd1db47e83
Gerrit-Change-Number: 19469
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Comment-Date: Sat, 11 Feb 2023 00:32:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table

2023-02-07 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19430 )

Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19430/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19430/13//COMMIT_MSG@25
PS13, Line 25: based on hdfs storage are supported.
Thanks for the detailed patch.  I have a high level question about the physical 
co-location for bucketed tables.  HDFS supports re-balancing of data which 
moves data across nodes in a cluster for more uniform distribution [1].  If a 
table is bucketed and the re-balancing occurred after the table was created,  
the bucket shuffle hash join will produce incorrect results. In such cases, we 
would want to not pick this join method and fall back to the regular 
distributed join. Is there a consideration for such scenario in this patch ?

[1] 
https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/scaling-namespaces/topics/hdfs-balancing-data-across-hdfs-cluster.html



--
To view, visit http://gerrit.cloudera.org:8080/19430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316
Gerrit-Change-Number: 19430
Gerrit-PatchSet: 13
Gerrit-Owner: Baike Xia 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Baike Xia 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 08 Feb 2023 02:51:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

2023-01-16 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19422 )

Change subject: IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit 
pushdown
..


Patch Set 1: Code-Review+2

LGTM.  Thanks for the patch. I am surprised we did not already have such a test 
case with mixed conjuncts.


--
To view, visit http://gerrit.cloudera.org:8080/19422
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Gerrit-Change-Number: 19422
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 16 Jan 2023 19:27:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11808: Add support for reload event in catalogD

2023-01-05 Thread Aman Sinha (Code Review)
Aman Sinha has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19378 )

Change subject: IMPALA-11808: Add support for reload event in catalogD
..

IMPALA-11808: Add support for reload event in catalogD

This patch supports a new event called reload event in catalogD.
This event is used to update table/file metadata for a refresh
or invalidate command in other replicas of catalogDs when one
of the replicas issued refresh/invalidate command. HIVE-26838 is
the Hive jira that adds support for this reload event. This
feature is disabled by default using a config
enable_reload_events. To use this feature set this config to
true and impala will be able to fire reload events. The
processing of reload events in the event processor is always
enabled. There is an end-to-end test added for this feature
which currently checks firing/creation of the reload event and
self-event check in the event processor. TODO: end-to-end test
should also test this reload event in the event processor.
There is also a follow up jira IMPALA-11822 to track the
optimization patch for this feature.

Change-Id: Ic62d58837d356dc2113f3c0904228ac9de484136
Reviewed-on: http://gerrit.cloudera.org:8080/19378
Tested-by: Aman Sinha 
Reviewed-by: Andrew Sherman 
Reviewed-by: Aman Sinha 
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/custom_cluster/test_events_custom_configs.py
9 files changed, 337 insertions(+), 2 deletions(-)

Approvals:
  Aman Sinha: Looks good to me, approved; Verified
  Andrew Sherman: Looks good to me, but someone else must approve

--
To view, visit http://gerrit.cloudera.org:8080/19378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic62d58837d356dc2113f3c0904228ac9de484136
Gerrit-Change-Number: 19378
Gerrit-PatchSet: 17
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 


[Impala-ASF-CR] IMPALA-11808: Add support for reload event in catalogD

2023-01-05 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19378 )

Change subject: IMPALA-11808: Add support for reload event in catalogD
..


Patch Set 16: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic62d58837d356dc2113f3c0904228ac9de484136
Gerrit-Change-Number: 19378
Gerrit-PatchSet: 16
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Fri, 06 Jan 2023 00:38:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11808: Add support for reload event in catalogD

2023-01-05 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19378 )

Change subject: IMPALA-11808: Add support for reload event in catalogD
..


Patch Set 16: Verified+1 Code-Review+1

Latest patchset LGTM. Carry forward previous +1 from Quanlong and Csaba and 
adding mine. Will bump to +2 after Andrew has another look.
Also carry forward previous Verified +1 since the latest changes are not 
affecting the code functionality.


--
To view, visit http://gerrit.cloudera.org:8080/19378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic62d58837d356dc2113f3c0904228ac9de484136
Gerrit-Change-Number: 19378
Gerrit-PatchSet: 16
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 05 Jan 2023 23:26:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11808: Add support for reload event in catalogD

2023-01-05 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19378 )

Change subject: IMPALA-11808: Add support for reload event in catalogD
..


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19378/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19378/13//COMMIT_MSG@10
PS13, Line 10: update
nit: to update ..


http://gerrit.cloudera.org:8080/#/c/19378/13//COMMIT_MSG@11
PS13, Line 11: one the
nit: one of the ..



--
To view, visit http://gerrit.cloudera.org:8080/19378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic62d58837d356dc2113f3c0904228ac9de484136
Gerrit-Change-Number: 19378
Gerrit-PatchSet: 13
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 05 Jan 2023 21:41:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11808: Add support for reload event in catalogD

2023-01-05 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19378 )

Change subject: IMPALA-11808: Add support for reload event in catalogD
..


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/19378/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19378/12//COMMIT_MSG@13
PS12, Line 13: addes
nit: adds


http://gerrit.cloudera.org:8080/#/c/19378/12//COMMIT_MSG@14
PS12, Line 14: disable
nit: disabled


http://gerrit.cloudera.org:8080/#/c/19378/12//COMMIT_MSG@17
PS12, Line 17: reload
nit: 'of' reload events..



--
To view, visit http://gerrit.cloudera.org:8080/19378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic62d58837d356dc2113f3c0904228ac9de484136
Gerrit-Change-Number: 19378
Gerrit-PatchSet: 12
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 05 Jan 2023 21:16:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11626, IMPALA-11808: Bump GBN to 36109364 for new HMS events

2023-01-04 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19399 )

Change subject: IMPALA-11626, IMPALA-11808: Bump GBN to 36109364 for new HMS 
events
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I71dd1bf7b6d744f23008d8cefeeff7adcc8afa41
Gerrit-Change-Number: 19399
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Thu, 05 Jan 2023 00:49:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11812: Deduplicate column schema in hmsPartitions

2022-12-30 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19391 )

Change subject: IMPALA-11812: Deduplicate column schema in hmsPartitions
..


Patch Set 7: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19391/6/tests/custom_cluster/test_wide_table_operations.py
File tests/custom_cluster/test_wide_table_operations.py:

http://gerrit.cloudera.org:8080/#/c/19391/6/tests/custom_cluster/test_wide_table_operations.py@36
PS6, Line 36: if cls.exploration_strategy() != 'exhaustive':
> Yeah, the progress is helpful so I added some progress logs for catalogd.
Ack



--
To view, visit http://gerrit.cloudera.org:8080/19391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I511ecca0ace8bea4c24a19a54fb0a75390e50c4d
Gerrit-Change-Number: 19391
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 30 Dec 2022 15:48:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11812: Deduplicate column schema in hmsPartitions

2022-12-29 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19391 )

Change subject: IMPALA-11812: Deduplicate column schema in hmsPartitions
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/19391/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/19391/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1862
PS6, Line 1862: fetchPartitionsByName
> nit. This method probably should be renamed to fetchPartitionsByTable() due
I think the name is ok since the 'ByName' refers to the names of the partitions 
rather than the table. There's a corresponding underlying HMS api called 
get_partitions_by_name() so this is a higher level wrapper for that.


http://gerrit.cloudera.org:8080/#/c/19391/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/19391/6/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@208
PS6, Line 208: LOG.info("Fetched {}/{} partitions for table {}.{}", numDone, 
partNames.size(),
 :   msTbl.getDbName(), msTbl.getTableName());
> This probably will log  number of progressive message
Actually, the number of logged messages would be 
partNames.size()/maxPartitionsPerRpc_ .  If there are 5000 partitions to fetch 
and the batch size per RPC is 1000 partitions, then there will be 5 such 
messages. If the batch size is too small, yes it will generate more messages 
but in general it is not a bad idea to track the progress.  I am ok either way 
but will let Quanlong respond.


http://gerrit.cloudera.org:8080/#/c/19391/6/tests/custom_cluster/test_wide_table_operations.py
File tests/custom_cluster/test_wide_table_operations.py:

http://gerrit.cloudera.org:8080/#/c/19391/6/tests/custom_cluster/test_wide_table_operations.py@36
PS6, Line 36: if cls.exploration_strategy() != 'exhaustive':
This is a good test case and given the high runtime it makes sense to restrict 
this to exhaustive run only. I am wondering if there is any intermediate 
message that should be displayed to the console ..e.g after each of the 
recover_stmt, invalidate_stmt, show_parts_stmt execute, a success message could 
be shown.  Also, how long does table creation itself take ? Developers may want 
to know if things are stuck or progressing. However, this is a nice to have, 
not must do.



--
To view, visit http://gerrit.cloudera.org:8080/19391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I511ecca0ace8bea4c24a19a54fb0a75390e50c4d
Gerrit-Change-Number: 19391
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 29 Dec 2022 20:39:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11812: Deduplicate column schema in hmsPartitions

2022-12-28 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19391 )

Change subject: IMPALA-11812: Deduplicate column schema in hmsPartitions
..


Patch Set 4: Code-Review+1

Thanks for making the code changes. I will bump up to +2 once the tests are 
added.


--
To view, visit http://gerrit.cloudera.org:8080/19391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I511ecca0ace8bea4c24a19a54fb0a75390e50c4d
Gerrit-Change-Number: 19391
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 28 Dec 2022 19:18:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11812: Deduplicate column list in hmsPartitions

2022-12-27 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19391 )

Change subject: IMPALA-11812: Deduplicate column list in hmsPartitions
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/19391/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19391/3//COMMIT_MSG@15
PS3, Line 15: interned.
nit: comma instead of period


http://gerrit.cloudera.org:8080/#/c/19391/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/19391/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1842
PS3, Line 1842: msTbl.getPartitionKeys().size()
On line 1828 this value was stored as numClusteringCols_  so we could use that.


http://gerrit.cloudera.org:8080/#/c/19391/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/19391/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1877
PS3, Line 1877: dereference
nit: did you mean deduplicate instead of dereference ?


http://gerrit.cloudera.org:8080/#/c/19391/3/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:

http://gerrit.cloudera.org:8080/#/c/19391/3/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@210
PS3, Line 210:   cols = deduplicateColumnList(partitions, cols);
I am wondering why we would pass in a null cols list to begin with.  Could we 
use the table level columns similar to what was done for addHmsPartitions()  
(assuming the table is not an incomplete table) ?

   Table tbl = catalog_.getTable(dbName, tblName);
   cols = tbl.getMetaStoreTable().getSd().getCols()


http://gerrit.cloudera.org:8080/#/c/19391/3/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@224
PS3, Line 224:   public static List 
deduplicateColumnList(List partitions,
Since this method is actually force-setting the columns list for the partition, 
I think 'deduplicate' is not the best name.  It gives the impression that a 
duplicate elimination is being done.  Also, same comment as earlier .. what are 
the situations where we expect a null cols list ?  Are there cases where we 
don't have the table level schema and instead have to use the first partition's 
schema ?


http://gerrit.cloudera.org:8080/#/c/19391/3/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@229
PS3, Line 229: p.getSd().getCols()
Could this also return null ?  If so, given a set of partitions, we would have 
first N partitions set to null and once the first non-null is found then the 
rest of the partitions would be set to that.  If we want all of them to be set 
to non-null, we could do 2 passes.



--
To view, visit http://gerrit.cloudera.org:8080/19391
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I511ecca0ace8bea4c24a19a54fb0a75390e50c4d
Gerrit-Change-Number: 19391
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 27 Dec 2022 16:18:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11494: Don't always produce Ranger audit log for authorized query

2022-08-20 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18850 )

Change subject: IMPALA-11494: Don't always produce Ranger audit log for 
authorized query
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18850/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/18850/1/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@200
PS1, Line 200: if (authzOk && !analysisOk) 
auditHandler.getAuthzEvents().clear();
If this condition is met, is it necessary to call flush() below ?



--
To view, visit http://gerrit.cloudera.org:8080/18850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I701652e457d3118f43249e83be933713b17ce48f
Gerrit-Change-Number: 18850
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 16 Aug 2022 00:16:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11494: Don't always produce Ranger audit log for authorized query

2022-08-20 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18850 )

Change subject: IMPALA-11494: Don't always produce Ranger audit log for 
authorized query
..


Patch Set 3: Code-Review+1

+1 from my side.  Will wait for Csaba's response before bumping to +2


--
To view, visit http://gerrit.cloudera.org:8080/18850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I701652e457d3118f43249e83be933713b17ce48f
Gerrit-Change-Number: 18850
Gerrit-PatchSet: 3
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 16 Aug 2022 21:56:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11443: Fix partitoned top-n with -1 input cardinality

2022-07-18 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18744 )

Change subject: IMPALA-11443: Fix partitoned top-n with -1 input cardinality
..


Patch Set 1: Code-Review+1

(1 comment)

The code change lgtm. One comment about the test.

http://gerrit.cloudera.org:8080/#/c/18744/1/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-partitioned-top-n.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-partitioned-top-n.test:

http://gerrit.cloudera.org:8080/#/c/18744/1/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-partitioned-top-n.test@858
PS1, Line 858: # Relies on functional.alltypessmall not having stats and 
running this test with
Can we always rely on functional.alltypessmall not having stats ?
FWIW, PlannerTest.testKudu() creates a test-only external table without stats 
for a particular test.
Something similar could be an option.  I am ok with the alltypessmall if we 
have used it elsewhere (or similar tables) with the no stats assumption.



--
To view, visit http://gerrit.cloudera.org:8080/18744
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7e2dc7df397dc4684e6510a200f381a74c8dd984
Gerrit-Change-Number: 18744
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 18 Jul 2022 20:42:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9670: Fix unloaded views are shown as tables for GET TABLES requests

2022-06-22 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18626 )

Change subject: IMPALA-9670: Fix unloaded views are shown as tables for 
GET_TABLES requests
..


Patch Set 4: Code-Review+2

LGTM. Thanks for fixing this.


--
To view, visit http://gerrit.cloudera.org:8080/18626
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I528bb20272ebdd66a0118c30efc2b0566f2b0e2f
Gerrit-Change-Number: 18626
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yu-Wen Lai 
Gerrit-Comment-Date: Wed, 22 Jun 2022 22:44:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11284: Enable expression rewrites for || and Between predicate in values clause

2022-06-06 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18581 )

Change subject: IMPALA-11284: Enable expression rewrites for || and Between 
predicate in values clause
..


Patch Set 3: Code-Review+2

Carry both +1s and bumping to +2.


--
To view, visit http://gerrit.cloudera.org:8080/18581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I99b8b33bf6468d12b9e26f0a6e744feb7072619c
Gerrit-Change-Number: 18581
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 07 Jun 2022 05:34:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11284: Enable expression rewrites for || and Between predicate in values clause

2022-06-06 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18581 )

Change subject: IMPALA-11284: Enable expression rewrites for || and Between 
predicate in values clause
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/18581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I99b8b33bf6468d12b9e26f0a6e744feb7072619c
Gerrit-Change-Number: 18581
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 07 Jun 2022 05:34:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11323: Don't evaluate constants-only inferred predicates

2022-06-03 Thread Aman Sinha (Code Review)
Aman Sinha has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/18579 )

Change subject: IMPALA-11323: Don't evaluate constants-only inferred predicates
..

IMPALA-11323: Don't evaluate constants-only inferred predicates

IMPALA-10182 fixed the problem of creating inferred predicates when
both sides of an equality predicate came from the same slot.

Inferred predicates also should not be created when both sides
of an equality predicate are constant values which do not have
scan slots.

Change-Id: If1cd4559dda406d2d38703ed594b70b41ed336fd
Reviewed-on: http://gerrit.cloudera.org:8080/18579
Tested-by: Impala Public Jenkins 
Reviewed-by: Aman Sinha 
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M testdata/workloads/functional-query/queries/QueryTest/inline-view.test
3 files changed, 62 insertions(+), 14 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Aman Sinha: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/18579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If1cd4559dda406d2d38703ed594b70b41ed336fd
Gerrit-Change-Number: 18579
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 


[Impala-ASF-CR] IMPALA-11323: Don't evaluate constants-only inferred predicates

2022-06-03 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18579 )

Change subject: IMPALA-11323: Don't evaluate constants-only inferred predicates
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/18579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1cd4559dda406d2d38703ed594b70b41ed336fd
Gerrit-Change-Number: 18579
Gerrit-PatchSet: 4
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Sat, 04 Jun 2022 03:27:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11323: Don't evaluate constants-only inferred predicates

2022-06-03 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18579 )

Change subject: IMPALA-11323: Don't evaluate constants-only inferred predicates
..


Patch Set 3: Code-Review+1

(1 comment)

Thanks for making the changes. Code changes LGTM.
One other thought was about the planner test (similar to what was added in 
https://gerrit.cloudera.org/c/16917/4/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test).
  However, considering that you have the e2e test coverage, I am good with it.

http://gerrit.cloudera.org:8080/#/c/18579/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18579/3//COMMIT_MSG@9
PS3, Line 9: IMPALA-101822 fixed the problem of creating inferred predicates 
when
Still not the right jira id :)



--
To view, visit http://gerrit.cloudera.org:8080/18579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1cd4559dda406d2d38703ed594b70b41ed336fd
Gerrit-Change-Number: 18579
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Fri, 03 Jun 2022 19:23:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11323: Do not infer predicates on equivalent values

2022-06-03 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18579 )

Change subject: IMPALA-11323: Do not infer predicates on equivalent values
..


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/18579/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18579/2//COMMIT_MSG@7
PS2, Line 7: Do not infer predicates on equivalent values
nit: We are still creating the inferred predicate (which is why isInferred() 
for the binary predicate returns true) but we are not going to evaluate it as 
part of the SELECT node.  You could say 'Don't evaluate constants-only inferred 
predicates' or something similar.


http://gerrit.cloudera.org:8080/#/c/18579/2//COMMIT_MSG@9
PS2, Line 9: IMPALA-10812
This is not the right jira id


http://gerrit.cloudera.org:8080/#/c/18579/2/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/18579/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@1645
PS2, Line 1645:* Returns the source expression for this expression.
nit: could you fix the alignment at the end of these 2 lines.


http://gerrit.cloudera.org:8080/#/c/18579/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@1660
PS2, Line 1660: if (slotDesc.isScanSlot()) {
nit: Impala code style put trivial single statement blocks in the same line i.e
  if (slotDesc.isScanSlot()) return slotRef;


http://gerrit.cloudera.org:8080/#/c/18579/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@1677
PS2, Line 1677:   public SlotDescriptor findSrcScanSlot() {
Is findSrcScanSlot() getting used anywhere else after this change ?


http://gerrit.cloudera.org:8080/#/c/18579/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/18579/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java@588
PS2, Line 588:   SlotRef lhsSlotRef = lhsSrcExpr.unwrapSlotRef(false);
nit: it seems that unwrap is getting called multiple times ..once inside 
findSrcExpr and then here.  Could you add comments here about the reason.


http://gerrit.cloudera.org:8080/#/c/18579/2/fe/src/main/java/org/apache/impala/planner/PlanNode.java@592
PS2, Line 592:   if (lhsSlotRef.getDesc() != null &&
A slot ref would always have non-null slot descriptor.


http://gerrit.cloudera.org:8080/#/c/18579/2/testdata/workloads/functional-query/queries/QueryTest/inline-view.test
File testdata/workloads/functional-query/queries/QueryTest/inline-view.test:

http://gerrit.cloudera.org:8080/#/c/18579/2/testdata/workloads/functional-query/queries/QueryTest/inline-view.test@585
PS2, Line 585: with t as (select 1 a), v as
nit: remove whitespace at the end of this and lines below



--
To view, visit http://gerrit.cloudera.org:8080/18579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1cd4559dda406d2d38703ed594b70b41ed336fd
Gerrit-Change-Number: 18579
Gerrit-PatchSet: 2
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 03 Jun 2022 08:25:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11274: CNF Rewrite causes a regress in join node performance

2022-05-20 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18458 )

Change subject: IMPALA-11274: CNF Rewrite causes a regress in join node 
performance
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18458/13/testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test:

http://gerrit.cloudera.org:8080/#/c/18458/13/testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test@439
PS13, Line 439: # IMPALA-11274: Simple arithmetic expressions should still be 
eligible for CNF
This predicate is already in CNF form so the rule would be a no-op.  Was there 
a specific reason to add this test ? If the intent is to test. CNF for 
predicates with arithmetic exprs, you can add OR predicates along the patterns 
of preceding examples and substitute one or more exprs with arithmetic exprs.



--
To view, visit http://gerrit.cloudera.org:8080/18458
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326406c6b004fe31ec0e2a2f390a3845b8925aa9
Gerrit-Change-Number: 18458
Gerrit-PatchSet: 13
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 20 May 2022 23:18:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11301: Fix extreme != selectivity for NDV=1

2022-05-19 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18543 )

Change subject: IMPALA-11301: Fix extreme != selectivity for NDV=1
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/18543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5334a8d7d6ca46a450ff98ae03e5269faaa3c6
Gerrit-Change-Number: 18543
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 19 May 2022 17:57:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11301: Fix extreme != selectivity for NDV=1

2022-05-19 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18543 )

Change subject: IMPALA-11301: Fix extreme != selectivity for NDV=1
..


Patch Set 3: Code-Review+1

Latest changes lgtm.  Only other thought was about the unit test itself. It 
relies on the current behavior of row_number ndv estimate. Were you thinking of 
adding any other test against a table whose column had ndv=1 ?


--
To view, visit http://gerrit.cloudera.org:8080/18543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5334a8d7d6ca46a450ff98ae03e5269faaa3c6
Gerrit-Change-Number: 18543
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 19 May 2022 16:31:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11301: Fix extreme = and != selectivity for NDV=1

2022-05-18 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18543 )

Change subject: IMPALA-11301: Fix extreme = and != selectivity for NDV=1
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18543/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18543/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@267
PS1, Line 267: if (distinctValues == 1) distinctValues = 2;
> Yeah the NOT col = 1 is a good use case to justify this.   I am good with i
Hmm..not sure what I meant by the last sentence in my comment above.  Pls 
ignore that part.  Looking at the Jenkins test output,  I see several diffs in 
the hash join cardinality estimates for TPC-DS queries. I was worried about 
plan changes but wasn't expecting it for TPC-DS since I doubt there are columns 
with ndv=1.  However, what might be happening is that the ndv  gets adjusted 
based on filtering that was applied earlier in the plan.  A join's cardinality 
estimate formula leverages the ndv coming from the child which may have been 
adjusted (reduced).
We could look at all the plan diffs but it would need some time to validate 
functionally and with a performance run. Alternatively, we could limit the 
scope of this fix to the != predicates.



--
To view, visit http://gerrit.cloudera.org:8080/18543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5334a8d7d6ca46a450ff98ae03e5269faaa3c6
Gerrit-Change-Number: 18543
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 19 May 2022 02:26:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11301: Fix extreme = and != selectivity for NDV=1

2022-05-18 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18543 )

Change subject: IMPALA-11301: Fix extreme = and != selectivity for NDV=1
..


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18543/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18543/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@267
PS1, Line 267: if (distinctValues == 1) distinctValues = 2;
> Yes, and I do think that 0.5 is "more right" than 1.
Yeah the NOT col = 1 is a good use case to justify this.   I am good with it.  
In any case the cardinality will be rounded to the next integer value.


http://gerrit.cloudera.org:8080/#/c/18543/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@273
PS1, Line 273:   if (op_ == Operator.DISTINCT_FROM && rChildIsNull) {
> I have changed the patch to only affect ndv=1.
Thanks for making the change to do an explicit check.  The presence or absence 
of a value is not possible to be handled through NDV alone.  If we had 
Histogram or Most Common Values, those keep track of actual values which would 
help.
Marking this resolved.



--
To view, visit http://gerrit.cloudera.org:8080/18543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5334a8d7d6ca46a450ff98ae03e5269faaa3c6
Gerrit-Change-Number: 18543
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 18 May 2022 22:37:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-11301: Fix = and != selectivity for very low NDVs

2022-05-18 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18543 )

Change subject: WIP: IMPALA-11301: Fix = and != selectivity for very low NDVs
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18543/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18543/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@267
PS1, Line 267:   selectivity_ = 1.0 / (distinctValues + 1);
If NDV == 1,  this would now produce 0.5 instead of 1 which is not right.   I 
suspect this change will change a lot of plans especially for predicates on 
boolean columns (ndv = 2).  IMHO It would be better to restrict the fix to the 
non-equality predicates.


http://gerrit.cloudera.org:8080/#/c/18543/1/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@273
PS1, Line 273: selectivity_ = 1.0 - 1.0 / (distinctValues + 1);
Since the distinctValues == 1 is a special case which causes 0 to be produced 
with the old formula, could we just explicitly check for distinctValues == 1  
and only in that case use the new formula ?  For the other values, old formula 
should still be ok right ?  Just want to see if a minimal change can be done 
since formula changes have to go through much more testing.



--
To view, visit http://gerrit.cloudera.org:8080/18543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5334a8d7d6ca46a450ff98ae03e5269faaa3c6
Gerrit-Change-Number: 18543
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 18 May 2022 17:21:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11274: CNF Rewrite causes a regress in join node performance

2022-05-13 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18458 )

Change subject: IMPALA-11274: CNF Rewrite causes a regress in join node 
performance
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18458/7/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java:

http://gerrit.cloudera.org:8080/#/c/18458/7/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@605
PS7, Line 605:   public void TestFeasibleToConvertToCNF() {
> TestFeasibleToConvertToCNF passes. I'd appreciate thoughts on examples for
I added 2 tests to convert-to-cnf.test. The first one which uses UPPER function 
is expected to not be converted to CNF and it does not .. so this is the 
expected behavior.  However, the second query uses CAST function which I would 
expect to be converted to CNF. But it does not.


# IMPALA-11274: Test with string functions in the disjunctive predicate.
# In this case the predicate is not converted to CNF
select count(*) from lineitem, orders
   where l_orderkey = o_orderkey and
((upper(l_returnflag) = 'Y' and upper(o_orderpriority) = 'HIGH')
  or (upper(l_returnflag) = 'N' and upper(o_orderpriority) = 'LOW'))
and l_partkey > 0;

# IMPALA-11274: Functions like CAST should still be eligible for CNF
select count(*) from lineitem, orders
   where l_orderkey = o_orderkey and
((cast(l_returnflag as varchar(2)) = 'Y' and cast(o_orderpriority as 
varchar(5)) = 'HIGH')
  or (cast(l_returnflag as varchar(2)) = 'N' and cast(o_orderpriority as 
varchar(5)) = 'LOW'))
and l_partkey > 0;



--
To view, visit http://gerrit.cloudera.org:8080/18458
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326406c6b004fe31ec0e2a2f390a3845b8925aa9
Gerrit-Change-Number: 18458
Gerrit-PatchSet: 9
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sat, 14 May 2022 02:13:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11274: CNF Rewrite causes a regress in join node performance

2022-04-29 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18458 )

Change subject: IMPALA-11274: CNF Rewrite causes a regress in join node 
performance
..


Patch Set 7:

(3 comments)

Changes look good overall. A few comments below.

http://gerrit.cloudera.org:8080/#/c/18458/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18458/7//COMMIT_MSG@26
PS7, Line 26: norm
nit: normal


http://gerrit.cloudera.org:8080/#/c/18458/7/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java:

http://gerrit.cloudera.org:8080/#/c/18458/7/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@102
PS7, Line 102:   LOG.info("It is not feasible to rewrite predidcate " + 
pred.toSql() + " to CNF.");
nit: at info level this may create lots of log entries since the rules are 
applied iteratively whenever any other transformation occurs.  Would be better 
to log at debug level.


http://gerrit.cloudera.org:8080/#/c/18458/7/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java:

http://gerrit.cloudera.org:8080/#/c/18458/7/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@605
PS7, Line 605:   public void TestFeasibleToConvertToCNF() {
These tests don't have OR predicates.  One option is to add here but we should 
also add an actual PlannerTest with an Explain plan that has a string function 
that previously would have gotten converted through CNF and now is prohibited.  
I can help create one test case if you don't get around to it.  It could be 
added in 
testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test 
(it used tpc-h schema).



--
To view, visit http://gerrit.cloudera.org:8080/18458
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326406c6b004fe31ec0e2a2f390a3845b8925aa9
Gerrit-Change-Number: 18458
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 29 Apr 2022 19:26:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11247: Test script changes for materialized views

2022-04-17 Thread Aman Sinha (Code Review)
Aman Sinha has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/18421 )

Change subject: IMPALA-11247: Test script changes for materialized views
..

IMPALA-11247: Test script changes for materialized views

IMPALA-10723 added support for treating materialized views as tables.
In certain test configurations, the rebuild of the materialized views
(which is done via Hive) was not populating the data in the MV. In
this patch, I have changed the source tables of materialized views
to be full-acid instead of insert-only transactional tables. This
enables the tests to succeed. Insert-only source tables are also
meant to work for the MV rebuild but that is a Hive issue that will
be investigated separately.

Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
Reviewed-on: http://gerrit.cloudera.org:8080/18421
Reviewed-by: Aman Sinha 
Tested-by: Aman Sinha 
---
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
4 files changed, 12 insertions(+), 58 deletions(-)

Approvals:
  Aman Sinha: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/18421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
Gerrit-Change-Number: 18421
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-11247: Test script changes for materialized views

2022-04-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18421 )

Change subject: IMPALA-11247: Test script changes for materialized views
..


Patch Set 7: Verified+1 Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/18421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
Gerrit-Change-Number: 18421
Gerrit-PatchSet: 7
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 17 Apr 2022 22:36:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11247: Test script changes for materialized views

2022-04-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18421 )

Change subject: IMPALA-11247: Test script changes for materialized views
..


Patch Set 6: Verified+1 Code-Review+2

Carry forward Quanlong's +2.
Marking Verified since the one test failure is a known (flaky ?) issue 
IMPALA-11160.


--
To view, visit http://gerrit.cloudera.org:8080/18421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
Gerrit-Change-Number: 18421
Gerrit-PatchSet: 6
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 17 Apr 2022 21:56:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11247: Test script changes for materialized views

2022-04-17 Thread Aman Sinha (Code Review)
Hello Quanlong Huang, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/18421

to look at the new patch set (#6).

Change subject: IMPALA-11247: Test script changes for materialized views
..

IMPALA-11247: Test script changes for materialized views

IMPALA-10723 added support for treating materialized views as tables.
In certain test configurations, the rebuild of the materialized views
(which is done via Hive) was not populating the data in the MV. In
this patch, I have changed the source tables of materialized views
to be full-acid instead of insert-only transactional tables. This
enables the tests to succeed. Insert-only source tables are also
meant to work for the MV rebuild but that is a Hive issue that will
be investigated separately.

Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
---
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
4 files changed, 12 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/18421/6
--
To view, visit http://gerrit.cloudera.org:8080/18421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
Gerrit-Change-Number: 18421
Gerrit-PatchSet: 6
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-11247: Test script changes for materialized views

2022-04-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18421 )

Change subject: IMPALA-11247: Test script changes for materialized views
..


Patch Set 5:

> Patch Set 5: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8052/

1 test failure in query_test.test_acid.TestAcid.test_acid_compute_stats which 
is a flaky test previously reported in 
https://issues.apache.org/jira/browse/IMPALA-11160.

One small change still needs to be done: the compute-table-stats.sh still has 
alltypes_transactional and jointbl_transactional which were removed in the last 
patch set.  I will update the compute-table-stats.sh


--
To view, visit http://gerrit.cloudera.org:8080/18421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
Gerrit-Change-Number: 18421
Gerrit-PatchSet: 5
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 17 Apr 2022 21:09:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11247: Test script changes for materialized views

2022-04-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18421 )

Change subject: IMPALA-11247: Test script changes for materialized views
..


Patch Set 4:

> Patch Set 3: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8051/

1 test failed in this because I had missed updating the 
AuthorizationStmtTest.java after removing the alltypes_transactional source 
table.  This is fixed in patchset 4.


--
To view, visit http://gerrit.cloudera.org:8080/18421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
Gerrit-Change-Number: 18421
Gerrit-PatchSet: 4
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 17 Apr 2022 14:51:56 +
Gerrit-HasComments: No


  1   2   3   4   5   6   7   >