Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17847 )

Change subject: IMPALA-10838: Error when struct returned from WITH()
......................................................................


Patch Set 18:

(20 comments)

Nice patch! :)

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1415
PS17, Line 1415:     List<Type> ancestorTypes = matchedTypes.subList(0, 
matchedTypes.size() - 1);
No need to create a sublist here, just make a for loop that skips the last 
item. This way you can get rid of both the local Lists introduced here.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1438
PS17, Line 1438:    * However, if the order is reversed, we are in a more 
difficult situation. If we try to
Can't you get rid og this hassle if you changed the processing order of the 
expressions? E.g. always process structs first.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1462
PS17, Line 1462:       int highestAncestorDistance) throws AnalysisException {
Could you please describe the purpose of the function parameters in the comment 
above?


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1525
PS17, Line 1525:     structTuple.setParentSlotDesc(desc);
nit: this could go to the top of this function so that a user could more easily 
see that we accept structs as parameter here.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java:

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@154
PS17, Line 154:     // If we are inside a struct, we need to materialise all 
enclosing struct
nit: I think this comment would be better at the callsite.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@22
PS17, Line 22: import java.util.stream.Collectors;
This is not needed anymore, right?


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
File fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java:

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@336
PS17, Line 336:       // createAuxPredicatesForStructMembers(analyzer, 
(SlotRef) colExpr);
Honestly, I don't know much about aux predicates so won't be able to answer 
this.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@286
PS17, Line 286:       preregisterStructSlotRefPathsWithAnalyzer();
I have asked this question in another comment but let me duplicate it: Instead 
of pre-registering the struct SlotRefs here, wouldn't re-ordering the select 
list items help? E.g. Putting everything that involves a struct to the 
beginning of the select list.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@327
PS17, Line 327:       // Preregister select list paths that point to structs. 
This will be used to unify
nit: This should be a function comment in above the function signature.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@351
PS17, Line 351: List
Could this be a HashSet to avoid duplicates?


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@354
PS17, Line 354:       List<Path> paths = 
slotRefs.stream().map(this::slotRefToResolvedPath)
I observed you really don't like for loops and ifs :D


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@361
PS17, Line 361:     private Path slotRefToResolvedPath(SlotRef slotRef) {
I think this should be a part of SlotRef instead of SelectStmt. The whole 
content of this function reminds me the beginning of SlotRef.analyzeImpl() 
anyway.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@361
PS17, Line 361: slotRefToResolvedPath
The naming should be something different to emphasise that this is not just a 
get method but also resolves the rawPath.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@363
PS17, Line 363:         Path resolvedPath = 
analyzer_.resolvePathWithMasking(slotRef.getRawPath(),
If you do this resolvePathWithMasking() call here would the very same be still 
called when analyzing SlotRefs in the select list in analyzeSelectClause()?


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@370
PS17, Line 370:       } catch (AnalysisException e) {
Shouldn't we just let the AnalysisException be re-thrown from this function? Is 
there a case when we just want to swallow the exception and deliberately 
continue the execution?


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
File fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java:

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java@204
PS17, Line 204:       final TupleDescriptor structTupleDesc = 
parentStructSlotDesc != null ?
No need to introduce 'structTupleDesc' as its only purpose is to be assigned to 
'tupleDesc'. You could directly put these to 'tupleDesc'.


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@266
PS17, Line 266:   public List<String> getRawPath() {
nit: fits in one line


http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/17847/17/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@779
PS17, Line 779:   public void testOnlyNeededStructFieldsMaterialized() throws 
ImpalaException {
With this testing approach we wouldn't verify the exact size of a row just that 
the size of a row in one query matches the size of a row in another query. But 
we know the size of the structs under test here so we could be more specific 
with the asserts, right?
e.g. Assume we query an int struct field in a regular select and using a WITH, 
and with this patch both would take let's say 1234567bytes for an integer 
instead of 4. This test would still succeed, right?


http://gerrit.cloudera.org:8080/#/c/17847/17/testdata/workloads/functional-query/queries/QueryTest/nested-struct-in-select-list.test
File 
testdata/workloads/functional-query/queries/QueryTest/nested-struct-in-select-list.test:

http://gerrit.cloudera.org:8080/#/c/17847/17/testdata/workloads/functional-query/queries/QueryTest/nested-struct-in-select-list.test@283
PS17, Line 283: ---- QUERY
The below 2 tests would be also good candidates to be checked for the row size 
as well in PlannerTest.


http://gerrit.cloudera.org:8080/#/c/17847/17/testdata/workloads/functional-query/queries/QueryTest/nested-struct-in-select-list.test@334
PS17, Line 334: 'NULL','NULL'
Could you please also add some tests where we select a struct and some of its 
fields (or do a WHERE on the fields) where the struct is inside a collection?



--
To view, visit http://gerrit.cloudera.org:8080/17847
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb9233677355b85d424cc3f22b00b5a3bf61c57
Gerrit-Change-Number: 17847
Gerrit-PatchSet: 18
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Thu, 10 Mar 2022 15:24:15 +0000
Gerrit-HasComments: Yes

Reply via email to