Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18596 )

Change subject: IMPALA-9496: Allow struct type in the select list for Parquet tables
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18596/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18596/3//COMMIT_MSG@23
PS3, Line 23: The reason is that calling the batched reader in
            :     the member column readers would in fact read in batches, but it
            :     won't handle the case when the parent struct is NULL and would set
            :     only itself to NULL but not the parent struct.
A relatively simple optimization that comes to mind is to use only the first
child of a struct in the struct reader and treat the other children as normal
scalar column readers. As far as I can see, StructColumnReader::NextLevels /
StructColumnReader::ReadValue already use only the first child's def level to
set the nullness of the struct.

The children to use in the struct reader could be selected like this:
- if the struct has only scalar children, then only the first child would be
  kept
- if the struct also has struct children, then all struct children but no
  scalar children would be kept
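
To make the selection rule above concrete, here is a minimal standalone sketch.
It is not Impala code: ChildInfo and SelectStructReaderChildren are
hypothetical names invented for illustration, and the real column reader
classes carry much more state. It only shows which child indexes would stay
under the struct reader, assuming every other child is handed to the normal
scalar readers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified description of one child of a struct column.
// This is an illustration only, not a class from the Impala code base.
struct ChildInfo {
  std::string name;
  bool is_struct;
};

// Returns the indexes of the children that would be kept in the struct reader.
// Assumption: all children not returned here are read as plain scalar columns.
std::vector<std::size_t> SelectStructReaderChildren(
    const std::vector<ChildInfo>& children) {
  std::vector<std::size_t> kept;
  // Rule 2: if there are struct children, keep all of them and no scalars.
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (children[i].is_struct) kept.push_back(i);
  }
  // Rule 1: with only scalar children, keep just the first one, so that its
  // def level can still be used to set the nullness of the struct.
  if (kept.empty() && !children.empty()) kept.push_back(0);
  return kept;
}
```

For a struct with children (a: scalar, s1: struct, b: scalar, s2: struct) this
keeps indexes 1 and 3; for a struct with only scalar children it keeps index 0.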


http://gerrit.cloudera.org:8080/#/c/18596/3/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18596/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2235
PS3, Line 2235: HasStructColumnReader
This means that if there are struct scanners, then we never use late 
materialization, right? This could be mentioned among the limitations in the 
commit message.


http://gerrit.cloudera.org:8080/#/c/18596/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18596/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1014
PS3, Line 1014:     // o a struct, where the struct is also given in the select list then skip dictionary
typo



--
To view, visit http://gerrit.cloudera.org:8080/18596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e8b4cbc2c4d1dd5fbefb7c87dad8d4e6ac2f452
Gerrit-Change-Number: 18596
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jun 2022 14:28:05 +0000
Gerrit-HasComments: Yes
