Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/18596 )

Change subject: IMPALA-9496: Allow struct type in the select list for Parquet tables
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18596/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18596/3//COMMIT_MSG@23
PS3, Line 23: ple ReadValue() on these readers instead of the
            :     batched version. The reason is that calling the batched reader in
            :     the member column readers would in fact read in batches, but it
            :     won't handle the case when the parent struct i
> A relatively simple optimization that comes to mind is to use only the firs
Well, I'm not completely in favor of the proposed approach. It would be hard
for a later reader to understand why only one scalar member of a struct is
added as a child of the struct while the others aren't. The nested struct
use case would make this even more complicated.

I opened a Jira with an alternative suggestion for this and added the Jira ID
to the commit msg.


http://gerrit.cloudera.org:8080/#/c/18596/3/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18596/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2235
PS3, Line 2235: HasStructColumnReader
> This means that if there are struct scanners, then we never use late materi
Added this to the commit msg.


http://gerrit.cloudera.org:8080/#/c/18596/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18596/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1014
PS3, Line 1014:     // of a struct, where the struct is also given in the select list then skip dictionary
> typo
Done


http://gerrit.cloudera.org:8080/#/c/18596/1/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

http://gerrit.cloudera.org:8080/#/c/18596/1/tests/query_test/test_nested_types.py@153
PS1, Line 153: s
> flake8: E265 block comment should start with '# '
Note to self: drop this commented-out line.



--
To view, visit http://gerrit.cloudera.org:8080/18596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e8b4cbc2c4d1dd5fbefb7c87dad8d4e6ac2f452
Gerrit-Change-Number: 18596
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Jun 2022 09:59:58 +0000
Gerrit-HasComments: Yes
