Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/18596 )

Change subject: IMPALA-9496: Allow struct type in the select list for Parquet tables
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18596/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18596/3//COMMIT_MSG@23
PS3, Line 23: ple ReadValue() on these readers instead of the
: batched version. The reason is that calling the batched reader in
: the member column readers would in fact read in batches, but it
: won't handle the case when the parent struct i
> A relatively simple optimization that comes to mind is to use only the firs

Well, I'm not completely in favor of the proposed approach. It would be hard for a later reader to understand why only one scalar member of a struct is added as a child of the struct while the others aren't. The nested struct use case would make this even more complicated.
I opened a Jira with another suggestion for this and added the Jira ID to the commit msg.


http://gerrit.cloudera.org:8080/#/c/18596/3/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18596/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2235
PS3, Line 2235: HasStructColumnReader
> This means that if there are struct scanners, then we never use late materi

Added this to the commit msg.


http://gerrit.cloudera.org:8080/#/c/18596/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18596/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1014
PS3, Line 1014: // of a struct, where the struct is also given in the select list then skip dictionary
> typo

Done


http://gerrit.cloudera.org:8080/#/c/18596/1/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

http://gerrit.cloudera.org:8080/#/c/18596/1/tests/query_test/test_nested_types.py@153
PS1, Line 153: s
> flake8: E265 block comment should start with '# '

Note for myself: drop this commented line



--
To view, visit http://gerrit.cloudera.org:8080/18596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3e8b4cbc2c4d1dd5fbefb7c87dad8d4e6ac2f452
Gerrit-Change-Number: 18596
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Jun 2022 09:59:58 +0000
Gerrit-HasComments: Yes