[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938633#comment-16938633
 ] 

ASF GitHub Bot commented on DRILL-7380:
---------------------------------------

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328619538
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java
 ##########
 @@ -111,20 +110,37 @@ public DrillParquetReader(FragmentContext fragmentContext,
     this.numRecordsToRead = initNumRecordsToRead(recordsToRead, entry.getRowGroupIndex(), footer);
   }
 
+  /**
+   * Creates projection MessageType from projection columns and given schema.
+   *
+   * @param schema Parquet file schema
+   * @param projectionColumns columns to search
+   * @param columnsNotFound any projection column which wasn't found in schema is added to the list
+   * @return projection containing matched columns or null if none column matches schema
+   */
   private static MessageType getProjection(MessageType schema,
-                                           Collection<SchemaPath> columns,
+                                           Collection<SchemaPath> projectionColumns,
                                            List<SchemaPath> columnsNotFound) {
-    MessageType projection = null;
-
-    String messageName = schema.getName();
-    List<ColumnDescriptor> schemaColumns = schema.getColumns();
-    // parquet type.union() seems to lose ConvertedType info when merging two columns that are the same type. This can
-    // happen when selecting two elements from an array. So to work around this, we use set of SchemaPath to avoid duplicates
-    // and then merge the types at the end
-    Set<SchemaPath> selectedSchemaPaths = new LinkedHashSet<>();
+    projectionColumns = adaptColumnsToParquetSchema(projectionColumns, schema);
+    List<SchemaPath> schemaColumns = getAllColumnsFrom(schema);
+    Set<SchemaPath> selectedSchemaPaths = matchProjectionWithSchemaColumns(projectionColumns, schemaColumns, columnsNotFound);
+    MessageType projection = convertSelectedColumnsToMessageType(schema, selectedSchemaPaths);
+    return projection;
 
 Review comment:
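   nit: the `projection` variable can be avoided by returning the result of `convertSelectedColumnsToMessageType(schema, selectedSchemaPaths)` directly.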
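
   A minimal sketch of what returning the result directly might look like. It is not the actual Drill code: `String` and `Set<String>` stand in for `MessageType` and `SchemaPath`, and the class `ProjectionSketch` and helper `convertSelectedColumns` are hypothetical names used only for illustration.

```java
import java.util.Set;

// Hypothetical, simplified stand-in for the method under review.
// String replaces MessageType and Set<String> replaces Set<SchemaPath>.
final class ProjectionSketch {

  // Assumed stand-in for convertSelectedColumnsToMessageType:
  // returns null when nothing was selected, mirroring the Javadoc contract.
  static String convertSelectedColumns(String schemaName, Set<String> selected) {
    return selected.isEmpty() ? null : schemaName + selected;
  }

  // The result of the last step is returned directly,
  // without first assigning it to a local `projection` variable.
  static String getProjection(String schemaName, Set<String> selected) {
    return convertSelectedColumns(schemaName, selected);
  }
}
```

   Behavior is unchanged; the method is only one line shorter.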
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Query of a field inside of an array of structs returns null
> -----------------------------------------------------------
>
>                 Key: DRILL-7380
>                 URL: https://issues.apache.org/jira/browse/DRILL-7380
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.17.0
>            Reporter: Anton Gozhiy
>            Assignee: Igor Guzenko
>            Priority: Major
>         Attachments: customer_complex.zip
>
>
> *Query:*
> {code:sql}
> select t.c_orders[0].o_orderstatus from hive.customer_complex t limit 10;
> {code}
> *Expected results (as returned by Hive):*
> {noformat}
> OK
> O
> F
> NULL
> O
> O
> NULL
> O
> O
> NULL
> F
> {noformat}
> *Actual results:*
> {noformat}
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
