[jira] [Commented] (DRILL-7170) IllegalStateException: Record count not set for this vector container
[ https://issues.apache.org/jira/browse/DRILL-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938969#comment-16938969 ]

ASF GitHub Bot commented on DRILL-7170:
---

Ben-Zvi commented on pull request #1859: DRILL-7170: Ignore uninitialized vector containers for OOM error messages
URL: https://github.com/apache/drill/pull/1859

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> IllegalStateException: Record count not set for this vector container
> ---
>
> Key: DRILL-7170
> URL: https://issues.apache.org/jira/browse/DRILL-7170
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Reporter: Sorabh Hamirwasia
> Priority: Major
> Fix For: 1.17.0
>
> {code:java}
> Query:
> /root/drillAutomation/master/framework/resources/Advanced/tpcds/tpcds_sf1/original/maprdb/json/query95.sql
> WITH ws_wh AS
> (
>   SELECT ws1.ws_order_number,
>          ws1.ws_warehouse_sk wh1,
>          ws2.ws_warehouse_sk wh2
>   FROM web_sales ws1,
>        web_sales ws2
>   WHERE ws1.ws_order_number = ws2.ws_order_number
>   AND ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)
> SELECT
>   Count(DISTINCT ws_order_number) AS `order count` ,
>   Sum(ws_ext_ship_cost) AS `total shipping cost` ,
>   Sum(ws_net_profit) AS `total net profit`
> FROM web_sales ws1 ,
>      date_dim ,
>      customer_address ,
>      web_site
> WHERE d_date BETWEEN '2000-04-01' AND (
>       Cast('2000-04-01' AS DATE) + INTERVAL '60' day)
> AND ws1.ws_ship_date_sk = d_date_sk
> AND ws1.ws_ship_addr_sk = ca_address_sk
> AND ca_state = 'IN'
> AND ws1.ws_web_site_sk = web_site_sk
> AND web_company_name = 'pri'
> AND ws1.ws_order_number IN
>   (
>     SELECT ws_order_number
>     FROM ws_wh)
> AND ws1.ws_order_number IN
>   (
>     SELECT wr_order_number
>     FROM web_returns,
>          ws_wh
>     WHERE wr_order_number = ws_wh.ws_order_number)
> ORDER BY count(DISTINCT ws_order_number)
> LIMIT 100
>
> Exception:
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Record count not set for this vector container
> Fragment 2:3
> Please, refer to logs for more information.
> [Error Id: 4ed92fce-505b-40ba-ac0e-4a302c28df47 on drill87:31010]
> (java.lang.IllegalStateException) Record count not set for this vector container
>   org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState():459
>   org.apache.drill.exec.record.VectorContainer.getRecordCount():394
>   org.apache.drill.exec.record.RecordBatchSizer.<init>():720
>   org.apache.drill.exec.record.RecordBatchSizer.<init>():704
>   org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.getActualSize():462
>   org.apache.drill.exec.physical.impl.common.HashTableTemplate.getActualSize():964
>   org.apache.drill.exec.physical.impl.common.HashTableTemplate.makeDebugString():973
>   org.apache.drill.exec.physical.impl.common.HashPartition.makeDebugString():601
>   org.apache.drill.exec.physical.impl.join.HashJoinBatch.makeDebugString():1313
>   org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase():1105
>   org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():525
>   org.apache.drill.exec.record.AbstractRecordBatch.next():186
>   org.apache.drill.exec.record.AbstractRecordBatch.next():126
>   org.apache.drill.exec.record.AbstractRecordBatch.next():116
>   org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
>   org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
>   org.apache.drill.exec.record.AbstractRecordBatch.next():186
>   org.apache.drill.exec.record.AbstractRecordBatch.next():126
>   org.apache.drill.exec.test.generated.HashAggregatorGen1068899.doWork():642
>   org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():296
>   org.apache.drill.exec.record.AbstractRecordBatch.next():186
>   org.apache.drill.exec.record.AbstractRecordBatch.next():126
>   org.apache.drill.exec.record.AbstractRecordBatch.next():116
>   org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
>   org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
>   org.apache.drill.exec.record.AbstractRecordBatch.next():186
>   org.apache.drill.exec.physical.impl.BaseRootExec.next():104
>   org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
>   org.apache.drill.exec.physical.impl.BaseRootExec.next():94
>
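The trace above shows the exception being raised while a debug string is built (`makeDebugString()` reaches `getRecordCount()` through the size calculation), and the pull request title describes the remedy as ignoring uninitialized vector containers. The following is only a minimal sketch of that idea, not the code from the pull request: `SketchContainer`, `hasRecordCount()`, `sizeInBytes()` and `debugSize()` are hypothetical names made up for illustration and are not claimed to match Drill's classes.

```java
import java.util.Arrays;
import java.util.List;

final class DebugSizeSketch {

  /** Hypothetical stand-in for a vector container; this is not Drill's VectorContainer. */
  static final class SketchContainer {
    private static final int NOT_SET = -1;
    private final int bytesPerRecord;
    private int recordCount = NOT_SET;

    SketchContainer(int bytesPerRecord) {
      this.bytesPerRecord = bytesPerRecord;
    }

    void setRecordCount(int recordCount) {
      this.recordCount = recordCount;
    }

    boolean hasRecordCount() {
      return recordCount != NOT_SET;
    }

    /** Throws when the count was never set, like the checkState() frame in the trace. */
    int getRecordCount() {
      if (!hasRecordCount()) {
        throw new IllegalStateException("Record count not set for this vector container");
      }
      return recordCount;
    }

    long sizeInBytes() {
      return (long) getRecordCount() * bytesPerRecord;
    }
  }

  /** Sums sizes for a debug message, skipping containers whose record count is not set. */
  static long debugSize(List<SketchContainer> containers) {
    long total = 0;
    for (SketchContainer container : containers) {
      if (container.hasRecordCount()) {
        total += container.sizeInBytes();
      }
    }
    return total;
  }

  public static void main(String[] args) {
    SketchContainer filled = new SketchContainer(8);
    filled.setRecordCount(10);
    SketchContainer uninitialized = new SketchContainer(8);
    // Only the filled container contributes: 10 records * 8 bytes.
    System.out.println(debugSize(Arrays.asList(filled, uninitialized)));
  }
}
```

The point of the guard is that an error-reporting path should not itself throw; a container that was never filled simply contributes nothing to the reported size.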
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938758#comment-16938758 ]

ASF GitHub Bot commented on DRILL-7380:
---

ihuzenko commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861

> Query of a field inside of an array of structs returns null
> ---
>
> Key: DRILL-7380
> URL: https://issues.apache.org/jira/browse/DRILL-7380
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.17.0
> Reporter: Anton Gozhiy
> Assignee: Igor Guzenko
> Priority: Major
> Labels: ready-to-commit
> Attachments: customer_complex.zip
>
> *Query:*
> {code:sql}
> select t.c_orders[0].o_orderstatus from hive.customer_complex t limit 10;
> {code}
> *Expected results (given from Hive):*
> {noformat}
> OK
> O
> F
> NULL
> O
> O
> NULL
> O
> O
> NULL
> F
> {noformat}
> *Actual results:*
> {noformat}
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> {noformat}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (DRILL-7381) Query to a map field returns nulls with hive native reader
[ https://issues.apache.org/jira/browse/DRILL-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Guzenko updated DRILL-7381:
    Labels: ready-to-commit (was: )

> Query to a map field returns nulls with hive native reader
> ---
>
> Key: DRILL-7381
> URL: https://issues.apache.org/jira/browse/DRILL-7381
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.17.0
> Reporter: Anton Gozhiy
> Assignee: Igor Guzenko
> Priority: Major
> Labels: ready-to-commit
> Attachments: customer_complex.zip
>
> *Query:*
> {code:sql}
> select t.c_nation.n_region.r_name from hive.customer_complex t limit 5
> {code}
> *Expected results:*
> {noformat}
> AFRICA
> MIDDLE EAST
> AMERICA
> MIDDLE EAST
> AMERICA
> {noformat}
> *Actual results:*
> {noformat}
> null
> null
> null
> null
> null
> {noformat}
> *Workaround:*
> {code:sql}
> set store.hive.optimize_scan_with_native_readers = false;
> {code}
[jira] [Updated] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Guzenko updated DRILL-7380:
    Labels: ready-to-commit (was: )
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938671#comment-16938671 ]

ASF GitHub Bot commented on DRILL-7380:
---

ihuzenko commented on issue #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#issuecomment-535528310

@KazydubB, I've addressed comments, please check again.
[jira] [Updated] (DRILL-7174) Expose complex to Json control in the Drill C++ Client
[ https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Vysotskyi updated DRILL-7174:
---
    Labels: ready-to-commit (was: )

> Expose complex to Json control in the Drill C++ Client
> ---
>
> Key: DRILL-7174
> URL: https://issues.apache.org/jira/browse/DRILL-7174
> Project: Apache Drill
> Issue Type: Task
> Reporter: Rob Wu
> Priority: Minor
> Labels: ready-to-commit
> Fix For: 1.17.0
>
> Arjun Gupta will be supplying a patch for this
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938652#comment-16938652 ]

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328628279

## File path: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ##

@@ -48,13 +48,20 @@
   private final NameSegment rootSegment;

   public SchemaPath(SchemaPath path) {
-    super(path.getPosition());
-    this.rootSegment = path.rootSegment;
+    this(path.rootSegment, path.getPosition());
   }

   public SchemaPath(NameSegment rootSegment) {
-    super(ExpressionPosition.UNKNOWN);
-    this.rootSegment = rootSegment;
+    this(rootSegment, ExpressionPosition.UNKNOWN);
+  }
+
+  /**
+   * @deprecated Use {@link #SchemaPath(NameSegment)}
+   * or {@link #SchemaPath(NameSegment, ExpressionPosition)} instead
+   */
+  @Deprecated

Review comment:
Oops, missed it. Leave it as is, then.
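The `SchemaPath` diff quoted in this review thread replaces repeated `super(...)` calls and field assignments with delegation to a single constructor via `this(...)`. As a rough illustration of that pattern only, here is a small sketch; `PathSketch` and its `String` fields are hypothetical stand-ins and do not reproduce `SchemaPath`, `NameSegment` or `ExpressionPosition`.

```java
final class PathSketch {
  static final String UNKNOWN_POSITION = "UNKNOWN";

  private final String rootSegment;
  private final String position;

  /** Canonical constructor: the only place where the fields are assigned. */
  PathSketch(String rootSegment, String position) {
    this.rootSegment = rootSegment;
    this.position = position;
  }

  /** Delegates with a default position instead of repeating the assignments. */
  PathSketch(String rootSegment) {
    this(rootSegment, UNKNOWN_POSITION);
  }

  /** Copy constructor, also delegating to the canonical constructor. */
  PathSketch(PathSketch other) {
    this(other.rootSegment, other.position);
  }

  String rootSegment() {
    return rootSegment;
  }

  String position() {
    return position;
  }

  public static void main(String[] args) {
    PathSketch original = new PathSketch("c_orders");
    PathSketch copy = new PathSketch(original);
    System.out.println(copy.rootSegment() + " @ " + copy.position());
  }
}
```

With one assigning constructor, a later change to initialization (validation, an extra field) has to be made in one place only.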
[jira] [Commented] (DRILL-7174) Expose complex to Json control in the Drill C++ Client
[ https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938651#comment-16938651 ]

ASF GitHub Bot commented on DRILL-7174:
---

agozhiy commented on issue #1814: DRILL-7174: Expose complex to Json control in the Drill C++ Client
URL: https://github.com/apache/drill/pull/1814#issuecomment-535513377

@vvysotskyi, I built the native client and ran tests, everything works ok.
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938637#comment-16938637 ]

ASF GitHub Bot commented on DRILL-7380:
---

ihuzenko commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328625157

## File path: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ##

@@ -48,13 +48,20 @@
   private final NameSegment rootSegment;

   public SchemaPath(SchemaPath path) {
-    super(path.getPosition());
-    this.rootSegment = path.rootSegment;
+    this(path.rootSegment, path.getPosition());
   }

   public SchemaPath(NameSegment rootSegment) {
-    super(ExpressionPosition.UNKNOWN);
-    this.rootSegment = rootSegment;
+    this(rootSegment, ExpressionPosition.UNKNOWN);
+  }
+
+  /**
+   * @deprecated Use {@link #SchemaPath(NameSegment)}
+   * or {@link #SchemaPath(NameSegment, ExpressionPosition)} instead
+   */
+  @Deprecated

Review comment:
Was already deprecated at line 101 on the left side.
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938631#comment-16938631 ]

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r327965662

## File path: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ##

@@ -48,13 +48,20 @@
   private final NameSegment rootSegment;

   public SchemaPath(SchemaPath path) {
-    super(path.getPosition());
-    this.rootSegment = path.rootSegment;
+    this(path.rootSegment, path.getPosition());
   }

   public SchemaPath(NameSegment rootSegment) {
-    super(ExpressionPosition.UNKNOWN);
-    this.rootSegment = rootSegment;
+    this(rootSegment, ExpressionPosition.UNKNOWN);
+  }
+
+  /**
+   * @deprecated Use {@link #SchemaPath(NameSegment)}
+   * or {@link #SchemaPath(NameSegment, ExpressionPosition)} instead
+   */
+  @Deprecated

Review comment:
Why deprecate? It's rather a convenient way to create `SchemaPath`.
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938629#comment-16938629 ]

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328614853

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java ##

@@ -233,28 +288,26 @@ public void setup(OperatorContext context, OutputMutator output) throws Executio
     this.operatorContext = context;
     schema = footer.getFileMetaData().getSchema();
     MessageType projection;
+    final List columnsNotFound = new ArrayList<>(getColumns().size());

     if (isStarQuery()) {
       projection = schema;
     } else {
-      columnsNotFound = new ArrayList<>();
       projection = getProjection(schema, getColumns(), columnsNotFound);
       if (projection == null) {
         projection = schema;
       }
-      if (columnsNotFound != null && columnsNotFound.size() > 0) {
-        nullFilledVectors = new ArrayList<>();
-        for (SchemaPath col: columnsNotFound) {
+      if (columnsNotFound.size() > 0) {
+        nullFilledVectors = new ArrayList<>(columnsNotFound.size());
+        for (SchemaPath col : columnsNotFound) {
           // col.toExpr() is used here as field name since we don't want to see these fields in the existing maps
           nullFilledVectors.add(
-            (NullableIntVector) output.addField(MaterializedField.create(col.toExpr(),
-                    org.apache.drill.common.types.Types.optional(TypeProtos.MinorType.INT)),
-              (Class) TypeHelper.getValueVectorClass(TypeProtos.MinorType.INT,
-                TypeProtos.DataMode.OPTIONAL)));
-        }
-        if (columnsNotFound.size() == getColumns().size()) {
-          noColumnsFound = true;
+              (NullableIntVector) output.addField(MaterializedField.create(col.toExpr(),
+                  org.apache.drill.common.types.Types.optional(TypeProtos.MinorType.INT)),
+                  (Class) TypeHelper.getValueVectorClass(TypeProtos.MinorType.INT,

Review comment:
This `TypeHelper.getValueVectorClass(...)` may be changed to `NullableIntVector.class`.
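The review comment above suggests passing a class literal instead of looking the class up and casting it. The sketch below only illustrates the general Java trade-off under made-up names: `SketchVector`, `NullableIntSketchVector`, `lookup()` and `addField()` are hypothetical and are not Drill's `TypeHelper`, `NullableIntVector` or `OutputMutator` APIs.

```java
import java.util.HashMap;
import java.util.Map;

final class ClassLiteralSketch {

  /** Hypothetical vector types; stand-ins, not Drill's value vectors. */
  interface SketchVector {}

  static final class NullableIntSketchVector implements SketchVector {}

  /** Hypothetical registry keyed by "TYPE:MODE", playing the role of a type helper. */
  private static final Map<String, Class<? extends SketchVector>> REGISTRY = new HashMap<>();

  static {
    REGISTRY.put("INT:OPTIONAL", NullableIntSketchVector.class);
  }

  /** Runtime lookup: the static type of the result is only a wildcard Class. */
  static Class<? extends SketchVector> lookup(String minorType, String dataMode) {
    return REGISTRY.get(minorType + ":" + dataMode);
  }

  /** Generic factory: the Class token fixes the return type for the caller. */
  static <T extends SketchVector> T addField(Class<T> vectorClass)
      throws ReflectiveOperationException {
    return vectorClass.getDeclaredConstructor().newInstance();
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    // Lookup form: the wildcard type forces an unchecked cast at the call site.
    @SuppressWarnings("unchecked")
    Class<NullableIntSketchVector> viaLookup =
        (Class<NullableIntSketchVector>) lookup("INT", "OPTIONAL");

    // Class-literal form: already precisely typed, no cast and no lookup.
    Class<NullableIntSketchVector> viaLiteral = NullableIntSketchVector.class;

    System.out.println(viaLookup == viaLiteral);
    NullableIntSketchVector vector = addField(viaLiteral);
    System.out.println(vector.getClass().getSimpleName());
  }
}
```

When the concrete type is fixed at the call site, as it is for the hard-coded INT/OPTIONAL pair in the diff, the literal states that directly and the compiler checks it.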
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938632#comment-16938632 ]

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328618759

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java ##

@@ -111,20 +110,37 @@ public DrillParquetReader(FragmentContext fragmentContext,
     this.numRecordsToRead = initNumRecordsToRead(recordsToRead, entry.getRowGroupIndex(), footer);
   }

+  /**
+   * Creates projection MessageType from projection columns and given schema.
+   *
+   * @param schema Parquet file schema
+   * @param projectionColumns columns to search
+   * @param columnsNotFound any projection column which wasn't found in schema is added to the list
+   * @return projection containing matched columns or null if none column matches schema
+   */
   private static MessageType getProjection(MessageType schema,
-                                           Collection columns,
+                                           Collection projectionColumns,
                                            List columnsNotFound) {
-    MessageType projection = null;
-
-    String messageName = schema.getName();
-    List schemaColumns = schema.getColumns();
-    // parquet type.union() seems to lose ConvertedType info when merging two columns that are the same type. This can
-    // happen when selecting two elements from an array. So to work around this, we use set of SchemaPath to avoid duplicates
-    // and then merge the types at the end
-    Set selectedSchemaPaths = new LinkedHashSet<>();
+    projectionColumns = adaptColumnsToParquetSchema(projectionColumns, schema);
+    List schemaColumns = getAllColumnsFrom(schema);
+    Set selectedSchemaPaths = matchProjectionWithSchemaColumns(projectionColumns, schemaColumns, columnsNotFound);
+    MessageType projection = convertSelectedColumnsToMessageType(schema, selectedSchemaPaths);
+    return projection;
+  }

-    // get a list of modified columns which have the array elements removed from the schema path since parquet schema doesn't include array elements
-    // or if field is (Parquet's) MAP then array/name segments are removed from the schema as well as obtaining elements by key is handled in EvaluationVisitor.
+  /**
+   * This method adjusts collection of SchemaPath projection columns to better match columns in given
+   * schema. It does few things to reach the goal:
+   * - skips ArraySegments if present;

Review comment:
nit: enumerate the cases in HTML's ``?
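The Javadoc in the diff quoted in this thread describes `getProjection` as: adapt the projection columns to the schema (skipping array segments), match them against the schema columns, record unmatched columns in `columnsNotFound`, and return null when nothing matches. The sketch below walks through those steps on plain strings. It is an illustration under simplifying assumptions, not the Drill implementation: columns are modelled as dotted `String` paths, and `ProjectionSketch`, `skipArraySegments()` and the exact-match rule are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ProjectionSketch {

  /** Drops array index segments such as "[0]" so the path resembles a schema column path. */
  static String skipArraySegments(String path) {
    return path.replaceAll("\\[\\d+\\]", "");
  }

  /**
   * Returns the matched schema columns in projection order, or null when nothing matches.
   * Projection columns that match no schema column are appended to columnsNotFound.
   */
  static Set<String> getProjection(Collection<String> schemaColumns,
                                   Collection<String> projectionColumns,
                                   List<String> columnsNotFound) {
    // A set keeps one entry when two array elements map to the same schema column.
    Set<String> selected = new LinkedHashSet<>();
    for (String column : projectionColumns) {
      String adapted = skipArraySegments(column);
      if (schemaColumns.contains(adapted)) {
        selected.add(adapted);
      } else {
        columnsNotFound.add(column);
      }
    }
    return selected.isEmpty() ? null : selected;
  }

  public static void main(String[] args) {
    List<String> schema = Arrays.asList("c_orders.o_orderstatus", "c_name");
    List<String> projection = Arrays.asList(
        "c_orders[0].o_orderstatus", "c_orders[1].o_orderstatus", "c_missing");
    List<String> notFound = new ArrayList<>();

    System.out.println(getProjection(schema, projection, notFound));
    System.out.println(notFound);
  }
}
```

Using a set for the selected columns echoes the comment removed in the diff about avoiding duplicates when two elements of the same array are selected.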
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938630#comment-16938630 ]

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328612943

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java ##

@@ -233,28 +288,26 @@ public void setup(OperatorContext context, OutputMutator output) throws Executio
     this.operatorContext = context;
     schema = footer.getFileMetaData().getSchema();
     MessageType projection;
+    final List columnsNotFound = new ArrayList<>(getColumns().size());

     if (isStarQuery()) {
       projection = schema;
     } else {
-      columnsNotFound = new ArrayList<>();
       projection = getProjection(schema, getColumns(), columnsNotFound);
       if (projection == null) {
         projection = schema;
       }
-      if (columnsNotFound != null && columnsNotFound.size() > 0) {
-        nullFilledVectors = new ArrayList<>();
-        for (SchemaPath col: columnsNotFound) {
+      if (columnsNotFound.size() > 0) {

Review comment:
Change to `!columnsNotFound.isEmpty()` :)
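The review comment in this thread asks for `!columnsNotFound.isEmpty()` in place of `columnsNotFound.size() > 0`, and the quoted diff also pre-sizes the list that is created inside the guard. A minimal sketch of both idioms follows; `NullFillSketch`, `placeholdersFor()` and the `"nullable-int:"` prefix are hypothetical and only stand in for the null-filled vectors in the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class NullFillSketch {

  /**
   * Builds one placeholder per missing column. The result stays null when every
   * column was found, mirroring how the diff creates its list only inside the guard.
   */
  static List<String> placeholdersFor(List<String> columnsNotFound) {
    List<String> placeholders = null;
    if (!columnsNotFound.isEmpty()) {
      // The final size is known up front, so the backing array is allocated once.
      placeholders = new ArrayList<>(columnsNotFound.size());
      for (String column : columnsNotFound) {
        placeholders.add("nullable-int:" + column);
      }
    }
    return placeholders;
  }

  public static void main(String[] args) {
    System.out.println(placeholdersFor(Arrays.asList("a", "b")));
    System.out.println(placeholdersFor(Collections.<String>emptyList()));
  }
}
```

`isEmpty()` states the intent directly; for `ArrayList` the two forms behave the same, so this is a readability change rather than a behavioral one.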
[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null
[ https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938633#comment-16938633 ]

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328619538

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java ##

@@ -111,20 +110,37 @@ public DrillParquetReader(FragmentContext fragmentContext,
     this.numRecordsToRead = initNumRecordsToRead(recordsToRead, entry.getRowGroupIndex(), footer);
   }

+  /**
+   * Creates projection MessageType from projection columns and given schema.
+   *
+   * @param schema Parquet file schema
+   * @param projectionColumns columns to search
+   * @param columnsNotFound any projection column which wasn't found in schema is added to the list
+   * @return projection containing matched columns or null if none column matches schema
+   */
   private static MessageType getProjection(MessageType schema,
-                                           Collection columns,
+                                           Collection projectionColumns,
                                            List columnsNotFound) {
-    MessageType projection = null;
-
-    String messageName = schema.getName();
-    List schemaColumns = schema.getColumns();
-    // parquet type.union() seems to lose ConvertedType info when merging two columns that are the same type. This can
-    // happen when selecting two elements from an array. So to work around this, we use set of SchemaPath to avoid duplicates
-    // and then merge the types at the end
-    Set selectedSchemaPaths = new LinkedHashSet<>();
+    projectionColumns = adaptColumnsToParquetSchema(projectionColumns, schema);
+    List schemaColumns = getAllColumnsFrom(schema);
+    Set selectedSchemaPaths = matchProjectionWithSchemaColumns(projectionColumns, schemaColumns, columnsNotFound);
+    MessageType projection = convertSelectedColumnsToMessageType(schema, selectedSchemaPaths);
+    return projection;

Review comment:
nit: `projection` variable may be avoided.