[jira] [Commented] (DRILL-7170) IllegalStateException: Record count not set for this vector container

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938969#comment-16938969
 ] 

ASF GitHub Bot commented on DRILL-7170:
---

Ben-Zvi commented on pull request #1859: DRILL-7170: Ignore uninitialized 
vector containers for OOM error messages
URL: https://github.com/apache/drill/pull/1859
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-7170
> URL: https://issues.apache.org/jira/browse/DRILL-7170
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.17.0
>
>
> {code:java}
> Query: 
> /root/drillAutomation/master/framework/resources/Advanced/tpcds/tpcds_sf1/original/maprdb/json/query95.sql
> WITH ws_wh AS
> (
> SELECT ws1.ws_order_number,
> ws1.ws_warehouse_sk wh1,
> ws2.ws_warehouse_sk wh2
> FROM   web_sales ws1,
> web_sales ws2
> WHERE  ws1.ws_order_number = ws2.ws_order_number
> AND    ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk)
> SELECT
> Count(DISTINCT ws_order_number) AS `order count` ,
> Sum(ws_ext_ship_cost)   AS `total shipping cost` ,
> Sum(ws_net_profit)  AS `total net profit`
> FROM web_sales ws1 ,
> date_dim ,
> customer_address ,
> web_site
> WHERE d_date BETWEEN '2000-04-01' AND  (
> Cast('2000-04-01' AS DATE) + INTERVAL '60' day)
> AND  ws1.ws_ship_date_sk = d_date_sk
> AND  ws1.ws_ship_addr_sk = ca_address_sk
> AND  ca_state = 'IN'
> AND  ws1.ws_web_site_sk = web_site_sk
> AND  web_company_name = 'pri'
> AND  ws1.ws_order_number IN
> (
> SELECT ws_order_number
> FROM   ws_wh)
> AND  ws1.ws_order_number IN
> (
> SELECT wr_order_number
> FROM   web_returns,
> ws_wh
> WHERE  wr_order_number = ws_wh.ws_order_number)
> ORDER BY count(DISTINCT ws_order_number)
> LIMIT 100
> Exception:
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Record count not 
> set for this vector container
> Fragment 2:3
> Please, refer to logs for more information.
> [Error Id: 4ed92fce-505b-40ba-ac0e-4a302c28df47 on drill87:31010]
>   (java.lang.IllegalStateException) Record count not set for this vector 
> container
> 
> org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState():459
> org.apache.drill.exec.record.VectorContainer.getRecordCount():394
> org.apache.drill.exec.record.RecordBatchSizer.<init>():720
> org.apache.drill.exec.record.RecordBatchSizer.<init>():704
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.getActualSize():462
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.getActualSize():964
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.makeDebugString():973
> 
> org.apache.drill.exec.physical.impl.common.HashPartition.makeDebugString():601
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.makeDebugString():1313
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase():1105
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():525
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.record.AbstractRecordBatch.next():116
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.test.generated.HashAggregatorGen1068899.doWork():642
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():296
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.record.AbstractRecordBatch.next():116
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> 

[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938758#comment-16938758
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

ihuzenko commented on pull request #1861: DRILL-7380: Query of a field inside 
of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861
 
 
   
 



> Query of a field inside of an array of structs returns null
> ---
>
> Key: DRILL-7380
> URL: https://issues.apache.org/jira/browse/DRILL-7380
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: ready-to-commit
> Attachments: customer_complex.zip
>
>
> *Query:*
> {code:sql}
> select t.c_orders[0].o_orderstatus from hive.customer_complex t limit 10;
> {code}
> *Expected results (given from Hive):*
> {noformat}
> OK
> O
> F
> NULL
> O
> O
> NULL
> O
> O
> NULL
> F
> {noformat}
> *Actual results:*
> {noformat}
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7381) Query to a map field returns nulls with hive native reader

2019-09-26 Thread Igor Guzenko (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Guzenko updated DRILL-7381:

Labels: ready-to-commit  (was: )

> Query to a map field returns nulls with hive native reader
> --
>
> Key: DRILL-7381
> URL: https://issues.apache.org/jira/browse/DRILL-7381
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: ready-to-commit
> Attachments: customer_complex.zip
>
>
> *Query:*
> {code:sql}
> select t.c_nation.n_region.r_name from hive.customer_complex t limit 5
> {code}
> *Expected results:*
> {noformat}
> AFRICA
> MIDDLE EAST
> AMERICA
> MIDDLE EAST
> AMERICA
> {noformat}
> *Actual results:*
> {noformat}
> null
> null
> null
> null
> null
> {noformat}
> *Workaround:*
> {code:sql}
> set store.hive.optimize_scan_with_native_readers = false;
> {code}





[jira] [Updated] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread Igor Guzenko (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Guzenko updated DRILL-7380:

Labels: ready-to-commit  (was: )



[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938671#comment-16938671
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

ihuzenko commented on issue #1861: DRILL-7380: Query of a field inside of an 
array of structs returns null
URL: https://github.com/apache/drill/pull/1861#issuecomment-535528310
 
 
   @KazydubB , I've addressed comments, please check again. 
 



[jira] [Updated] (DRILL-7174) Expose complex to Json control in the Drill C++ Client

2019-09-26 Thread Volodymyr Vysotskyi (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-7174:
---
Labels: ready-to-commit  (was: )

> Expose complex to Json control in the Drill C++ Client
> --
>
> Key: DRILL-7174
> URL: https://issues.apache.org/jira/browse/DRILL-7174
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Rob Wu
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> Arjun Gupta will be supplying a patch for this
>  





[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938652#comment-16938652
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside 
of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328628279
 
 

 ##
 File path: 
logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java
 ##
 @@ -48,13 +48,20 @@
   private final NameSegment rootSegment;
 
   public SchemaPath(SchemaPath path) {
-    super(path.getPosition());
-    this.rootSegment = path.rootSegment;
+    this(path.rootSegment, path.getPosition());
   }
 
   public SchemaPath(NameSegment rootSegment) {
-    super(ExpressionPosition.UNKNOWN);
-    this.rootSegment = rootSegment;
+    this(rootSegment, ExpressionPosition.UNKNOWN);
+  }
+
+  /**
+   * @deprecated Use {@link #SchemaPath(NameSegment)}
+   * or {@link #SchemaPath(NameSegment, ExpressionPosition)} instead
+   */
+  @Deprecated
 
 Review comment:
   Oops, missed it. Leave it as is, then.
 



[jira] [Commented] (DRILL-7174) Expose complex to Json control in the Drill C++ Client

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938651#comment-16938651
 ] 

ASF GitHub Bot commented on DRILL-7174:
---

agozhiy commented on issue #1814: DRILL-7174: Expose complex to Json control in 
the Drill C++ Client
URL: https://github.com/apache/drill/pull/1814#issuecomment-535513377
 
 
   @vvysotskyi, I built the native client and ran tests, everything works ok.
 



[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938637#comment-16938637
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

ihuzenko commented on pull request #1861: DRILL-7380: Query of a field inside 
of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328625157
 
 

 ##
 File path: 
logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java
 ##
 @@ -48,13 +48,20 @@
   private final NameSegment rootSegment;
 
   public SchemaPath(SchemaPath path) {
-    super(path.getPosition());
-    this.rootSegment = path.rootSegment;
+    this(path.rootSegment, path.getPosition());
   }
 
   public SchemaPath(NameSegment rootSegment) {
-    super(ExpressionPosition.UNKNOWN);
-    this.rootSegment = rootSegment;
+    this(rootSegment, ExpressionPosition.UNKNOWN);
+  }
+
+  /**
+   * @deprecated Use {@link #SchemaPath(NameSegment)}
+   * or {@link #SchemaPath(NameSegment, ExpressionPosition)} instead
+   */
+  @Deprecated
 
 Review comment:
   Was already deprecated at line 101 on the left side. 
 



[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938631#comment-16938631
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside 
of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r327965662
 
 

 ##
 File path: 
logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java
 ##
 @@ -48,13 +48,20 @@
   private final NameSegment rootSegment;
 
   public SchemaPath(SchemaPath path) {
-    super(path.getPosition());
-    this.rootSegment = path.rootSegment;
+    this(path.rootSegment, path.getPosition());
   }
 
   public SchemaPath(NameSegment rootSegment) {
-    super(ExpressionPosition.UNKNOWN);
-    this.rootSegment = rootSegment;
+    this(rootSegment, ExpressionPosition.UNKNOWN);
+  }
+
+  /**
+   * @deprecated Use {@link #SchemaPath(NameSegment)}
+   * or {@link #SchemaPath(NameSegment, ExpressionPosition)} instead
+   */
+  @Deprecated
 
 Review comment:
   Why deprecate? It's rather a convenient way to create `SchemaPath`.
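
   For readers following the diff, the constructor change under review can be sketched as below. This is a minimal illustration only: `PathSketch` and `Position` are hypothetical stand-ins, not the real `SchemaPath` and `ExpressionPosition` classes, and the sketch assumes nothing beyond what the hunk shows (each shorter constructor delegating to a two-argument one).

```java
// Hypothetical stand-in for ExpressionPosition; not the Drill class.
final class Position {
  static final Position UNKNOWN = new Position("unknown");
  final String label;

  Position(String label) {
    this.label = label;
  }
}

// Hypothetical stand-in for SchemaPath, reduced to the constructor pattern.
class PathSketch {
  private final String rootSegment;
  private final Position position;

  // Canonical constructor: the only place where fields are assigned.
  PathSketch(String rootSegment, Position position) {
    this.rootSegment = rootSegment;
    this.position = position;
  }

  // Convenience constructor: delegates with a default position.
  PathSketch(String rootSegment) {
    this(rootSegment, Position.UNKNOWN);
  }

  // Copy constructor: delegates instead of repeating the assignments.
  PathSketch(PathSketch other) {
    this(other.rootSegment, other.position);
  }

  String rootSegment() {
    return rootSegment;
  }

  Position position() {
    return position;
  }
}

class ConstructorChainingDemo {
  public static void main(String[] args) {
    PathSketch original = new PathSketch("c_orders");
    PathSketch copy = new PathSketch(original);
    System.out.println(copy.rootSegment());
    System.out.println(copy.position() == Position.UNKNOWN);
  }
}
```

   Keeping field assignment in one constructor means a later change to initialization only has to be made in one place.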
 



[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938629#comment-16938629
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside 
of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328614853
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java
 ##
 @@ -233,28 +288,26 @@ public void setup(OperatorContext context, OutputMutator 
output) throws Executio
   this.operatorContext = context;
   schema = footer.getFileMetaData().getSchema();
   MessageType projection;
+  final List<SchemaPath> columnsNotFound = new ArrayList<>(getColumns().size());
 
   if (isStarQuery()) {
 projection = schema;
   } else {
-columnsNotFound = new ArrayList<>();
 projection = getProjection(schema, getColumns(), columnsNotFound);
 if (projection == null) {
   projection = schema;
 }
-if (columnsNotFound != null && columnsNotFound.size() > 0) {
-  nullFilledVectors = new ArrayList<>();
-  for (SchemaPath col: columnsNotFound) {
+if (columnsNotFound.size() > 0) {
+  nullFilledVectors = new ArrayList<>(columnsNotFound.size());
+  for (SchemaPath col : columnsNotFound) {
 // col.toExpr() is used here as field name since we don't want to 
see these fields in the existing maps
 nullFilledVectors.add(
-  (NullableIntVector) 
output.addField(MaterializedField.create(col.toExpr(),
-  
org.apache.drill.common.types.Types.optional(TypeProtos.MinorType.INT)),
-(Class) 
TypeHelper.getValueVectorClass(TypeProtos.MinorType.INT,
-  TypeProtos.DataMode.OPTIONAL)));
-  }
-  if (columnsNotFound.size() == getColumns().size()) {
-noColumnsFound = true;
+(NullableIntVector) 
output.addField(MaterializedField.create(col.toExpr(),
+
org.apache.drill.common.types.Types.optional(TypeProtos.MinorType.INT)),
+(Class) 
TypeHelper.getValueVectorClass(TypeProtos.MinorType.INT,
 
 Review comment:
   This `TypeHelper.getValueVectorClass(...)` may be changed to 
`NullableIntVector.class`.
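
   The suggestion above can be illustrated with a small sketch. It assumes only that a lookup keyed by a fixed type and mode always yields the same class, so a class literal can replace it. The registry, `lookup` method, and the `*Sketch` classes are hypothetical stand-ins and do not reproduce Drill's `TypeHelper` or vector classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for value vector classes; not the Drill classes.
class IntVectorSketch {}

class NullableIntVectorSketch {}

class VectorClassLookupSketch {
  // Hypothetical registry playing the role of a type-to-class helper.
  private static final Map<String, Class<?>> REGISTRY = new HashMap<>();

  static {
    REGISTRY.put("INT:REQUIRED", IntVectorSketch.class);
    REGISTRY.put("INT:OPTIONAL", NullableIntVectorSketch.class);
  }

  // Runtime lookup: useful when type and mode are only known at run time.
  static Class<?> lookup(String minorType, String dataMode) {
    return REGISTRY.get(minorType + ":" + dataMode);
  }

  public static void main(String[] args) {
    // Both arguments are constants here, so the result is always the same class.
    Class<?> viaLookup = lookup("INT", "OPTIONAL");
    // The class literal states that result directly and keeps the static type.
    Class<NullableIntVectorSketch> viaLiteral = NullableIntVectorSketch.class;
    System.out.println(viaLookup == viaLiteral);
  }
}
```

   With the literal, the compiler knows the exact class type, so no unchecked cast is needed at the call site.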
 



[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938632#comment-16938632
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside 
of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328618759
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java
 ##
 @@ -111,20 +110,37 @@ public DrillParquetReader(FragmentContext 
fragmentContext,
 this.numRecordsToRead = initNumRecordsToRead(recordsToRead, 
entry.getRowGroupIndex(), footer);
   }
 
+  /**
+   * Creates projection MessageType from projection columns and given schema.
+   *
+   * @param schema Parquet file schema
+   * @param projectionColumns columns to search
+   * @param columnsNotFound any projection column which wasn't found in schema 
is added to the list
+   * @return projection containing matched columns or null if none column 
matches schema
+   */
   private static MessageType getProjection(MessageType schema,
-   Collection<SchemaPath> columns,
+   Collection<SchemaPath> projectionColumns,
List<SchemaPath> columnsNotFound) {
-MessageType projection = null;
-
-String messageName = schema.getName();
-List schemaColumns = schema.getColumns();
-// parquet type.union() seems to lose ConvertedType info when merging two 
columns that are the same type. This can
-// happen when selecting two elements from an array. So to work around 
this, we use set of SchemaPath to avoid duplicates
-// and then merge the types at the end
-Set selectedSchemaPaths = new LinkedHashSet<>();
+projectionColumns = adaptColumnsToParquetSchema(projectionColumns, schema);
+List schemaColumns = getAllColumnsFrom(schema);
+Set selectedSchemaPaths = 
matchProjectionWithSchemaColumns(projectionColumns, schemaColumns, 
columnsNotFound);
+MessageType projection = convertSelectedColumnsToMessageType(schema, 
selectedSchemaPaths);
+return projection;
+  }
 
-// get a list of modified columns which have the array elements removed 
from the schema path since parquet schema doesn't include array elements
-// or if field is (Parquet's) MAP then array/name segments are removed 
from the schema as well as obtaining elements by key is handled in 
EvaluationVisitor.
+  /**
+   * This method adjusts collection of SchemaPath projection columns to better 
match columns in given
+   * schema. It does few things to reach the goal:
+   *- skips ArraySegments if present;
 
 Review comment:
   nit: enumerate the cases in HTML's  ``?
 



[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938630#comment-16938630
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside 
of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328612943
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java
 ##
 @@ -233,28 +288,26 @@ public void setup(OperatorContext context, OutputMutator 
output) throws Executio
   this.operatorContext = context;
   schema = footer.getFileMetaData().getSchema();
   MessageType projection;
+  final List<SchemaPath> columnsNotFound = new ArrayList<>(getColumns().size());
 
   if (isStarQuery()) {
 projection = schema;
   } else {
-columnsNotFound = new ArrayList<>();
 projection = getProjection(schema, getColumns(), columnsNotFound);
 if (projection == null) {
   projection = schema;
 }
-if (columnsNotFound != null && columnsNotFound.size() > 0) {
-  nullFilledVectors = new ArrayList<>();
-  for (SchemaPath col: columnsNotFound) {
+if (columnsNotFound.size() > 0) {
 
 Review comment:
   Change to `!columnsNotFound.isEmpty()` :)
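
   A minimal sketch of the requested change, using plain strings in place of `SchemaPath`. The class and method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class IsEmptySketch {
  // Equivalent to columnsNotFound.size() > 0, but states the intent directly.
  static boolean hasMissingColumns(List<String> columnsNotFound) {
    return !columnsNotFound.isEmpty();
  }

  public static void main(String[] args) {
    List<String> columnsNotFound = new ArrayList<>();
    System.out.println(hasMissingColumns(columnsNotFound));
    columnsNotFound.add("c_orders");
    System.out.println(hasMissingColumns(columnsNotFound));
  }
}
```

   For an `ArrayList` the two forms behave identically; `isEmpty()` is simply the more readable one.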
 



[jira] [Commented] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938633#comment-16938633
 ] 

ASF GitHub Bot commented on DRILL-7380:
---

KazydubB commented on pull request #1861: DRILL-7380: Query of a field inside 
of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328619538
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java
 ##
 @@ -111,20 +110,37 @@ public DrillParquetReader(FragmentContext 
fragmentContext,
 this.numRecordsToRead = initNumRecordsToRead(recordsToRead, 
entry.getRowGroupIndex(), footer);
   }
 
+  /**
+   * Creates projection MessageType from projection columns and given schema.
+   *
+   * @param schema Parquet file schema
+   * @param projectionColumns columns to search
+   * @param columnsNotFound any projection column which wasn't found in schema 
is added to the list
+   * @return projection containing matched columns or null if none column 
matches schema
+   */
   private static MessageType getProjection(MessageType schema,
-   Collection<SchemaPath> columns,
+   Collection<SchemaPath> projectionColumns,
List<SchemaPath> columnsNotFound) {
-MessageType projection = null;
-
-String messageName = schema.getName();
-List schemaColumns = schema.getColumns();
-// parquet type.union() seems to lose ConvertedType info when merging two 
columns that are the same type. This can
-// happen when selecting two elements from an array. So to work around 
this, we use set of SchemaPath to avoid duplicates
-// and then merge the types at the end
-Set selectedSchemaPaths = new LinkedHashSet<>();
+projectionColumns = adaptColumnsToParquetSchema(projectionColumns, schema);
+List schemaColumns = getAllColumnsFrom(schema);
+Set selectedSchemaPaths = 
matchProjectionWithSchemaColumns(projectionColumns, schemaColumns, 
columnsNotFound);
+MessageType projection = convertSelectedColumnsToMessageType(schema, 
selectedSchemaPaths);
+return projection;
 
 Review comment:
   nit: `projection` variable may be avoided.
 
