[jira] [Commented] (DRILL-6622) UNION on tpcds sf100 tables hit SYSTEM ERROR: NullPointerException

2018-07-21 Thread salim achouche (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551888#comment-16551888
 ] 

salim achouche commented on DRILL-6622:
---

I have just fixed this issue. There were two bugs in the Aggregator batch 
sizing logic:

Issue I
 * The aggregator runs in a loop to consume all input batches
 * The loop updated the batch sizing stats only after the batches had been 
consumed
 * Assume output-row-count is 1 and we receive a batch with at least 32k + 1 
records
 * The code would create 32k output batches (one per incoming record) and then 
fail because of overflow
 * Fix - The batch sizing stats are now updated when a non-empty batch is 
received, before the processing loop
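
The ordering problem in Issue I can be reproduced with a toy model. The sketch below is not Drill code: `SizerSketch`, `AggregatorLoopSketch`, the `MAX_OUTPUT_BATCHES` cap, and the sizing rule "output rows = incoming rows" are hypothetical stand-ins chosen only to mirror the numbers in the bullets above.

```java
/** Hypothetical stand-in for the batch sizer; not Drill's actual class. */
final class SizerSketch {
    // Pessimistic default used until a real incoming batch has been measured.
    private int outputRowCount = 1;

    /** Toy sizing rule (an assumption): allow output batches as large as the input. */
    void update(int incomingRowCount) {
        outputRowCount = Math.max(1, incomingRowCount);
    }

    int outputRowCount() {
        return outputRowCount;
    }
}

final class AggregatorLoopSketch {
    // Assumed cap on output batches, mirroring the "32k" limit mentioned above.
    static final int MAX_OUTPUT_BATCHES = 32 * 1024;

    /**
     * Consumes one incoming batch and returns how many output batches were produced.
     * updateBeforeLoop = false models the old ordering, true models the fix.
     */
    static int consume(int incomingRows, boolean updateBeforeLoop) {
        SizerSketch sizer = new SizerSketch();
        if (updateBeforeLoop && incomingRows > 0) {
            sizer.update(incomingRows); // fix: refresh stats for a non-empty batch first
        }
        int produced = 0;
        int remaining = incomingRows;
        while (remaining > 0) {
            remaining -= Math.min(remaining, sizer.outputRowCount());
            produced++;
            if (produced > MAX_OUTPUT_BATCHES) {
                throw new IllegalStateException("output batch count overflow");
            }
        }
        if (!updateBeforeLoop && incomingRows > 0) {
            sizer.update(incomingRows); // old ordering: stats refreshed too late
        }
        return produced;
    }
}
```

In this model, `consume(32 * 1024 + 1, false)` throws once the cap is exceeded, while `consume(32 * 1024 + 1, true)` produces a single output batch.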

Issue II
 * The Aggregator has two main modules: the AggregatorBatch and Aggregator 
objects
 * Both share the same "incoming" record batch instance
 * However, there is logic to spill incoming batches when under pressure
 * The batch sizing logic was not aware that, when batches are spilled, the 
shared "incoming" object instance will diverge; that is, the Aggregator object 
will mutate the incoming object
 * The batch sizer was being invoked with a stale "incoming" object (the one 
from the AggregatorBatch)
 * Fix - Updated the Aggregator code to always pass the active incoming object 
explicitly
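
The stale-reference problem in Issue II can be sketched in the same spirit. Everything below is hypothetical: the nested `Batch`, `Aggregator` and `AggregatorBatch` classes only borrow the names used above, and the sketch assumes that "diverge" means the inner object ends up pointing at a different batch after a spill.

```java
final class StaleIncomingSketch {
    /** Hypothetical stand-in for a record batch; only the row count matters here. */
    static final class Batch {
        final int rowCount;

        Batch(int rowCount) {
            this.rowCount = rowCount;
        }
    }

    /** Inner module: starts on the shared batch, may later read spilled data instead. */
    static final class Aggregator {
        private Batch incoming;

        Aggregator(Batch shared) {
            this.incoming = shared;
        }

        /** Assumed behaviour: reading back spilled data replaces the incoming reference. */
        void switchToSpilled(Batch spilled) {
            this.incoming = spilled;
        }

        Batch activeIncoming() {
            return incoming;
        }
    }

    /** Outer module: keeps its own reference to the original upstream batch. */
    static final class AggregatorBatch {
        final Batch incoming;
        final Aggregator aggregator;

        AggregatorBatch(Batch upstream) {
            this.incoming = upstream;
            this.aggregator = new Aggregator(upstream);
        }

        /** Old behaviour: always sizes from the outer reference, stale after a spill. */
        int rowsSeenBySizerOld() {
            return sizeOf(incoming);
        }

        /** Fix: the active incoming batch is handed to the sizer explicitly. */
        int rowsSeenBySizerFixed() {
            return sizeOf(aggregator.activeIncoming());
        }

        private static int sizeOf(Batch batch) {
            return batch.rowCount;
        }
    }
}
```

Before any spill both methods agree; after `switchToSpilled(...)` only the variant that receives the active batch reflects the data actually being processed.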

 

> UNION on tpcds sf100 tables hit SYSTEM ERROR: NullPointerException
> ---
>
> Key: DRILL-6622
> URL: https://issues.apache.org/jira/browse/DRILL-6622
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Affects Versions: 1.14.0
>Reporter: Vitalii Diravka
>Assignee: salim achouche
>Priority: Blocker
> Fix For: 1.14.0
>
> Attachments: 
> MD4208_id_05_1_id_24b2a6f9-ed66-b97e-594d-f116cd3fdd23.json, 
> MD4208_id_05_3_id_24b2ad9c-4568-a476-bbf6-2e17441078b1.json
>
>
> {code}
> SELECT c_customer_id FROM customer 
> UNION
> SELECT ca_address_id FROM customer_address 
> UNION
> SELECT cd_credit_rating FROM customer_demographics 
> UNION
> SELECT hd_buy_potential FROM household_demographics 
> UNION
> SELECT i_item_id FROM item 
> UNION
> SELECT p_promo_id FROM promotion 
> UNION
> SELECT t_time_id FROM time_dim 
> UNION
> SELECT d_date_id FROM date_dim 
> UNION
> SELECT s_store_id FROM store 
> UNION
> SELECT w_warehouse_id FROM warehouse 
> UNION
> SELECT sm_ship_mode_id FROM ship_mode 
> UNION
> SELECT r_reason_id FROM reason 
> UNION
> SELECT cc_call_center_id FROM call_center 
> UNION
> SELECT web_site_id FROM web_site 
> UNION
> SELECT wp_web_page_id FROM web_page 
> UNION
> SELECT cp_catalog_page_id FROM catalog_page;
> {code}
> hit the following error:
> {code}
> Caused by: java.lang.NullPointerException: null
> at 
> org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.compare(ByteFunctionHelpers.java:96)
>  ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
> at 
> org.apache.drill.exec.test.generated.HashTableGen3$BatchHolder.isKeyMatchInternalBuild(BatchHolder.java:171)
>  ~[na:na]
> at 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.isKeyMatch(HashTableTemplate.java:218)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.access$1000(HashTableTemplate.java:120)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.put(HashTableTemplate.java:650)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
> at 
> org.apache.drill.exec.test.generated.HashAggregatorGen0.checkGroupAndAggrValues(HashAggTemplate.java:1372)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.HashAggregatorGen0.doWork(HashAggTemplate.java:599)
>  ~[na:na]
> at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext(HashAggBatch.java:268)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.union.UnionAllRecordBatch$UnionInputIterator.next(UnionAllRecordBatch.java:381)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
> {code}
> [~dechanggu] found that the issue is absent in Drill 1.13.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6611) Add [meta]-[Enter] js handler for query form submission

2018-07-21 Thread Bob Rudis (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551810#comment-16551810
 ] 

Bob Rudis commented on DRILL-6611:
--

Cool. I'll re-sync my fork and submit a PR this weekend.

 

> Add [meta]-[Enter] js handler for query form submission
> ---
>
> Key: DRILL-6611
> URL: https://issues.apache.org/jira/browse/DRILL-6611
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.14.0
>Reporter: Bob Rudis
>Priority: Minor
>
> The new ACE-based SQL query editor is great. Being able to submit the form 
> without using a mouse would be even better.
> Adding:
>  
> {noformat}
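> // Submit the query form when [meta]-[Enter] is pressed inside it.
> document.getElementById('queryForm')
>   .addEventListener('keydown', function(e) {
>     // keyCode 13 is Enter; metaKey is Cmd on macOS or the Windows key.
>     if (!(e.keyCode == 13 && e.metaKey)) return;
>     if (e.target.form) doSubmitQueryWithUserName();
>   });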
> {noformat}
> to {{./exec/java-exec/src/main/resources/rest/query/query.ftl}} adds such 
> support.
> I can file a PR with the code if desired.





[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-07-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551787#comment-16551787
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

vrozov commented on issue #1244: DRILL-6373: Refactor Result Set Loader for 
Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-406811356
 
 
   @paul-rogers The change to `NullableVarCharVector` that was part of your PR 
exposed a bug in the existing code. Please see `AbstractMapVector.java:48`. 
The problem is that a newly created map vector references a newly created deep 
copy (clone) of the passed `MaterializedField`, while the newly created child 
vectors of the map vector reference children of **another** instance of 
`MaterializedField`. If that **other** instance were completely immutable, 
this would not cause any problem (except that there would then be no reason to 
create a deep copy), but with your changes there was an attempt to mutate the 
**other** instance from different threads, as that **other** instance is used 
to create multiple outgoing vectors. I hope this explains what I mean when 
referring to the **existing** bug.
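
The aliasing described above can be illustrated with a small, self-contained sketch. It is not the code at `AbstractMapVector.java:48`: `FieldSketch` and `MapVectorSketch` are hypothetical stand-ins, and the `consistent` variant is just one possible alternative, not necessarily the change being proposed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, mutable stand-in for MaterializedField. */
final class FieldSketch {
    String name;
    final List<FieldSketch> children = new ArrayList<>();

    FieldSketch(String name) {
        this.name = name;
    }

    FieldSketch deepCopy() {
        FieldSketch copy = new FieldSketch(name);
        for (FieldSketch child : children) {
            copy.children.add(child.deepCopy());
        }
        return copy;
    }
}

/** Hypothetical stand-in for a map vector and the fields its child vectors hold. */
final class MapVectorSketch {
    final FieldSketch field; // the map vector's own field
    final List<FieldSketch> childVectorFields = new ArrayList<>(); // held by child vectors

    private MapVectorSketch(FieldSketch field) {
        this.field = field;
    }

    /** Mixed references, as described above: own field is a clone, children are not. */
    static MapVectorSketch mixed(FieldSketch passed) {
        MapVectorSketch vector = new MapVectorSketch(passed.deepCopy());
        vector.childVectorFields.addAll(passed.children); // children of the *other* instance
        return vector;
    }

    /** One consistent alternative (an assumption, not necessarily the actual fix). */
    static MapVectorSketch consistent(FieldSketch passed) {
        MapVectorSketch vector = new MapVectorSketch(passed.deepCopy());
        vector.childVectorFields.addAll(vector.field.children); // children of the clone
        return vector;
    }
}
```

In the `mixed` case a later change to a child of the passed field is visible through the child vectors but not through the map vector's own field, so the two views of the schema drift apart. If several vectors are built from the same passed field, they also share those mutable children.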


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Attachments: 6373_Functional_Fail_07_13_1300.txt, 
> drill-6373-with-6585-fix-functional-failure.txt
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.





[jira] [Commented] (DRILL-6622) UNION on tpcds sf100 tables hit SYSTEM ERROR: NullPointerException

2018-07-21 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551780#comment-16551780
 ] 

Pritesh Maker commented on DRILL-6622:
--

[~ppenumarthy] any recommendations here?






[jira] [Commented] (DRILL-6613) Refactor MaterializedField

2018-07-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551560#comment-16551560
 ] 

ASF GitHub Bot commented on DRILL-6613:
---

sohami commented on a change in pull request #1383: DRILL-6613: Refactor 
MaterializedField
URL: https://github.com/apache/drill/pull/1383#discussion_r204204275
 
 

 ##
 File path: 
exec/vector/src/main/java/org/apache/drill/exec/record/MaterializedField.java
 ##
 @@ -49,39 +54,79 @@ private MaterializedField(String name, MajorType type, 
LinkedHashSet(size));
+  }
+
+  private <T> void copyFrom(Collection<T> source, Function<T, MaterializedField> transformation) {
+    Preconditions.checkState(children.isEmpty());
+    source.forEach(child -> children.add(transformation.apply(child)));
+  }
+
+  public static MaterializedField create(String name, MajorType type) {
+    return new MaterializedField(name, type, 0);
+  }
+
   public static MaterializedField create(SerializedField serField) {
-    LinkedHashSet<MaterializedField> children = new LinkedHashSet<>();
-    for (SerializedField sf : serField.getChildList()) {
-      children.add(MaterializedField.create(sf));
+    MaterializedField field = new MaterializedField(serField.getNamePart().getName(), serField.getMajorType(), serField.getChildCount());
+    if (OFFSETS_FIELD.equals(field)) {
+      return OFFSETS_FIELD;
     }
-    return new MaterializedField(serField.getNamePart().getName(), serField.getMajorType(), children);
+    field.copyFrom(serField.getChildList(), MaterializedField::create);
+    return field;
   }
 
-  /**
-   * Create and return a serialized field based on the current state.
-   */
-  public SerializedField getSerializedField() {
-    SerializedField.Builder serializedFieldBuilder = getAsBuilder();
-    for(MaterializedField childMaterializedField : getChildren()) {
-      serializedFieldBuilder.addChild(childMaterializedField.getSerializedField());
+  public MaterializedField copy() {
+    return copy(getName(), getType());
+  }
+
+  public MaterializedField copy(MajorType type) {
+    return copy(name, type);
+  }
+
+  public MaterializedField copy(String name) {
+    return copy(name, getType());
+  }
+
+  public MaterializedField copy(String name, final MajorType type) {
+    if (this == OFFSETS_FIELD) {
+      return this;
     }
-    return serializedFieldBuilder.build();
+    MaterializedField field = new MaterializedField(name, type, getChildren().size());
+    field.copyFrom(getChildren(), MaterializedField::copy);
 
 Review comment:
   There is a case in `UnnestRecordBatch` where we have to keep a copy of the 
`MaterializedField`, along with its children, of the incoming repeated-type 
vector, which may be a repeated map as well. This is to detect whether there 
is any schema change in a new incoming batch compared to the previous one.
   
   Also, when creating a Map vector in the output container, one can call 
`container.addOrGet(field, null)`. If the field contains children information, 
this call will take care of adding all the child vectors to the Map vector as 
well, so you don't have to add child vectors explicitly.
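
To make the first point concrete, here is a minimal sketch of schema-change detection using a retained copy. It does not use Drill's classes: `SchemaFieldSketch`, `SchemaChangeDetectorSketch`, and the string type labels are hypothetical, and the sketch assumes the incoming field object may be modified in place between batches, which is why a snapshot that includes the children is kept rather than a plain reference.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, mutable stand-in for MaterializedField (name, type label, children). */
final class SchemaFieldSketch {
    final String name;
    final String type;
    final List<SchemaFieldSketch> children = new ArrayList<>();

    SchemaFieldSketch(String name, String type) {
        this.name = name;
        this.type = type;
    }

    SchemaFieldSketch deepCopy() {
        SchemaFieldSketch copy = new SchemaFieldSketch(name, type);
        for (SchemaFieldSketch child : children) {
            copy.children.add(child.deepCopy());
        }
        return copy;
    }

    /** Structural comparison: same name, type, and children in the same order. */
    boolean sameSchemaAs(SchemaFieldSketch other) {
        if (!name.equals(other.name) || !type.equals(other.type)
                || children.size() != other.children.size()) {
            return false;
        }
        for (int i = 0; i < children.size(); i++) {
            if (!children.get(i).sameSchemaAs(other.children.get(i))) {
                return false;
            }
        }
        return true;
    }
}

/** Remembers the previous batch's field so the next batch can be compared against it. */
final class SchemaChangeDetectorSketch {
    private SchemaFieldSketch previous;

    /** Returns true when the incoming field differs from the one seen last time. */
    boolean schemaChanged(SchemaFieldSketch incoming) {
        boolean changed = previous != null && !previous.sameSchemaAs(incoming);
        previous = incoming.deepCopy(); // keep a snapshot, children included
        return changed;
    }
}
```

Had the detector stored only the reference, a child added to the same field object between batches would compare equal to itself and the change would go unnoticed.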




> Refactor MaterializedField
> --
>
> Key: DRILL-6613
> URL: https://issues.apache.org/jira/browse/DRILL-6613
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{MaterializedField}} does not need to implement {{clone()}} and should use 
> constructor.


